mocdaniel / dashing-icinga2

Dashing dashboard for Icinga 2 using the REST API
MIT License

No information is shown until service is restarted #113

Closed log1-c closed 2 years ago

log1-c commented 4 years ago

I have set up dashing according to the docs (except that I ran bundle as root, because it only works as root) and get only some sample values. The API user has the correct permissions (I even tried with "*").

The VERY strange thing: As soon as I do systemctl stop dashing-icinga2 or systemctl restart dashing-icinga2 the values are updated to the actual numbers.

Looking at the log file of the dashing thin server also shows that data is pulled:

zones {"zone1"=>{"client_log_lag"=>0.0, "connected"=>true, "endpoints"=>["SRVzone1MONSAT"], "parent_zone"=>"master"}, "zone2"=>{"client_log_lag"=>0.0, "connected"=>true, "endpoints"=>["srvzone2monsat"], "parent_zone"=>"master"}, "master"=>{"client_log_lag"=>0.0, "connected"=>true, "endpoints"=>["srvzonemastermonmaster"], "parent_zone"=>""}, "zone3"=>{"client_log_lag"=>0.0, "connected"=>true, "endpoints"=>["SRVzone3MONSAT"], "parent_zone"=>"master"}}
checker {"idle"=>150.0, "pending"=>0.0}
command 1.0
main-log 1.0
graphite {"connected"=>true, "work_queue_item_rate"=>2.466666666666667, "work_queue_items"=>0.0}
app {"enable_event_handlers"=>true, "enable_flapping"=>true, "enable_host_checks"=>true, "enable_notifications"=>false, "enable_perfdata"=>true, "enable_service_checks"=>true, "environment"=>"", "node_name"=>"srvzonemastermonmaster", "pid"=>26523.0, "program_start"=>1601457142.66971, "version"=>"r2.12.0-1"}
ido-mysql {"connected"=>true, "instance_name"=>"default", "query_queue_item_rate"=>7.416666666666667, "query_queue_items"=>0.0, "version"=>"1.14.3"}
notification 1.0
Stats: [{"label"=>"Host checks/min", "value"=>30.0}, {"label"=>"Service checks/min", "value"=>112.0}, {"label"=>"json_rpc queue rate", "value"=>"8.22"}, {"label"=>"graphite queue rate", "value"=>"2.47"}, {"label"=>"ido-mysql queue rate", "value"=>"7.42"}]
Severity: [{"label"=>"srvzone3aap - cpu", "color"=>"yellow", "state"=>1}, {"label"=>"srvzone2temp19 - services", "color"=>"purple", "state"=>3}, {"label"=>"zone2Fax - disk-c", "color"=>"purple", "state"=>3}, {"label"=>"srvzone2temp19 - swap", "color"=>"purple", "state"=>3}, {"label"=>"srvzone2temp19 - memory", "color"=>"purple", "state"=>3}, {"label"=>"srvsql - services", "color"=>"purple", "state"=>3}, {"label"=>"srvzonemasterdocuware1 - memory", "color"=>"purple", "state"=>3}, {"label"=>"srvzone1hv2 - services", "color"=>"purple", "state"=>3}, {"label"=>"zone2SW-A04 - memory", "color"=>"purple", "state"=>3}, {"label"=>"zone2File - disk-d", "color"=>"purple", "state"=>3}, {"label"=>"srvzone1opc - swap", "color"=>"purple", "state"=>3}, {"label"=>"srvzone2usv1 - hardware-health", "color"=>"purple", "state"=>3}, {"label"=>"srvzonemasteropc - disk-c", "color"=>"purple", "state"=>3}, {"label"=>"cisco1 - hardware-health", "color"=>"purple", "state"=>3}, {"label"=>"srvzonemastersql - swap", "color"=>"purple", "state"=>3}, {"label"=>"zone2Spl - swap", "color"=>"purple", "state"=>3}, {"label"=>"srvzone1opc - memory", "color"=>"purple", "state"=>3}, {"label"=>"srvzonemastersql - memory", "color"=>"purple", "state"=>3}, {"label"=>"zone2SW-A04 - cpu", "color"=>"purple", "state"=>3}, {"label"=>"zone2DB - disk-d", "color"=>"purple", "state"=>3}]

It doesn't matter whether I use v2.0.0 or v3.0.0. The strange thing is that only this server is affected.

A different server of a different customer (running onprem instead of azure) works without problems.

Versions: Icinga 2 is r2.12.0-1; the OS is Ubuntu 18.04.5 running in Azure; tested with Firefox and the new Edge.

I implemented a dirty workaround, restarting dashing every morning, but that doesn't help. If you open the dashboard after the restart (new tab, new browser), only the sample (or whatever) data is displayed.

mocdaniel commented 4 years ago

I tried replicating this issue but could not do so, using the same OS, browsers and Icinga 2 version. Anyway, if you indeed used v3.0.0, I would advise you to try again with the current version of the master branch. We changed some things regarding the automatic updates of widget states, and the lack of this change in v3.0.0 might be interfering with your dashboard at the moment.

Also, bundling the application in a non-root context worked just fine for me, all you need to do is to give write-permissions to the executing user: sudo chown -R user:user /usr/share/dashing-icinga2

Since all your networking seems to be alright and you receive data from Icinga 2, this is the only idea I can come up with without looking into the issue any further at the moment. If this does not suffice, I will look into your issue again.

log1-c commented 4 years ago

Checked out master, same behavior. I will try a clean install of all dashing-related things tomorrow and give feedback once more.

log1-c commented 4 years ago

So, as promised, I did a fresh clean install. Here is all I did:

As root:
  apt purge ruby bundler nodejs
  cd ..
  rm -rf dashing-icinga2/
  git clone https://github.com/mocdaniel/dashing-icinga2.git
  apt-get -y install ruby bundler nodejs
  gem install bundler
  chown user:user dashing-icinga2/ -R
as user:
  bundle
  chmod o+x /var/lib/gems/ -R
  sudo chmod o+x /var/lib/gems/ -R
  bundle
  cp config/icinga2.json config/icinga2.local.json
  vi config/icinga2.local.json
  git checkout master
  vi config/icinga2.local.json
  sudo cat /etc/icinga2/conf.d/api-users.conf
  vi config/icinga2.local.json
  sudo cp tools/systemd/dashing-icinga2.service /lib/systemd/system/
as root:
  systemctl daemon-reload
  systemctl start dashing-icinga2.service
  systemctl status dashing-icinga2.service

The first bundle run complained about needing write permissions to , so I ran chmod... That didn't help:

$ bundle
Using backports 3.18.2
Following files may not be writable, so sudo is needed:
  /usr/local/bin
  /var/lib/gems/2.5.0
  /var/lib/gems/2.5.0/build_info
  /var/lib/gems/2.5.0/cache
  /var/lib/gems/2.5.0/doc
  /var/lib/gems/2.5.0/extensions
  /var/lib/gems/2.5.0/gems
  /var/lib/gems/2.5.0/specifications
Using bundler 2.1.4
Using coffee-script-source 1.12.2
Using execjs 2.0.2
Using coffee-script 2.2.0
Using concurrent-ruby 1.1.7
Using daemons 1.3.1
Using rack 1.5.5
Using tzinfo 2.0.2
Using rufus-scheduler 2.0.24
Using sass 3.2.19
Using rack-protection 1.5.5
Using tilt 1.4.1
Using sinatra 1.4.8
Using multi_json 1.15.0
Using rack-test 0.7.0
Using sinatra-contrib 1.4.7
Using hike 1.2.3
Using sprockets 2.10.2
Using eventmachine 1.2.7
Using thin 1.6.4
Using thor 1.0.1
Using dashing 1.3.7
Using unf_ext 0.0.7.7
Using unf 0.1.4
Using domain_name 0.5.20190701
Using http-accept 1.7.0
Using http-cookie 1.0.3
Using json 2.3.1
Using mime-types-data 3.2020.0512
Using mime-types 3.3.1
Using netrc 0.11.0
Using rest-client 2.1.0
Bundle complete! 6 Gemfile dependencies, 33 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.

Also, bundle didn't ask for root/sudo by itself. I am not sure whether write access is needed, so I did not run bundle with sudo again.

The outcome is the same: the dashboard only updates once I restart the service.

If you need any more information, please tell me 👍

Best regards logic

side note: <div data-view="Clock" data-title="<%=getTimeZone()%>" data-timezone="<%=getTimeZone()%>"></div> does not seem to work, because I always get UTC displayed.

side note2: here is what my dashboard config looks like:

<script type='text/javascript'>
$(function() {
  // These settings override the defaults set in application.coffee. You can do this on a per dashboard basis.
  Dashing.widget_margins = [5,5]
  Dashing.widget_base_dimensions = [300,300]
  //Experimental: widget size based on window size.
  //Dashing.widget_base_dimensions = [$( window ).width()/5, $( window ).height()/2.2];
  Dashing.numColumns = 5
});
</script>

<% content_for :title do %>Icinga 2<% end %>
<div class="gridster">
  <ul>
    <!-- Statistics -->
    <li data-row="1" data-col="1" data-sizex="1" data-sizey="1">
      <div data-view="Clock" data-title="<%=getTimeZone()%>" data-timezone="<%=getTimeZone()%>"></div>
    </li>

    <li data-row="1" data-col="2" data-sizex="1" data-sizey="1">
      <div
        data-id="doughnut-pie-hosts"
        data-view="Chartjs"
        data-type="doughnut"
        data-title="Hosts"
        data-labels="Up,Down"
        data-colornames="green,red"
        data-datasets="20,13"
        data-height="300"
        data-width="300"
      ></div>
    </li>
    <li data-row="1" data-col="3" data-sizex="1" data-sizey="1">
      <div
        data-id="doughnut-pie-services"
        data-view="Chartjs"
        data-type="doughnut"
        data-title="Services"
        data-labels="OK,Warning,Critical,Unknown"
        data-colornames="green,yellow,red,purple"
        data-datasets="20,13,12,0"
        data-height="300"
        data-width="300"
      ></div>
    </li>
    <li data-row="1" data-col="4" data-sizex="1" data-sizey="1">
      <div
        data-id="bar-chart-endpoints"
        data-view="Chartjs"
        data-type="bar"
        data-header="Endpoints"
        data-title="Endpoints"
        data-labels="Connected,Not Connected"
        data-colornames="green,red"
        data-datasets="42,404"
        data-height="300"
        data-width="300"
      ></div>
    </li>

    <li data-row="3" data-col="1" data-sizex="1" data-sizey="1">
      <div data-id="icinga-stats" data-view="List" data-unordered="true" data-title="Statistics"></div>
    </li>

    <li data-row="3" data-col="2" data-sizex="1" data-sizey="1">
      <div
        data-id="bar-chart-checks"
        data-view="Chartjs"
        data-type="horizontalBar"
        data-header="Active Checks"
        data-title="Active Checks"
        data-labels="Hosts/min,Services/min"
        data-colornames="aqua,lime"
        data-datasets="42,404"
        data-height="300"
        data-width="300"
      ></div>
    </li>
<!--
    <li data-row="2" data-col="3" data-sizex="1" data-sizey="1">
      <div
        data-id="bar-chart-downtimes"
        data-view="Chartjs"
        data-type="horizontalBar"
        data-header="Downtimes"
        data-title="Downtimes"
        data-labels="Hosts,Services"
        data-colornames="blue,green"
        data-datasets="42,404"
        data-height="300"
        data-width="300"
      ></div>
    </li>

    <li data-row="2" data-col="4" data-sizex="1" data-sizey="1">
      <div
        data-id="bar-chart-acks"
        data-view="Chartjs"
        data-type="horizontalBar"
        data-header="Acknowledgements"
        data-title="Acknowledgements"
        data-labels="Hosts,Services"
        data-colornames="blue,green"
        data-datasets="42,404"
        data-height="300"
        data-width="300"
      ></div>
    </li>
-->
    <!-- Problems -->
    <li data-row="2" data-col="2" data-sizex="1" data-sizey="1">
      <div data-id="icinga-host-meter" data-view="Meter" data-title="Host Problems" data-min="0" data-max="100"></div>
    </li>
    <li data-row="2" data-col="3" data-sizex="1" data-sizey="1">
      <div data-id="icinga-service-meter" data-view="Meter" data-title="Service Problems" data-min="0" data-max="100"></div>
    </li>

    <!-- Handled -->
    <!--
    <li data-row="2" data-col="1" data-sizex="1" data-sizey="1">
      <div data-id="handled-stats" data-view="List" data-unordered="true" data-title="Handled"></div>
    </li>
    -->
    <!-- Unhandled Host and Service Problems -->
    <li data-row="2" data-col="1" data-sizex="1" data-sizey="1">
      <div data-id="icinga-host-problems" data-view="Simplelist" data-title="Unhandled Hosts"></div>
    </li>
    <li data-row="2" data-col="4" data-sizex="1" data-sizey="1">
      <div data-id="icinga-service-problems" data-view="Simplelist" data-title="Unhandled Services"></div>
    </li>

    <!-- Takes two rows for all service problems by severity -->
    <li data-row="1" data-col="5" data-sizex="1" data-sizey="3">
      <div class="scrollable" data-id="icinga-severity" data-view="List" data-unordered="true" data-title="Problems"></div>
    </li>
  </ul>

</div>

I also removed the comments in jobs/icinga2.rb for the unhandled stuff. But the meters still stay empty.

mocdaniel commented 4 years ago

I replicated your commands again, with the same results: my setup works fine, including the time zone. Did you enter a valid time zone other than UTC in config/icinga2.local.json? Time zones can be passed there in the format "Europe/Berlin".

Here is a somewhat "tidier" complete reinstall of dashing-icinga2. Can you set it up from zero once more following this? If this does not work, I will need:

    # as root:
    apt purge ruby bundler nodejs
    apt autoremove
    rm -rf /var/lib/gems/2.5.0 # only if you don't need ruby for other things running on this instance
    rm -rf dashing-icinga2
    git clone https://github.com/mocdaniel/dashing-icinga2.git
    apt-get -y install ruby nodejs bundler
    gem install bundler
    cd dashing-icinga2
    chown USERNAME:USERNAME -R .

    # as USERNAME:
    cd [...]/dashing-icinga2
    bundle # IGNORE ALL WARNINGS REGARDING SUDO
    cp config/icinga2.json config/icinga2.local.json
    edit icinga2.local.json

    # as root:
    cp tools/systemd/dashing-icinga2.service /lib/systemd/system/dashing-icinga2.service
    systemctl daemon-reload

    # you might face an error page displaying a stacktrace in your browser
    # with this procedure, if this is the case execute
    chown USERNAME:USERNAME /tmp/dashing-icinga2.log

I really hope we get your instance to run, it looks like a really odd error to me :/

log1-c commented 4 years ago

Ok, here goes. Hopefully I didn't miss anything. Worked with what you posted above, sadly the outcome didn't change.

After the installation I opened the browser and again only saw the sample values. At around 14:00:00 the cronjob I added as a workaround restarted the service, which made the correct values appear. After this I changed dashboards/icinga2.erb and jobs/icinga2.rb (removed the comments from lines 109-119) to what I have attached, and then restarted the service by hand at around 14:05:50.

Attached: journalctl -u dashing-icinga2.service.log, console output.log, thin.log, dashing-icinga2.log, icinga2.rb.txt, icinga2.erb.txt

As to the time zone: No, I didn't put anything else in the config than UTC. I assumed that the getTimeZone() would get the timezone from the browser, not the config file. As this is not the case UTC is the best option for this scenario.

mocdaniel commented 4 years ago

Thank you very much. The first very odd thing that caught my eye is that the Licensing and Copyright section in your version of lib/icinga2.rb seems to be an older one, along with much of the code in this file. Also, the arrangement of your widgets matches a layout we used to have but changed in the past.

So the problem might actually be that git gets you some outdated version of the repository, for a reason I do not know. I am really sorry for sounding like a broken record, but have you tried downloading the repository as a tarball and importing it to your server? This way you should definitely get the newest version.

log1-c commented 4 years ago

Just to be clear that we are not talking about different files: The icinga2.rb file is from the jobs folder, not lib!

# git status
On branch master
Your branch is up to date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   dashboards/icinga2.erb
        modified:   jobs/icinga2.rb

This indicates to me that I have the correct files from the current master branch.
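
One caveat with this check: git status compares the working tree with the locally cached origin/master, which is only as fresh as the last fetch. A minimal sketch of a stricter comparison, assuming the remote is called origin and the branch master as in the output above (the function name is made up for illustration):

```shell
# Hypothetical helper: compare the local checkout with the freshly fetched upstream branch.
# Assumes the remote is called "origin" and the branch "master", as in the git status output.
compare_with_upstream() {
  git fetch origin || return 1
  # Print both commit hashes; they are identical if HEAD is the latest upstream commit.
  git rev-parse HEAD origin/master
  # Summarise differences between the working tree and upstream, including local edits.
  git diff --stat origin/master
}
```

Run inside the dashing-icinga2 directory, this should list only the two locally modified files in the diff summary if the checkout really matches the current master.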

My last suspicion is that it has something to do with the VM running in Azure and being a "Ubuntu by Azure"

mocdaniel commented 4 years ago

Thank you for pointing this out to me, totally my bad.

> My last suspicion is that it has something to do with the VM running in Azure and being a "Ubuntu by Azure"

I came to the same conclusion. These hosting solutions tend to be a bit messy in some regards, especially when it comes to firewall and connectivity issues; see #107, where we couldn't resolve issues with Azure infrastructure either. If this is the case, you will have to try and resolve the issue on your end, I am afraid.

One last thing you could do would be to try out the docker image provided by dnsmichi, since you won't have to bundle or configure things yourself there. Might be worth a shot to verify if it is a configuration/dependency issue or rather one with your hosting infrastructure.

Setup/Run-instructions for dockerized dashing-icinga2 can be found in the Readme.md.

log1-c commented 4 years ago

Hi @mocdaniel finally had the time to test around again.

Just installed/ran dashing on the Azure Ubuntu VM via docker. Sadly with the same result. The interface shows sample values despite the CLI/log showing that the container is pulling information via the API. As soon as I CTRL+C'd the container, the web interface updated with the real values.

Also did another test on the Ubuntu VM that is not hosted in Azure and (a bit surprisingly) got the same result: sample values until I restart dashing...
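
Since the log shows the job pulling data while the page stays on the sample values, it may help to check whether events reach a client at all, independent of the browser. Dashing pushes widget updates to the page as server-sent events on the /events path, so the stream can be watched with curl. A rough sketch; the address http://localhost:8005 and both function names are assumptions, so adjust them to your setup:

```shell
# Hypothetical helper: print the raw event stream for a limited time.
# Assumption: the dashboard is reachable at http://localhost:8005.
watch_dashing_events() {
  url="${1:-http://localhost:8005/events}"
  seconds="${2:-15}"
  # --no-buffer prints events as they arrive; --max-time ends the otherwise endless stream.
  curl --silent --no-buffer --max-time "$seconds" "$url"
}

# Hypothetical helper: count the "data:" lines read from stdin,
# assuming each event is sent as a single "data:" line.
count_dashing_events() {
  grep -c '^data:'
}
```

Usage would be watch_dashing_events | count_dashing_events. A count of zero while the job log keeps printing fresh values would point at the delivery path between the server and the client rather than at the Icinga 2 API.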

I'm out of ideas. If you don't have any other ideas I can test, feel free to close this issue 🤷 :)

mocdaniel commented 4 years ago

Hi @log1-c, I am sorry to hear this. At the moment, I have neither another idea nor the time to look into your issue any further, but I will leave this issue open and return to it at another time.

log1-c commented 3 years ago

I have found the reason why sample values were displayed: nearly every "dashlet" in the icinga2.erb file has a data-datasets option with fixed values, and that is responsible for it. This is as simple as it is easy to overlook.

Having found this, I was happy and removed the lines, but it didn't help. Now there are no values displayed until the service is restarted.

The problem also occurred on a local testing VM with Ubuntu 20.04, so I think we can rule Azure out as the culprit. I will play around with the test VM some more and report if I find something worth mentioning.

mocdaniel commented 3 years ago

This keeps bugging me. I just set up a local Ubuntu 20.04 VM myself and everything works just fine out of the box. The data-datasets values in the .erb dashboard files are there so people can get a first impression of the dashboard's capabilities without having to connect to a populated Icinga instance. The only thing I noticed is some deprecation messages when starting up dashing, but your problems were observed before I first noticed those deprecation messages, so I think we can ignore them for this issue...
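
To see which sample values a dashboard ships with, the data-datasets attributes can be listed straight from the template. A small sketch, assuming it is run from the repository root and that the dashboard file is dashboards/icinga2.erb (the function name is made up for illustration):

```shell
# Hypothetical helper: list every placeholder dataset in a dashboard template,
# prefixed with its line number.
list_placeholder_datasets() {
  grep -n 'data-datasets=' "${1:-dashboards/icinga2.erb}"
}
```

For the dashboard posted earlier in this thread, the output would include entries such as data-datasets="20,13" for the hosts chart. Widgets inside HTML comments are listed as well, since grep does not interpret the markup.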

Feel free to hit me up with further details and the results of the tests you'll run, as I have no idea how to fix this at the moment :/

dnsmichi commented 3 years ago

I really recommend using the Docker image and comparing your results. Source installations and OS dependencies are often the root cause of problems like this.

log1-c commented 3 years ago

I tried the Docker way already and it yielded the same results, sadly. I'm currently "simply" restarting the dashing service every 5 minutes via cron, so the dashboard is most likely up to date when the customer opens it. As I haven't heard anything negative in the past 2 months, this (admittedly dirty) workaround seems to work.

In case there is some negative feedback from the customer I will fiddle around with this some more; until then I will let the workaround do its thing :)

danjford commented 2 years ago

Hi, I just installed dashing-icinga2 on Ubuntu and also had this problem of a completely empty dashboard, but I found a fix thanks to https://github.com/Smashing/smashing/issues/112, which suggests using puma rather than thin. So maybe the problem is with thin?

To fix I:

Now the dashboard populates!

Hope this works for you!
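
To check which of the two web servers an installation currently resolves, the lockfile can be inspected. This is only a sketch: it assumes a Gemfile.lock exists in the repository root after running bundle, and the function name is hypothetical.

```shell
# Hypothetical helper: show whether thin and/or puma are pinned in a Gemfile.lock.
server_gems_in_lockfile() {
  # Resolved gems are listed with four spaces of indentation under "specs:".
  grep -E '^    (thin|puma) \(' "${1:-Gemfile.lock}" | sed 's/^ *//'
}
```

With the bundle output shown earlier in this thread, this would be expected to print thin (1.6.4) and no puma entry.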

mocdaniel commented 2 years ago

Thank you very much, I will look into this asap! thin definitely has its quirks, so evaluating whether puma is the better choice for the future doesn't hurt anyway. :)

log1-c commented 2 years ago

The customer where this happened moved to a different monitoring system, so I can't test anything at the moment :/ ^^ But it's nice to know that there is a solution, in case I stumble upon this again :)

Thanks for the feedback and your work!