testdasi / grafana-unraid-stack

Meet Gus! He has everything you need to start monitoring Unraid (Grafana - Influxdb - Telegraf - Loki - Promtail).
GNU General Public License v3.0

influxdb and telegraf crashing after 1/27/23 update #12

Open m-bongio opened 1 year ago

m-bongio commented 1 year ago

Good morning,

Overall, huge shout out and thank you for creating this... I love the visual view into how the server is doing, and how easy this container made it to set up. After updating this morning, I noticed that it isn't displaying any data (it was fine moments before I updated it), and then noticed that influxdb and telegraf appear to be crashing. Any suggestions on how to fix this? Below is the log.


[info] Initialisation started...
[info] influxdb fixed.
[info] loki fixed.
[info] telegraf fixed.
[info] promtail fixed.
[info] grafana fixed.
[info] Initialisation complete

[info] Runing apps...
[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!
[info] Run loki as daemon on port 3100
[info] Run telegraf as service
[info] Run promtail as daemon on port 9086
[info] Run grafana as service on port 3006

[error] influxdb crashed!
[info] loki PID: 60
[info] Skip hddtemp due to USE_HDDTEMP set to no
[error] telegraf crashed!
[info] promtail PID: 75
[info] grafana PID: 91

[info] Initialisation started...
[info] influxdb fixed.
[info] loki fixed.
[info] telegraf fixed.
[info] promtail fixed.
[info] grafana fixed.
[info] Initialisation complete

[info] Runing apps...
[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!
[info] Run loki as daemon on port 3100
[info] Run telegraf as service
[info] Run promtail as daemon on port 9086
[info] Run grafana as service on port 3006

[error] influxdb crashed!
[info] loki PID: 60
[info] Skip hddtemp due to USE_HDDTEMP set to no
[error] telegraf crashed!
[info] promtail PID: 81
[info] grafana PID: 104
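
For anyone comparing their own container log against the one above, here is a minimal sketch that pulls out which services crashed and which executables were reported missing. It assumes the log uses exactly the message formats shown in this report; the function name `summarize_log` is made up for illustration and is not part of the container.

```python
import re

# Patterns based only on the messages quoted in this issue (an assumption:
# other image versions may word these lines differently).
CRASH_RE = re.compile(r"\[error\]\s+(\S+)\s+crashed!")
MISSING_RE = re.compile(r"Executable\s+(\S+)\s+does not exist!")


def summarize_log(text):
    """Return (crashed services, missing executables), deduplicated in first-seen order."""
    crashed = list(dict.fromkeys(CRASH_RE.findall(text)))
    missing = list(dict.fromkeys(MISSING_RE.findall(text)))
    return crashed, missing


sample = """\
[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!
[error] influxdb crashed!
[info] loki PID: 60
[error] telegraf crashed!
"""

crashed, missing = summarize_log(sample)
print("crashed:", crashed)
print("missing:", missing)
```

The patterns do not depend on line breaks, so the same function also works on a log that was pasted as one long line.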

rhcp011235 commented 1 year ago

Getting the same here

skyn3t1337 commented 1 year ago

Seems to be a general problem. Will there be a fix? Thanks and best regards!

ZoXx commented 1 year ago

Same problem here. Hope there will be a fix soon?

m-bongio commented 1 year ago

Looks like there was another update this morning, but influxdb and telegraf still crashed. I just rolled back to testdasi/grafana-unraid-stack:s230122, which is working fine for me.

juan11perez commented 1 year ago

I report the same problem

P6g9YHK6 commented 1 year ago

Same here...

I suppose that it is time to build the stack myself out of all the components...

SebaGnich commented 1 year ago

> Same here...
>
> I suppose that it is time to build the stack myself out of all the components...

Don't think that will help you really, because then you need to keep track of all the version compatibility yourself. It's a bummer you can't really see a Changelog in Unraid for any updates (can you?)..

rhcp011235 commented 1 year ago

another update today and it still crashes

bubba925 commented 1 year ago

Same issue. Reverted to s230122 as previously mentioned and I am good to go now.

yorch commented 1 year ago

Tried latest from March 14 and still crashed. Reverting to s230122 did the trick.

bobbo489 commented 1 year ago

I put up a PR with the fix: the cert for InfluxData changed. It just needs a merge.

yorch commented 1 year ago

@bobbo489 can you please put the link to the PR?

bobbo489 commented 1 year ago

I made the change in the static-ubuntu package, since in this repo everything is in the deprecated folder.

https://github.com/testdasi/static-ubuntu/pull/1

fapo85 commented 1 year ago

same error with a clean install here.

[info] Run influxdb as service on port 8086
Executable /usr/bin/influxd does not exist!

aslcmowmaejfo commented 1 year ago

Reverting to s230122 worked!

Flummi commented 1 year ago

where merge? :\

SebaGnich commented 1 year ago

I reverted to s230122, but after some time (and also after a clean install) everything broke again.

dlchamp commented 1 year ago

Adding to this. Not entirely sure when it broke. It's been a week or so since I last looked at my dashboard, but last night I noticed it wouldn't connect, and when I looked at the logs I saw the same error.

Tried multiple versions, all broken.

Clean installing the container didn't help either. Feels like the data is corrupted maybe, but I really don't want to have to recreate 4 dashboards by doing a proper clean install, so I'm going to attempt to restore a backup from last week and try again.

Edit:
Restoring a previous backup did not help.

stefan-matic commented 5 months ago

Managed to get it up and running after deleting the entire appdata.

What I noticed is that there were a bunch of healthcheck-failure files (more than 100,000).

I couldn't even run ls inside the Grafana-Unraid-Stack directory.

Maybe these healthcheck files need to be periodically cleaned or refactored so that this doesn't happen in the future?

Edit: It seems to be creating them again every minute
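
A periodic cleanup along the lines suggested above could look roughly like this. It is only a sketch under assumptions: the `healthcheck-failure` filename prefix is inferred from the description in this comment, and the function name, the one-hour age threshold, and the dry-run default are illustrative choices, not anything the container provides. It uses `os.scandir` so the directory is streamed rather than listed and sorted all at once, which matters when there are 100,000+ entries.

```python
import os
import time


def prune_files(directory, prefix="healthcheck-failure",
                max_age_seconds=3600, dry_run=True):
    """Remove old files whose names start with `prefix`.

    Returns the number of files that matched (and, unless dry_run is set,
    were deleted). Only regular files directly inside `directory` are
    considered; subdirectories and symlinks are left alone.
    """
    cutoff = time.time() - max_age_seconds
    matched = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.name.startswith(prefix):
                continue
            if not entry.is_file(follow_symlinks=False):
                continue
            # Keep anything modified more recently than the cutoff.
            if entry.stat(follow_symlinks=False).st_mtime > cutoff:
                continue
            if not dry_run:
                os.unlink(entry.path)
            matched += 1
    return matched
```

Calling `prune_files(path)` first with the default `dry_run=True` reports how many files would go; passing `dry_run=False` actually deletes them. The `path` would be wherever the container's appdata lives on your server. This only treats the symptom, since the files are recreated every minute until the underlying crash is fixed.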


probably related to PR#1 which needs to be merged