pimoroni / enviro

MIT License
105 stars 85 forks source link

Allow for more verbose logging #68

Open MrDrem opened 2 years ago

MrDrem commented 2 years ago

When I've been trying to troubleshoot issues, I've not found the log files to be all that helpful. It would be nice to have an option for more verbose logging to try and better work out where things are going wrong when they are.

lowfatcode commented 2 years ago

In general we're open to this but perhaps it's better to add logging on a case by case basis when issues are found instead of just peppering it throughout the whole system.

Do you have a specific issue that was hard to pin down that could have been identified more easily with extra logging?

MrDrem commented 2 years ago

I've had a few times where the system has stopped responding (white light stuck on) which I have suspected is either a sensor fail (although I suspect that I may have been wrong now) or that the upload to io.adafruit.com has got stuck (I suspect this honestly). This is part of the issue of being on the bleeding edge ;)

In almost every case I've not been able to work out exactly what has caused the issue. I haven't kept all of the log files that I've pulled sadly, so it's hard to show. The last log file I have 2022-09-05 log.txt the error was that the rain.txt file was missing, which I realised, but the log file wouldn't have helped with much I don't think.

Maybe it could be done as a different fork of the code that is specifically for issue hunting, so that most users never see it?

MrDrem commented 2 years ago

As you can see from this log file, 2022-09-05a log.txt, the device stopped at 13:15, but there is no sign of why.

MrDrem commented 2 years ago

It once again stopped at 22:40:02, but I'm not sure why, and again the log file isn't much help.

This was during the thunderstorms last night, but I didn't have the rain sensor in, as it stops the wake alarms (see issue #70)

lowfatcode commented 2 years ago

I have modules here running for many days without issue so I don't know how to recreate the issue you're having with it (crashing?) in the night. We are discussing whether we could run some sort of watchdog on the second core that resets the device if it doesn't respond for a period of time.

MrDrem commented 2 years ago

It seems stable when I've not got sensors connected. My sensors are rather old (checking @mitchamweather on twitter, I got them at the end of October 2013) but they do seem to be working fine, and normally reporting readings as I would expect (excepting for issues I'm logging here)

If I look at this crash, it wakes, turns on the LED, then stops, it doesn't even get to checking for files to upload. Which I guess gives somewhere to look in the code at.

Looking at it again, I spy "2022-09-05 16:32:47 [info / 87kB] - uploaded 2022-09-05T16:32:41Z.json to adafruit_io" which suggests that there wasn't a readings file created at 13:15, which may narrow the range down a bit.

I've forked this, and have added some simple info lines to confirm what has been done into main.py.

2022-09-05 13:15:02 [debug    / 116kB] > performing startup
2022-09-05 13:15:02 [info     / 114kB]   - wake reason: unknown
2022-09-05 13:15:02 [debug    / 112kB]   - turn on activity led
2022-09-05 16:32:39 [debug    / 116kB] > performing startup
2022-09-05 16:32:39 [info     / 114kB]   - wake reason: unknown
2022-09-05 16:32:39 [debug    / 112kB]   - turn on activity led
2022-09-05 16:32:41 [info     /  99kB] > 1 cache files need uploading
2022-09-05 16:32:41 [info     /  97kB] > connecting to wifi network 'Celestia'
2022-09-05 16:32:43 [debug    / 122kB]   - connecting
2022-09-05 16:32:46 [debug    / 108kB]   - connecting
2022-09-05 16:32:46 [info     / 105kB]   - ip address:  192.168.1.208
2022-09-05 16:32:47 [info     /  87kB]   - uploaded 2022-09-05T16:32:41Z.json to adafruit_io
2022-09-05 16:32:47 [info     /  84kB] > going to sleep
2022-09-05 16:32:47 [debug    / 122kB]   - clearing and disabling timer and alarm
2022-09-05 16:32:47 [info     / 120kB]   - setting alarm to wake at 16:35pm
2022-09-05 16:32:47 [info     / 118kB]   - shutting down
MrDrem commented 2 years ago

With my additional logging, the above has now become:

2022-09-06 23:35:02 [debug    / 114kB] > performing startup
2022-09-06 23:35:02 [info     / 123kB]   - wake reason: unknown
2022-09-06 23:35:02 [debug    / 122kB]   - turn on activity led
2022-09-06 23:35:02 [info     / 120kB]   - config loaded
2022-09-06 23:35:02 [info     / 118kB]   - clock already set
2022-09-06 23:35:02 [info     / 115kB]   - disk space OK
2022-09-06 23:35:02 [info     / 112kB]   - wind speed started

which looks like this when it's working corrrectly:

2022-09-06 23:30:02 [debug    / 114kB] > performing startup
2022-09-06 23:30:02 [info     / 123kB]   - wake reason: unknown
2022-09-06 23:30:02 [debug    / 122kB]   - turn on activity led
2022-09-06 23:30:02 [info     / 120kB]   - config loaded
2022-09-06 23:30:02 [info     / 118kB]   - clock already set
2022-09-06 23:30:02 [info     / 116kB]   - disk space OK
2022-09-06 23:30:02 [info     / 113kB]   - wind speed started
2022-09-06 23:30:03 [info     / 107kB]   - Not enough wind ticks captured
2022-09-06 23:30:03 [info     / 100kB]   - wind direction captured
2022-09-06 23:30:03 [info     /  98kB]   - sensor readings OK
2022-09-06 23:30:04 [info     /  95kB]   - cached OK
2022-09-06 23:30:04 [info     /  92kB] > 1 cache files need uploading
2022-09-06 23:30:04 [info     /  90kB] > connecting to wifi network 'Celestia'
2022-09-06 23:30:06 [debug    / 118kB]   - connecting
2022-09-06 23:30:09 [debug    / 105kB]   - connecting
2022-09-06 23:30:11 [info     /  95kB]   - ip address:  192.168.1.208
2022-09-06 23:30:12 [info     / 114kB]   - uploaded 2022-09-06T23:30:03Z.json to adafruit_io
2022-09-06 23:30:12 [info     / 112kB] > going to sleep
2022-09-06 23:30:12 [debug    / 110kB]   - clearing and disabling timer and alarm
2022-09-06 23:30:12 [info     / 107kB]   - setting alarm to wake at 23:35pm
2022-09-06 23:30:12 [info     / 105kB]   - shutting down

I'm adding more in slowly to try and work out where in the wind speed monitoring I'm having issues. I'll open a new log if/when I work out what is going on.

ZodiusInfuser commented 1 year ago

@MrDrem Do you have a list of the places where we should include more logging. There is a trade-off with more logging of course, but in my own look at this code I have benefitted from more, so agree that it should be an option (perhaps in the config.py?).

MrDrem commented 1 year ago

I've been adding in additional logging in my personal fork - https://github.com/MrDrem/enviro.git

Generally I've tried to add it where there is an action happening, so for example on the wind speed sensor I've added a log every tick, added a log entry for there not being enough ticks, and a confirmation that the wind speed reading process has been completed.

I think that this may be best as hashed comments, where the hash can be removed within the code to enable the additional logging lines. This means that most users never see it or the extra space, but if you want to enable it, it's easy to do, and can be easily targeted to the area where you think the issue might be.

ZodiusInfuser commented 1 year ago

@MrDrem There is more logging in the patch branch now. I am not sure it makes sense to include all of the ones from your fork, so let me know if there are any you think are essential and I'll consider adding them

MrDrem commented 1 year ago

I'll try to take a clone of the patch branch tonight, and see if I think there is anywhere else where logging might be good.

I suspect that I'll find that I like very verbose logging, which might be good to put in (space allowing) as commented out lines so that people can easily up the logging level as they require.

ZodiusInfuser commented 1 year ago

Try doing a fresh clone that pulls in phew too. I've just raised a pull request over there https://github.com/pimoroni/phew/pull/17 (and linked the patch enviro branch to it) that lets you set which of the (currently) five logging messages are actually logged. This does mean that a lot of the enviro logs are mislabelled, so some tidy up will be needed there. Some additional log types may be handy too, though they can't be specific to enviro.

ZodiusInfuser commented 1 year ago

There is a new release that adds more logging and control over logging: https://github.com/pimoroni/enviro/releases/tag/v0.0.9 I would welcome you looking over it and seeing what logging would still be of benefit.