bellrichm / WeeWX-MQTTSubscribe

A WeeWX service and driver that receives data from MQTT.
GNU General Public License v3.0
52 stars 13 forks source link

Gaps in WeeWx data #184

Closed linuxkidd closed 10 months ago

linuxkidd commented 11 months ago

Hi, and thanks for the MQTTSubscribe plugin!

I've noticed that when my Weather Station is publishing to different MQTT topics ( Wind, TempHumPress, Solar, Rain ) .. with different frequencies for each ( Wind = 3 seconds, the rest 60 seconds, except Rain which will publish 60 seconds or when the rain sensor trips )... I often have gaps in my data, both in the database and in my graphs ( https://wx.linuxkidd.com as an example ).

My suspicion is that WeeWx only stores the data from some number of most recent received topics at the archive interval, and if, for example.. the TempHumPress value is beyond that count, it's rolled out of WeeWx queue and is recorded as NULL values. I think I can overcome this by pre-processing all the data into a single MQTT topic, but this seems wasteful of publishing all data every 3 seconds ( to maintain the real-time wind speed on my site ).

Is this a known issue? Is there a workaround for it?

Thanks!

bellrichm commented 11 months ago

Are you running MQTTSubscribe as a service or driver? What is your archival interval?

My guess is that your archival interval is ‘short’ enough that your ‘longer’ topics are not being picked up in an WeeWX interval. WeeWX ‘aggregates’ loop data into an archive data. I think the first step is to run WeeWX from the command line and see what is in the loop data and archive data. This will be written to the console.

Depending what this reveals, a couple options might be: 1) Increase your archive interval 2) Another optin might be to use MQTTSubscribe’d cache functionality I’m not a big fan of this functionality, but it does serve a purpose. -rich

linuxkidd commented 11 months ago

Hi Rich, I'm running MQTTSubscribe as a driver. The archive Interval is 5 minutes. So there should be 5 long interval topic updates on average between each archive interval, but there are 100 short interval updates ( minimum, Rain could be higher if there are frequent bucket tips ).

I do have expires_after = 120 on my topics. However, with a maximum of 60 second update interval, I wouldn't expect that to cause a problem.

I modified my topic publish logic to publish every topic every 3 seconds. This seems to have stopped the gaps, but I'd much rather fall back to the update interval I had before.

Thanks for checking into this! LK

linuxkidd commented 11 months ago

Thinking about the expires_after value, I guess it's conceivable a missed update causes the data to be more than 2 minutes old. I've adjusted that to 299, so the data will expire if it's almost 5 minutes old. I'm reverting my update intervals on my published topics now. Hopefully that'll take care of most of the issue.

Thanks again, LK

linuxkidd commented 11 months ago

Unfortunately, that didn't completely solve it... there are now gaps in the time from my last update, to now. Switching back to always sending full udpates.

bellrichm commented 11 months ago

I’m currently puzzled. It should work without having to use expires_after. MQTTSubscribe should create loop packets for each MQTT message received. WeeWX should then aggregate these into the archive record. I’ll look at the code. The following would be extremely helpful

  1. Set debug = 1 and start WeeWX intereactively.
  2. Capture the loop packet and archive record information from the console.
  3. Capture the log.
  4. Make sure the archive period is one with missing data.
  5. Attach the MQTTSubscribeDriver section of weewx.conf.

As always, double check that no sensitive information is included. rich

linuxkidd commented 11 months ago

Hi Rich, Thanks again for taking time with this. I've set debug = 1 and waiting for a problem interval. In the interim, attaching the MQTTSubscribeDriver section. ap.cr1000.MQTTSubscribeDriver.txt

Thanks!

linuxkidd commented 11 months ago

Well, that didn't take long.. ( thankfully ). Log timestamps:

weewx_ap_miss.b.log.gz

Note: this is the full log file from WeeWx for this config.

Thanks again! :)

bellrichm commented 11 months ago

You’ve got me stumped. I grepped your log for ‘TempHumPress’ and ‘final loop packet’. It looks like MQTTSubscribe is creating the loo packets. WeeWX should be aggregating these into an archive record. If your up for it, run WeeWX from the command line. It will write out the loop packets and archive records to the the terminal. Also capture the complete WeeWX log, not just MQTTSubscribe. I’m definitely fishing for information…. I run MQTTSubscribe as a driver. One topic is about every 2 seconds and the other is 20 to 30 seconds and I’m not seeing this…. rich

linuxkidd commented 11 months ago

Hi Rich, Apologies, I did include all output log lines from the WeeWx service, but I see now when running the process interactive, it outputs the LOOP lines, but didn't in the logs.

I am however concerned.. even with debug = 1 in the conf file, the console version of the output is -only- outputting the LOOP lines, not any logging from MQTT or my Theme, etc.

I'm running it only with the config file as a param, not including the --daemon option. Is there something I'm missing to get all the log lines on stdout or stderr?

Thanks again for your time looking into this!

bellrichm commented 11 months ago

No apologies are necessary. When running interactivity, loop packets and archive record are printed to the console (i believe stdout, but I’d have to verify that). All logging goes to the ‘usual’ place, no matter the debug level or invocation. So, you have to capture both ‘locations’. I thought your previous attached log only had MQTTSubscribe log info, not base WeeWX log messages nor other extensions… There are probably a lot of nuances I am misstating. Like with systemd you can get stdout in the ‘log’… rich Yes running something like weewxd weewx.conf is what I am asking for. Obviously depending on current directory and install method, certain paths may be required. Let me know anything else I can do.

linuxkidd commented 11 months ago

Hi Rich, I'm going through the data and hope to post something shortly. Thanks again, just wanted you to know I've not disappeared. :)

linuxkidd commented 11 months ago

Hi Rich, Thanks for your patience. I'm providing two log files -- one is the console stdout + stderr output, the other is what was sent to syslog ( or, in this particular instance, journalctl ). I'm also including two screenshots showing the start and end of a rather large gap ( Last non-null entry: 00:25 --> Next non-null entry: 03:25 on Oct 8th ).

When looking through the console output log, the following traceback occurs around 22:55:16 on Oct 7th:

Traceback (most recent call last):
  File "/usr/share/weewx/weewx/engine.py", line 214, in run
    self.dispatchEvent(weewx.Event(weewx.CHECK_LOOP, packet=packet))
  File "/usr/share/weewx/weewx/engine.py", line 245, in dispatchEvent
    callback(event)
  File "/usr/share/weewx/weewx/engine.py", line 634, in check_loop
    raise BreakLoop
weewx.engine.BreakLoop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/weewx/weewx/engine.py", line 650, in post_loop
    self._catchup(self.engine.console.genArchiveRecords)
  File "/usr/share/weewx/weewx/engine.py", line 697, in _catchup
    for record in generator(lastgood_ts):
  File "/usr/share/weewx/user/MQTTSubscribe.py", line 1710, in genArchiveRecords
    raise NotImplementedError
NotImplementedError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/weewx/weedb/mysql.py", line 54, in guarded_fn
    return fn(*args, **kwargs)
  File "/usr/share/weewx/weedb/mysql.py", line 266, in execute
    self.cursor.execute(mysql_string, tuple(sql_tuple))
  File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/usr/lib/python3/dist-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/usr/lib/python3/dist-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/weewx/weewxd", line 249, in <module>
    main()
  File "/usr/share/weewx/weewxd", line 154, in main
    engine.run()
  File "/usr/share/weewx/weewx/engine.py", line 221, in run
    self.dispatchEvent(weewx.Event(weewx.POST_LOOP))
  File "/usr/share/weewx/weewx/engine.py", line 245, in dispatchEvent
    callback(event)
  File "/usr/share/weewx/weewx/engine.py", line 652, in post_loop
    self._software_catchup()
  File "/usr/share/weewx/weewx/engine.py", line 715, in _software_catchup
    self.engine.dispatchEvent(weewx.Event(weewx.NEW_ARCHIVE_RECORD,
  File "/usr/share/weewx/weewx/engine.py", line 245, in dispatchEvent
    callback(event)
  File "/usr/share/weewx/weewx/engine.py", line 675, in new_archive_record
    dbmanager.addRecord(event.record,
  File "/usr/share/weewx/weewx/manager.py", line 403, in addRecord
    self._updateHiLo(accumulator, cursor)
  File "/usr/share/weewx/weewx/manager.py", line 1193, in _updateHiLo
    self._set_day_summary(_stats_dict, accumulator.timespan.stop, cursor)
  File "/usr/share/weewx/weewx/manager.py", line 1611, in _set_day_summary
    cursor.execute(_sql_replace_str, _write_tuple)
  File "/usr/share/weewx/weedb/mysql.py", line 63, in guarded_fn
    raise klass(e)
weedb.DatabaseError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')

Unfortunately, it looks like weewxd stopped outputting to console after this traceback -- but did continue logging to journalctl.

Note: The console log file also contains output from mosquitto_sub -v for the topics this weewxd instance subscribes to ( cr1000/# and weatherflow/ap/# ). My thought was that these would mix between the LOOP flush lines from weewxd and give a better timeline / show that the messages were hitting the MQTT broker. And this did hold true up until the tracebacks above. Past that point, only the MQTT messages are printed into that file.

Hopefully you can find something from this data set.. if not, I'll try again. :)

Logs: https://wx.linuxkidd.com/weewx_debug_2023-10-07.zip Screenshots showing the gap: Screenshot from 2023-10-09 15-34-18 Screenshot from 2023-10-09 15-34-26

bellrichm commented 11 months ago

Sorry for the delay, I am puzzled by what I am seeing. I’ll work on summarizing it. Unfortunately it is more questions than answers. rich

linuxkidd commented 10 months ago

Hi Rich, no worries.. I'm glad I didn't miss something obvious.. :)

One thing I haven't mentioned that's atypical from a standard install.. I'm using Mariadb for the storage back end. I don't think it makes a difference, but thought it's worth knowing.

bellrichm commented 10 months ago

In the console messages.
Grepping on 'REC:' shows the following archive records have been created.

REC:    2023-10-07 22:20:00 MDT (1696738800) …
REC:    2023-10-07 22:25:00 MDT (1696739100) …
REC:    2023-10-07 22:30:00 MDT (1696739400) …
REC:    2023-10-07 22:35:00 MDT (1696739700) …
REC:    2023-10-07 22:40:00 MDT (1696740000) …
REC:    2023-10-07 22:45:00 MDT (1696740300) …
REC:    2023-10-07 22:50:00 MDT (1696740600) …

We see a loop packet LOOP: 2023-10-07 22:55:16 MDT (1696740916) created and the some exceptions are raised.
The first exception, weewx.engine.BreakLoop, is WeeWX’s way of breaking out of the generation of loop packets to create the archive record.
The second exception, NotImplementedError, is raised by MQTTSubscribeDriver because it does support creating archive records.
WeeWX catches this and starts the process to create the archive record from the loop packets.
All of this normal processing. As WeeWX attempts to write this record to the database a database deadlock is raised. This is where things get interesting.
I would have expected this to either bring WeeWX completely down or see some retry messages.

Unfortunately the syslog doesn't start until 10/08, so no information about the exception is in it. But grepping for 'latest' shows that WeeWX is still raising the BreakLoop exception and dispatching the POST_Loop event to kick off the report generation. But this event is not being 'handled' to update the database. (Note, I'd have to dig a bit more to get the exact event handling)

Another thing that confuses me is that your chart images show the outage starting at 10/08 at 12:25 and ending 10/08 at 3:25. I would have expect it to start at 10/07 at 10:55 (the time of the deadlock exception). And I see nothing that signifies WeeWX restarting.

Where do we go from here... Not sure. First question, the deadlock exception implies that some other process is updating the database. Do you have anything else updating it? Second, if the deadlock exception is the root cause, I would expect/hope that syslog would have more information on what is happening. So capturing that around the time of the start to the end of the gaps hopefully will yield more information.

Sorry I don't have more at this time. In the meantime, I'll reacquaint myself with WeeWX event handling.  

rich

linuxkidd commented 10 months ago

Hi Rich, Thanks for the update. I did have a process that runs an update to the database.. polling data from WeatherFlow's API and adjusting rain and lightning events data that they "corrected" via their crowd source adjustments. However, this process only ran at 3:03 am each day. So, the 22:55 event doesn't align with when anything else would be accessing the database.

There are multiple WeeWx instances running, but they have unique configs, databases, and mysql credentials. So there's practically zero chance of a database update collision from the different instances..

Is there any chance that the WeeWx process is doing something in parallel that could cause the transaction deadlock? Here's syslog from the 7th (through the 8th), hope that helps with the SQL deadlock bit.

https://wx.linuxkidd.com/weewx_debug.7-8.zip

Thanks again for your help!

bellrichm commented 10 months ago

Mike,

Base WeeWX has a main thread that interacts with the drivers and updates the DB. It launches a report thread every archive. So if something goes wrong in the report thread, the main thread stays up. Its possible an extension (service or skin) is launching a thread that updates the DB. But below I noted what I think is causing the deadlock, a second WeeWX process got launched.

I agree that nothing in your setup, multiple instances, separate process at 3:03, etc. line up with this outage.

In your log I noticed that the version of MQTTSubscribe is quite old. That is fine, but I recently added more logging in a development version. This would give insight into the actual archive record that is generated. We might need to use this, if you are up for it.

A lot of good information. But unfortunately I still see nothing that explains the gaps. To me, its looking less likely that it is MQTTSubscribe. But I can't be sure. We might have to start 'over' with some additional logging. I'll follow up with a different post with some possible ideas. Below is what I have inferred from the log

Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 17:55:48 vpn.linuxkidd.com python3[2339583]: weewx[2339583] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696722948.305873-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0
Oct 07 18:13:08 vpn.linuxkidd.com python3[2339583]: weewx[2339583] CRITICAL __main__: Keyboard interrupt.

From above, looks like you killed a WeeWX process running in the foreground. Then 3 WeeWX processes were started and killed. Unless you did that as part of getting the log and console, that is a bit strange.

Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 18:15:05 vpn.linuxkidd.com python3[2348266]: weewx[2348266] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696724105.617899-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

Oct 07 18:19:21 vpn.linuxkidd.com python3[2348266]: weewx[2348266] INFO __main__: Received signal TERM (15).
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 18:20:31 vpn.linuxkidd.com python3[2350656]: weewx[2350656] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696724431.861790-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

Oct 07 19:07:33 vpn.linuxkidd.com python3[2350656]: weewx[2350656] INFO __main__: Received signal TERM (15).
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 19:07:49 vpn.linuxkidd.com python3[2372183]: weewx[2372183] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696727269.295238-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

Oct 07 19:13:06 vpn.linuxkidd.com python3[2372183]: weewx[2372183] INFO __main__: Received signal TERM (15).

Then this is started and left running.

Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 19:23:14 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696728194.771473-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

Then another one is started this is the one that gets the deadlock when trying to update the DB. So that explains the deadlock.

Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 07 22:18:43 vpn.linuxkidd.com python3[2446971]: weewx[2446971] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696738723.592871-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

The second WeeWX process gets errors trying to insert records that the first has already inserted.

Oct 07 22:20:17 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:20:00 MDT (1696738800) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696738800' for key 'PRIMARY'")
Oct 07 22:25:17 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:25:00 MDT (1696739100) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696739100' for key 'PRIMARY'")
Oct 07 22:30:18 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:30:00 MDT (1696739400) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696739400' for key 'PRIMARY'")
Oct 07 22:35:17 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:35:00 MDT (1696739700) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696739700' for key 'PRIMARY'")
Oct 07 22:40:18 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:40:00 MDT (1696740000) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696740000' for key 'PRIMARY'")
Oct 07 22:45:16 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:45:00 MDT (1696740300) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696740300' for key 'PRIMARY'")
Oct 07 22:50:17 vpn.linuxkidd.com python3[2446971]: weewx[2446971] ERROR weewx.manager: Unable to add record 2023-10-07 22:50:00 MDT (1696740600) to database 'weewx_astroparadise': (1062, "Duplicate entry '1696740600' for key 'PRIMARY'")

And eventually the deadlock and dies.

Oct 07 22:55:18 vpn.linuxkidd.com python3[2446971]: weewx[2446971] CRITICAL __main__: Caught unrecoverable exception:
Oct 07 22:55:18 vpn.linuxkidd.com python3[2446971]: weewx[2446971] CRITICAL __main__:     ****  (1213, 'Deadlock found when trying to get lock; try restarting transaction')
Oct 07 22:55:18 vpn.linuxkidd.com python3[2446971]: weewx[2446971] CRITICAL __main__:     ****  Exiting.

MQTTSubscribe gets disconnected from the MQTT broker, but recovers.

Oct 08 02:12:07 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO user.MQTTSubscribe: (Driver) Disconnected with result code 1

Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO user.MQTTSubscribe: (Driver) Connected with result code 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO user.MQTTSubscribe: (Driver) Connected flags {'session present': 0}
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 10 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 11 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 12 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 13 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 14 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 15 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 16 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 17 and rc 0
Oct 08 02:12:08 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696728194.771473-windGust-windGustDir-windDir-windSpeed has a mid 18 and rc 0

Belchertown has trouble downloading forecast data. Probably a network glitch...

Oct 08 06:00:46 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.reportengine: Caught unrecoverable exception in generator 'weewx.cheetahgenerator.CheetahGenerator'
Oct 08 06:00:46 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.reportengine:         ****  Error downloading forecast data. and the error is: The read operation timed out

And later the MQTT publish also has a problem retrieving data from the db.Unfortunately WeeWX doesn't give us much information.

Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: MQTT: Unexpected exception of type <class 'pymysql.err.InterfaceError'>
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: *** Traceback (most recent call last):
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/restx.py", line 382, in run_loop
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     self.process_record(_record, dbmanager)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/user/mqtt.py", line 472, in process_record
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     record = self.get_record(record, dbm)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/restx.py", line 294, in get_record
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     _result = dbmanager.getSql(
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/manager.py", line 579, in getSql
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     _cursor.execute(sql, sqlargs)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/weedb/mysql.py", line 54, in guarded_fn
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     return fn(*args, **kwargs)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/share/weewx/weedb/mysql.py", line 266, in execute
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     self.cursor.execute(mysql_string, tuple(sql_tuple))
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 170, in execute
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     result = self._query(query)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 328, in _query
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     conn.query(q)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 516, in query
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     self._execute_command(COMMAND.COM_QUERY, sql)
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 750, in _execute_command
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: ***     raise err.InterfaceError("(0, '')")
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] ERROR weewx.restx: *** pymysql.err.InterfaceError: (0, '')
Oct 08 06:36:19 vpn.linuxkidd.com python3[2378551]: weewx[2378551] CRITICAL weewx.restx: MQTT: Thread terminating. Reason: (0, '')
Oct 08 06:36:22 vpn.linuxkidd.com /etc/mysql/debian-start[2644056]: ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
Oct 08 06:36:22 vpn.linuxkidd.com /etc/mysql/debian-start[2644056]: FATAL ERROR: Upgrade failed

And another process has an MQTT publish error.

Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: MQTT: Unexpected exception of type <class 'pymysql.err.InterfaceError'>
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: *** Traceback (most recent call last):
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/restx.py", line 382, in run_loop
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     self.process_record(_record, dbmanager)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/user/mqtt.py", line 472, in process_record
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     record = self.get_record(record, dbm)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/restx.py", line 294, in get_record
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     _result = dbmanager.getSql(
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/weewx/manager.py", line 579, in getSql
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     _cursor.execute(sql, sqlargs)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/weedb/mysql.py", line 54, in guarded_fn
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     return fn(*args, **kwargs)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/share/weewx/weedb/mysql.py", line 266, in execute
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     self.cursor.execute(mysql_string, tuple(sql_tuple))
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 170, in execute
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     result = self._query(query)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 328, in _query
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     conn.query(q)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 516, in query
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     self._execute_command(COMMAND.COM_QUERY, sql)
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***   File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 750, in _execute_command
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: ***     raise err.InterfaceError("(0, '')")
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] ERROR weewx.restx: *** pymysql.err.InterfaceError: (0, '')
Oct 08 06:36:22 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] CRITICAL weewx.restx: MQTT: Thread terminating. Reason: (0, '')

And what I am calling the main process has an error and recovers.

Oct 08 06:40:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] CRITICAL __main__: Database connection exception: (1927, 'Connection was killed')
Oct 08 06:40:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] CRITICAL __main__:     ****  Waiting 60 seconds then retrying...
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/TempHumPress has a mid 1 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Wind has a mid 2 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Rain has a mid 3 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to cr1000/Solar has a mid 4 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/obs_st has a mid 5 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/evt/strike has a mid 6 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status_hub has a mid 7 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to weatherflow/ap/status/tempest has a mid 8 and rc 0
Oct 08 06:41:16 vpn.linuxkidd.com python3[2378551]: weewx[2378551] DEBUG user.MQTTSubscribe: (Driver) Subscribing to 1696768876.441552-windGust-windGustDir-windDir-windSpeed has a mid 9 and rc 0

And 1701232 recovers from some error...

Oct 08 06:41:16 vpn.linuxkidd.com python3[1701232]: turtleherding[1701232] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)

And the 'main' process is killed

Oct 08 08:22:38 vpn.linuxkidd.com python3[2378551]: weewx[2378551] INFO __main__: Received signal TERM (15).

And a new one is started, but not logging DEBUG messages.

Oct 08 08:22:48 vpn.linuxkidd.com python3[2695547]: ap[2695547] INFO weewx.engine: Loading station type MQTTSubscribeDriver (user.MQTTSubscribe)
linuxkidd commented 10 months ago

HI Rich, Ya, I restarted WeeWx multiple times, trying to get the full logging output working right ( not realizing it was still writing to syslog / journalctl even though I was running in foreground. I'm glad you figured out the database deadlock, and it easily fits with what I'd been doing.

I also ran some strace instances against WeeWx hope'n to find the culprit, but alas, they got cleaned from /tmp before I had a chance to review them.

I'm happy to upgrade MQTTSubscribe and see if we can get some more detail. The second instance of WeeWx is running as turtleherding.. Interestingly, I get gaps on both instances, but they don't align.

Oct 08 06:36:22 vpn.linuxkidd.com /etc/mysql/debian-start[2644056]: ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
Oct 08 06:36:22 vpn.linuxkidd.com /etc/mysql/debian-start[2644056]: FATAL ERROR: Upgrade failed

This is from a restart of the Mariadb instance.. there's an upgrade script that runs during the startup to check for the need for upgrade, but since I have a password on the root account, it always fails. I run the upgrade manually / interactively any time the package updates so I can supply the password.

The process term / relaunch without debug at 08:22 was when I stopped the collection and re-fired the default service after removing the debug option.

Upgrading MQTTSubscribe now, and can spend more time collecting data in the next day or two.

Thanks again for your help! Michael

linuxkidd commented 10 months ago

Upgraded to latest commits, reporting Version is 2.2.3-rc02. I'm running in debug mode on console output now.. I've had some pretty big gaps in the past day, so hopefully by morning it'll be reproduced.

bellrichm commented 10 months ago

You’re quick :) I was also going to ask you to set log_success = True and log_failure = True under [StdArchive]. My understanding is this will log success and failures of the actual db update. We can see if the update to MQTTSubscribe uncovers anything first. Your call. Rich

linuxkidd commented 10 months ago

Zero gaps overnight.. So, I just adjusted my settings as you requested and restarted the process from console again. Eagerly awaiting another failure... :D

bellrichm commented 10 months ago

Gotta’ love intermittent bugs. I’ll get notified when you have something.

linuxkidd commented 10 months ago

Still not a single blip in either of my WeeWx instances after upgrading the MQTTSubscribe extension. I'm gonna go back to normal logging and running.. if there's still nothing by end of week, I'll close the issue. If it does blip, I'll re-enable debug and try to catch it again.

Thanks, Rich! I hope it's just the code update that fixes it. :)

bellrichm commented 10 months ago

Sounds like a good plan. I did do a quick review of the commits between the MQTTSubscribe versions, nothing jumped out at me. I'll take another look. For my reference, you were at 1.6.2.

linuxkidd commented 10 months ago

Hi Rich, not a single new gap since the upgrade of MQTTSubscribe. Closing the ticket.. :) Thanks again for your help! o/

bellrichm commented 10 months ago

Thanks for the update. From the commit comments I do not see anything that would be a fix, but I might have got sloppy/lazy… Glad its now working, if something comes up you know where to find me. FYI, I will soon be making a formal release. It will be the commit you are running withe version updated. rich