Closed jjpavlik closed 4 years ago
So:
NameError: name 'Exceprion' is not defined
NameError: name 'receive_message' is not defined
These two are addressed in https://github.com/jjpavlik/homemetrics/tree/Issue-%2318 . There's something interesting here though: the second crash (the receive_message one) suggests the Arduino sent an error message back! :O I need to find out what it was, because I didn't get to see whether the ERRORS counter had increased on the LCD. Leaving this branch running for a few days to see if it catches anything.
Looks like on the test I ran last night (unplugging the network cable and plugging it back in):
However... collector.py is now, for some reason, steadily producing 20 messages per period, and since pusher only consumes up to 10 per period, the queue has been growing by 10 messages per period since then:
Collector logs show the following since last night:
2020-03-14 22:05:38,268 - root - INFO - A few measurements queuing locally :O 6 trying to push them now
2020-03-14 22:05:38,792 - root - INFO - A few measurements queuing locally :O 6 trying to push them now
2020-03-14 22:05:39,306 - root - INFO - A few measurements queuing locally :O 6 trying to push them now
2020-03-14 22:05:39,912 - root - INFO - A few measurements queuing locally :O 6 trying to push them now
This suggests the queue never goes back to 0 and that it is stuck at 6 queued measurements. I believe the problem is the retry mechanism in https://github.com/jjpavlik/homemetrics/blob/Issue-%2318/collector.py#L59 . The main issue seems to be that the for loop doesn't remove the measurements that were actually pushed from the list, and it re-appends (duplicating) the ones that failed while the interface was down.
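To illustrate the idea behind a fix, here is a minimal sketch of a retry pass that builds a fresh list of only the measurements that failed, instead of appending to the list being iterated. This is not the actual collector.py code: `flush_queue`, the `push` callable and the dict-shaped measurements are hypothetical names used only for illustration, and I'm assuming a failed push surfaces as an `OSError` (which `ConnectionError` subclasses) or a falsy return value.

```python
from typing import Callable, List


def flush_queue(queue: List[dict], push: Callable[[dict], bool]) -> List[dict]:
    """Try to push every queued measurement once.

    Returns a new list holding only the measurements that failed, so
    pushed items are dropped and failed items are never duplicated.
    """
    still_pending = []
    for measurement in queue:
        try:
            ok = push(measurement)  # hypothetical push function
        except OSError:
            # Assumption: network failures show up as OSError subclasses.
            ok = False
        if not ok:
            still_pending.append(measurement)
    return still_pending


def flaky_push(measurement: dict) -> bool:
    """Stand-in for a push over a flapping link: odd ids fail."""
    if measurement["id"] % 2:
        raise ConnectionError("link down")
    return True


def healthy_push(measurement: dict) -> bool:
    """Stand-in for a push once the link is back up."""
    return True


queue = [{"id": n} for n in range(6)]
queue = flush_queue(queue, flaky_push)           # only the failures remain
after_recovery = flush_queue(queue, healthy_push)  # drains once the link is up
```

Because the caller replaces the queue with the returned list, the queue can shrink back to 0 once the link recovers, rather than staying pinned at the same size on every pass.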
Just pushed commit https://github.com/jjpavlik/homemetrics/commit/8386fe82c44db56f7604795831b73730755d3094#diff-f306c1eaf970996da5b4dddb8261d048
Now the queue size is slowly going down; however, I still need to check that collector.py can gracefully survive a link down/link up scenario.
After 2 days, and having tested forcing a link down/up, it looks like things are on track. Will close this one and see if the problem shows up again in the future.
Linked to https://github.com/jjpavlik/homemetrics/issues/12 : the problem keeps happening.