blog-eivindgl-com / netatmo-pyportal-display

CircuitPython project to use PyPortal as display for a Netatmo weather station
MIT License

OutOfRetries: Repeated socket failures #10

Open eloekset opened 1 year ago

eloekset commented 1 year ago

An unhandled exception is crashing the display. [image]

eloekset commented 1 year ago

Maybe this is a workaround: https://github.com/adafruit/Adafruit_CircuitPython_AzureIoT/issues/44#issuecomment-1127489978

eloekset commented 1 year ago

I'm not 100% sure yet, but it looks like this finally did the trick: https://github.com/adafruit/Adafruit_CircuitPython_ESP32SPI/issues/170#issuecomment-1676183587

My Netatmo PyPortal display has been running for several hours without any crashes or hangs today after applying this fix.

eloekset commented 1 year ago

It works much better now, but it can still freeze. [image] The device was connected to the VSCode console overnight and hung on a refresh after pyportal.fetch().

Judging by the latest values in the VSCode console log, the display was updated: [image]

But the next call to pyportal.fetch() caused the freeze after "Reply is OK!" was logged.

eloekset commented 1 year ago

The log in the VSCode console shows that 392 data-loading iterations were performed before the freeze. After adding TimeoutError to the except block and resetting the PyPortal, it ran just 10 iterations before the next freeze. The freezes seem to occur at random.

    if (not weather_refresh) or (time.monotonic() - weather_refresh) > 60:
        try:
            value = pyportal.fetch()
            reload_count = reload_count + 1
            print("#%d: Response is" % reload_count, value)
            gfx.draw_display(value)
            gfx.clear_error()
            weather_refresh = time.monotonic()
        except (ValueError, RuntimeError, ConnectionError, OSError, TimeoutError) as e:
            print("Some error occured, retrying! -", e)
            esp.reset()
            esp.disconnect()
            pyportal.network.connect()
            time.sleep(5)
            continue

Console output up to the freeze:

Free mem before updates: 35184B
Retrieving data...Reply is OK!
#10: Response is {"widgets": [{"minTime": "26.08 22:04", "maxTime": "27.08 07:50", "maxValue": "24.5°C", "description": "Stua", "minValue": "24.1°C", "type": "temperature", "value": "24.5°C", "trend": "stable"}, {"minTime": "27.08 04:22", "maxTime": "27.08 07:49", "maxValue": "15.0°C", "description": "Vestveggen ute", "minValue": "11.9°C", "type": "temperature", "value": "15.0°C", "trend": "up"}, {"minTime": "26.08 22:04", "maxTime": "27.08 07:19", "maxValue": "29.0°C", "description": "Serverskapet", "minValue": "28.4°C", "type": "temperature", "value": "29.0°C", "trend": "stable"}, {"minTime": "27.08 04:52", "maxTime": "27.08 07:53", "maxValue": "15.5°C", "description": "Østveggen ute", "minValue": "12.8°C", "type": "temperature", "value": "15.5°C", "trend": "up"}, {"outTemp": 15, "batteryIndicators": [{"moduleName": "Vestveggen ute", "batteryLevel": 77}, {"moduleName": "Serverskapet", "batteryLevel": 83}, {"moduleName": "Østveggen ute", "batteryLevel": 85}, {"moduleName": "EiVind", "batteryLevel": 82}], "description": "Vestveggen ute", "sunOrMoon": "sun", "type": "humidity", "value": "86", "batteryLevel": 77}, {"outTemp": 29, "batteryIndicators": [{"moduleName": "Vestveggen ute", "batteryLevel": 77}, {"moduleName": "Serverskapet", "batteryLevel": 83}, {"moduleName": "Østveggen ute", "batteryLevel": 85}, {"moduleName": "EiVind", "batteryLevel": 82}], "description": "Serverskapet", "sunOrMoon": "sun", "type": "humidity", "value": "37", "batteryLevel": 83}, {"outTemp": 15.5, "batteryIndicators": [{"moduleName": "Vestveggen ute", "batteryLevel": 77}, {"moduleName": "Serverskapet", "batteryLevel": 83}, {"moduleName": "Østveggen ute", "batteryLevel": 85}, {"moduleName": "EiVind", "batteryLevel": 82}], "description": "Østveggen ute", "sunOrMoon": "sun", "type": "humidity", "value": "84", "batteryLevel": 85}, {"angle": "270", "maxAngle": "21", "maxValue": "2.2m/s", "description": "Vind", "maxTime": "26.08 23:21", "type": "wind", "value": "0.6m/s", "batteryLevel": 82}, 
{"angle": "334", "maxAngle": "21", "maxValue": "2.2m/s", "description": "Kast", "maxTime": "26.08 23:21", "type": "wind", "value": "1.1m/s", "batteryLevel": 82}]}
tempInt:  24
tempDec:  .5°C
Set trend icon to /icons/wi-trend-flat.bmp
modules length: 2
This temp: 15.5°C was parsed to 15.5
low temp: 15.0°C was parsed to 15.0
lowTempModule: {'value': '15.0°C', 'trend': 'up', 'minTime': '27.08 04:22', 'description': 'Vestveggen ute', 'maxTime': '27.08 07:49', 'type': 'temperature', 'maxValue': '15.0°C', 'minValue': '11.9°C'}
tempInt:  15
tempDec:  .0°C
Set trend icon to /icons/wi-trend-up.bmp
modules length: 2
This temp: 15.5°C was parsed to 15.5
low temp: 15.0°C was parsed to 15.0
lowTempModule: {'value': '15.0°C', 'trend': 'up', 'minTime': '27.08 04:22', 'description': 'Vestveggen ute', 'maxTime': '27.08 07:49', 'type': 'temperature', 'maxValue': '15.0°C', 'minValue': '11.9°C'}
tempInt:  15
tempDec:  .0°C
Set trend icon to /icons/wi-trend-up.bmp
Set weather icon to /icons/wi-day-rain.bmp
Set weather icon to /icons/wi-day-sunny.bmp
Set weather icon to /icons/wi-day-showers.bmp
wind strength:  0.6
wind angle:  270.0
wind strength:  1.1
wind angle:  334.0
Free mem after updates: 35184B
This iteraton took 0B 
Free mem before updates: 35184B
Free mem after updates: 35184B
This iteraton took 0B 
Free mem before updates: 35184B
Retrieving data...Reply is OK!

Since the text "Some error occured, retrying!" isn't logged, the except block is never reached before the freeze. The freeze therefore happens somewhere inside the call to value = pyportal.fetch().
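Because the hang never raises an exception, no except clause can recover from it. One option that may help is a hardware watchdog that resets the board when the main loop stops feeding it. The following is a minimal sketch, assuming the board's CircuitPython build includes the `watchdog` module and that one loop iteration (including `pyportal.fetch()`) finishes well within the chosen timeout. `arm_watchdog` and `feed_watchdog` are hypothetical helper names, and the 16-second default is an assumption, since the maximum timeout depends on the port.

```python
# Sketch only: reset the board if the main loop stops running.
try:
    from microcontroller import watchdog  # CircuitPython only
    from watchdog import WatchDogMode
except ImportError:
    # Not running on CircuitPython, or no watchdog support:
    # the helpers below then become no-ops.
    watchdog = None
    WatchDogMode = None


def arm_watchdog(timeout_s=16):
    """Start the hardware watchdog; return True if it was armed."""
    if watchdog is None:
        return False
    watchdog.timeout = timeout_s  # seconds; assumed to be within the port's limit
    watchdog.mode = WatchDogMode.RESET  # hard reset when the timer expires
    return True


def feed_watchdog():
    """Call once per main-loop iteration to postpone the reset."""
    if watchdog is not None:
        watchdog.feed()
```

In the loop above, `arm_watchdog()` would be called once before the loop starts and `feed_watchdog()` at the top of every iteration. The `time.sleep(5)` in the except block counts against the timeout, so the timeout has to leave room for it.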

eloekset commented 1 year ago

I even bought a new PyPortal Titano in case the problem was hardware related. The new device ran for about 20 hours before freezing. That was 1120 iterations, far more than on the first device.

According to the console log, it freezes in pyportal.network.requests.get(), response.json() or response.close(): [image] [image]

The log line "Connecting to AP GID12A_Serverskapet" indicates that pyportal.network.connect() was executed from the except block right before the iteration in which the device froze.
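To narrow down which of the three calls hangs, one approach is to log a marker immediately before each of them, so that the last line in the console names the step that never returned. This is a rough sketch, not the project's code: `traced_fetch`, its `session` and `url` parameters, and the marker texts are hypothetical, and `session` is assumed to be an object with a `get()` method, such as `pyportal.network.requests`.

```python
def traced_fetch(session, url, log=print):
    """Fetch JSON from url, logging a marker before each step."""
    log("step 1: get")
    response = session.get(url)
    try:
        log("step 2: json")
        data = response.json()
    finally:
        # Always release the socket, even if parsing fails.
        log("step 3: close")
        response.close()
    log("step 4: done")
    return data
```

If the console ends with "step 2: json", for example, the hang is inside `response.json()` rather than in the request or the close.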

eloekset commented 1 year ago

After wrapping the error handling that resets and reconnects in a recursive function, it keeps failing with the error message "invalid syntax for integer with base 16": [image]

It repeats the same error message forever. The first time, the error message was "ESP32 not responding": [image]
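A recursive recovery function may keep nesting calls for as long as the error persists. An iterative loop with a bounded number of attempts is one alternative. The sketch below is an illustration under assumptions, not a confirmed fix: `fetch_with_recovery` and its parameters are hypothetical, `fetch` stands for a call such as `pyportal.fetch`, and `recover` stands for the reset-and-reconnect steps from the except block above.

```python
import time


def fetch_with_recovery(fetch, recover, max_attempts=5, delay_s=5, sleep=time.sleep):
    """Call fetch(); on failure call recover() and retry, up to max_attempts."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        # ConnectionError and TimeoutError are subclasses of OSError.
        except (ValueError, RuntimeError, OSError) as e:
            last_error = e
            print("Attempt %d failed: %s" % (attempt, e))
            try:
                recover()
            except (ValueError, RuntimeError, OSError) as recover_error:
                # A failed recovery should not end the retry loop early.
                print("Recovery failed:", recover_error)
            sleep(delay_s)
    raise RuntimeError("Giving up after %d attempts: %s" % (max_attempts, last_error))
```

When the attempts are exhausted, the caller gets a single RuntimeError and can decide what to do next, for instance falling back to a full board reset with `microcontroller.reset()`.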