Feller-AG / wiser-api

MIT License

Device reboots sporadically #4

Open hansfriedrich opened 1 year ago

hansfriedrich commented 1 year ago

Not sure this is the right repo, but I'm facing this issue currently. I suppose this is the root cause of my WebSocket connection being interrupted sporadically. Can someone help me with this?

----------------------------------------------------------------------
Crash on 2021-12-11 08:41:56

Instance 34758,  IP [DEVICE-ID],  SSID [MY-SSID]
uGateway v5.0.4-0-g5ebcdfe built on 2021-05-22 00:36:36
MicroPython v1.14.0-142-g569acf3 for LISA_F413RH
GainSpan 5.8.0 built on 2019-05-28 12:10:09
Bootloader 1.4.1
sockets = 10
mem_size = 291776
mem_free = 28144
flash_size = 26210304
flash_free = 22122496
wlan_rssi = -65
wlan_resets = 0
uptime = 117459
reboot_cause = AT_TIMEOUT

Traceback (most recent call last):
  File "main.py", line 110, in <module>
  File "uasyncio/core.py", line 109, in run_forever
  File "gateway/app.py", line 244, in app_task
  File "gateway/nubes/telemetry.py", line 48, in publish_kplus
  File "gateway/nubes/telemetry.py", line 40, in publish
  File "mqtt/client.py", line 215, in publish
  File "helpers.py", line 65, in json_dumps
MemoryError: memory allocation failed, allocating 392 bytes

oliver-joos commented 1 year ago

I can confirm that this MemoryError causes the Websocket and WLAN connection to drop for 30-60 seconds.

I see "mem_free = 28144" which is pretty low. How many devices/loads do you have?
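To put the "pretty low" figure in proportion, here is a minimal sketch that reads the `key = value` lines of a crash report like the one above and computes the share of free heap. The function names `parse_crash_fields` and `free_heap_percent` are hypothetical helpers made up for this illustration; they are not part of the µGateway firmware or its API.

```python
def parse_crash_fields(report: str) -> dict:
    """Collect the 'key = value' lines of a crash report into a dict."""
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition(" = ")
        if not sep:
            continue  # header, traceback or blank line: no 'key = value' pair
        key, value = key.strip(), value.strip()
        # Numeric values become ints; anything else (e.g. AT_TIMEOUT) stays a string.
        try:
            fields[key] = int(value)
        except ValueError:
            fields[key] = value
    return fields


def free_heap_percent(fields: dict) -> float:
    """Free heap as a percentage of the total heap size."""
    return 100.0 * fields["mem_free"] / fields["mem_size"]


# Values copied from the crash report in this issue.
REPORT = """\
sockets = 10
mem_size = 291776
mem_free = 28144
wlan_rssi = -65
reboot_cause = AT_TIMEOUT
"""

fields = parse_crash_fields(REPORT)
print("free heap: %.1f %%" % free_heap_percent(fields))
```

With the numbers from this report, a little under 10 % of the heap was still free at the time of the crash.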

hansfriedrich commented 1 year ago

Hi Oliver, thank you for the confirmation. I "solved" this issue by continuously checking the connection and reconnecting if necessary. But I haven't heard about this issue from the users of my plugin. Maybe they don't have as many devices as I do. Currently there are 20 loads across 22 devices in my installation.

oliver-joos commented 1 year ago

Usually we don't see MemoryError with < 50 devices. A few months ago we found a bug: the mobile app created new objects in the µGateway (I think /api/jobs) but never deleted those that were no longer in use. In the meantime, we and the app developers have fixed this problem. But I don't know enough about the app release process to be sure whether the latest version includes this bug fix. Please have some patience. We will come up with a solution for your µGateway after our summer holidays, probably next week.

hansfriedrich commented 1 year ago

Hi Oliver, thanks for this valuable information. I haven't touched /api/jobs until now. After my summer holidays I'll look at whether any jobs have been created and check for jobs that were never deleted.

As you may have noticed, my µGateway runs version 5.0.4. Is there something like a changelog for the releases, or even a repository for manually updating the firmware? My µGateway tells me it is running the latest version, even though you document 5.0.6 as the latest.

oliver-joos commented 1 year ago

I can say that in 5.0.5 we fixed minor bugs from 5.0.4. And 5.0.6 is equal to 5.0.5 (only changes to the WebApp interface). Your MemoryError will not be fixed with the latest version 5.0.x.

But since June 2021 we have been working on 5.1.x with new features and bug fixes! It also supports more devices and requires less memory per device. I think if the release of 5.1.x takes longer than expected, we might send you a pre-release version of the software. And we could find out together why your µGateway has so little memory left. Most of our team and I will be back from vacation on August 8th.

Apart from that, with Websockets it seems wise to automatically reconnect if it gets disconnected, for whatever reason. This was/is currently being implemented in our app as well.
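A minimal sketch of such an automatic reconnect, using `asyncio` and exponential backoff. Everything here is an assumption for illustration: `session` stands for whatever coroutine function your client uses to open the WebSocket and process messages until the connection ends, and the delays are arbitrary defaults, not values recommended by Feller.

```python
import asyncio


async def run_with_reconnect(session, max_attempts=None,
                             base_delay=1.0, max_delay=60.0,
                             sleep=asyncio.sleep):
    """Run `session()` again and again, waiting longer after each failure.

    `session` is a hypothetical coroutine function: it connects, handles
    messages, and returns (clean close) or raises when the link drops.
    `max_attempts=None` means retry forever.
    """
    attempts = 0
    delay = base_delay
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            await session()
            delay = base_delay  # session ended cleanly: reset the backoff
        except (OSError, asyncio.TimeoutError):
            pass  # connection dropped or timed out: keep the current backoff
        await sleep(delay)
        delay = min(delay * 2, max_delay)  # wait longer next time, up to a cap
    return attempts
```

The `sleep` parameter exists only so that the waiting can be replaced in tests. In a real client you would call `asyncio.run(run_with_reconnect(my_session))`, where `my_session` is your own connection coroutine, and probably also catch the exception types raised by your WebSocket library.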

woodworm commented 1 year ago

For easier collaboration we can provide patch versions. To get them, you need to write an email to our Customer Care Center with the note "API request at woodworm".

hansfriedrich commented 1 year ago

@woodworm I sent a mail to your colleagues with the given note at the end of August. Unfortunately I have had no response so far. An update on the jobs: there were several (about 40) jobs. Deleting them had no effect, except that all "Nebenstellen" were deleted ;-)

hansfriedrich commented 1 year ago

Btw, the last crash shows the same stack:

----------------------------------------------------------------------
Crash on 2021-12-11 08:41:56

Instance 34758,  IP [DEVICE-ID],  SSID [MY-SSID]
uGateway v5.0.4-0-g5ebcdfe built on 2021-05-22 00:36:36
MicroPython v1.14.0-142-g569acf3 for LISA_F413RH
GainSpan 5.8.0 built on 2019-05-28 12:10:09
Bootloader 1.4.1
sockets = 10
mem_size = 291776
mem_free = 28144
flash_size = 26210304
flash_free = 22122496
wlan_rssi = -65
wlan_resets = 0
uptime = 117459
reboot_cause = AT_TIMEOUT

Traceback (most recent call last):
  File "main.py", line 110, in <module>
  File "uasyncio/core.py", line 109, in run_forever
  File "gateway/app.py", line 244, in app_task
  File "gateway/nubes/telemetry.py", line 48, in publish_kplus
  File "gateway/nubes/telemetry.py", line 40, in publish
  File "mqtt/client.py", line 215, in publish
  File "helpers.py", line 65, in json_dumps
MemoryError: memory allocation failed, allocating 392 bytes

woodworm commented 1 year ago

I don't think this is the problem. This exception is from last year (2021-12-11). If the problem were serious, the log would be full of new exception messages. Unfortunately I linked the wrong e-mail address for our customer care center 🙈 Sorry for that! Could you please contact me again at this address: customercare Center

hansfriedrich commented 1 year ago

Thanks for the correction. I'll try again. Oh, you're right, sorry for the oversight: that one is from last year. Anyway, the device did reboot. 501DEF15-8CBF-413D-AEAA-AAC19C4568D1

Is there a proper way to see the logs?

woodworm commented 1 year ago

AT_EVENT_18 is a disassociation event at WLAN level. What is your RSSI (GET api/net/rssi)?

hansfriedrich commented 1 year ago

{
    "data": {
        "rssi": -75
    },
    "status": "success"
}

I suppose that's not optimal.

woodworm commented 1 year ago

If you need a permanent connection, a better RSSI would help 🤔. You may be able to move the router a little, or move the µGW to an actuator in a different location.
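As a rough sketch, the response of `GET api/net/rssi` shown above can be parsed and rated like this. The thresholds (-67 and -80) are common Wi-Fi rules of thumb assumed here for illustration, not limits documented for the µGW, and the sketch assumes the value is in dBm. Both function names are hypothetical.

```python
import json


def rssi_from_response(body: str) -> int:
    """Extract the RSSI value from a JSON body shaped like the one above."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("unexpected status: %r" % payload.get("status"))
    return int(payload["data"]["rssi"])


def rate_rssi(rssi: int, good: int = -67, weak: int = -80) -> str:
    """Rate a signal level; `good` and `weak` are assumed thresholds."""
    if rssi >= good:
        return "good"
    if rssi >= weak:
        return "marginal"
    return "weak"


# Response body copied from this thread.
BODY = '{"data": {"rssi": -75}, "status": "success"}'
print(rate_rssi(rssi_from_response(BODY)))
```

Under these assumed thresholds, the -75 reported here rates as marginal, while the -65 from the crash report would rate as good.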

neckcen commented 1 year ago

I've firewalled my Wiser gateway and noticed that it reboots itself approximately every hour if it cannot reach the MQTT servers. Uptime can be significantly lower when there is activity (down to about 15 minutes), so I had to temporarily allow MQTT connections in order to apply firmware upgrades.

I'm not sure this is intended; if so, it would be great to be able to turn auto-reboot off.

woodworm commented 1 year ago

This behavior is not what we want. The µGW must work without an internet connection. Please contact me via customercare Center at woodworm. I need more detailed information to solve this problem.

woodworm commented 1 year ago

@neckcen We have now been able to reproduce it. The issue will be fixed in version 5.1.22.

neckcen commented 1 year ago

@woodworm you have my thanks, happy to see a fix planned so quickly!

I had reached out to you through customer support (case 98017092); the ticket can be closed if it hasn't been already.

woodworm commented 1 year ago

That's fine. We will deploy version 5.1.22 in the next few weeks.