rstrouse / ESPSomfy-RTS

A controller for Somfy RTS shades and blinds
The Unlicense

2.4.3-Pre feedback #360

Closed alka79 closed 1 month ago

alka79 commented 2 months ago

Hello, I played all day with 2.4.3pre. I recompiled locally with the latest Arduino core 2.0.16. The update via OTA went smoothly. As they might depend on each other, I never know which to upload first, littlefs or firmware. I uploaded the App first.

2.4.3pre has been stable all day. I can see in the serial output that some network stuff is async now. The group's rolling code is now in the backup file, right after the bit length. I did not try a full restore though.

The new feature is the My simulation for shades that don't have a My memory. I have plenty of those old shutters and was eager to try that feature. Set a My position and check the simMy option in the shade's settings, then hit the shade's My button in the WebUI, and the shade moves gently to the set My position :)

It does not work with groups though. Hitting the My button in the WebUI sends the My command to all paired members. Of course my old shutters ignore this. A nice improvement would be to check the group members and give special treatment to those with the simMy setting, moving them one by one to their My position.
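The special treatment suggested here can be sketched as a planning step: send the normal group My frame for motors that handle My themselves, then move each simMy member individually. This is only an illustration. `Member`, `Action`, `Step` and `planGroupMy` are hypothetical names, not the project's types, and treating the group command as one shared frame is an assumption.

```cpp
#include <vector>

// Hypothetical, simplified view of a group member (not the project's real types).
struct Member {
    int shadeId;
    bool simMy;    // simulated My enabled for this shade
    float myPos;   // stored My position in percent, negative if unset
};

enum class Step { GroupMy, MoveTo };

struct Action {
    Step step;
    int shadeId;   // 0 for the single group-wide My frame
    float target;  // only meaningful for Step::MoveTo
};

// Build the list of actions for a "My" press on a group.
std::vector<Action> planGroupMy(const std::vector<Member>& members) {
    std::vector<Action> plan;

    // Motors with a native My memory are served by one group-wide frame.
    bool anyNative = false;
    for (const Member& m : members) {
        if (!(m.simMy && m.myPos >= 0.0f)) anyNative = true;
    }
    if (anyNative) plan.push_back({Step::GroupMy, 0, -1.0f});

    // Members with simMy are moved one by one to their own My position.
    for (const Member& m : members) {
        if (m.simMy && m.myPos >= 0.0f) {
            plan.push_back({Step::MoveTo, m.shadeId, m.myPos});
        }
    }
    return plan;
}
```

For a group holding shades 1, 3 and 5 where only shade 3 has simMy set to 40%, the plan contains the group frame followed by a single move of shade 3 to 40%.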

My expectation was to be able to also use the My button of the physical remote with my shutters. Unfortunately it is not implemented. I hacked the code in SomfyShade::processWaitingFrame() to add a call to moveToTarget when a My command is received for a shade with simMy. That works! It also works with all the linked remotes :) I hope it can be officially implemented. I am afraid that my hack might have side effects that I cannot control.
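The idea behind the hack can be illustrated with a simplified stand-in. The `Shade` struct below is hypothetical and much smaller than the real SomfyShade class. Only the names `moveToTarget` and `simMy` come from the discussion, and the idle check is an assumption based on My acting as a stop command while a motor is moving.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; the real SomfyShade class has many more members.
enum class Command : uint8_t { My, Up, Down };

struct Shade {
    bool simMy = false;       // "simMy" option from the shade settings
    float myPos = -1.0f;      // stored My position in percent, negative if unset
    float currentPos = 0.0f;  // last known position in percent
    float target = 0.0f;      // position the shade is being driven to
    bool moving = false;

    // Stand-in for moveToTarget(): records the target instead of transmitting.
    void moveToTarget(float pos) {
        target = pos;
        moving = (pos != currentPos);
    }

    // Called for a frame received from a linked remote.
    // Returns true when the My press was turned into a simulated move.
    bool processRemoteCommand(Command cmd) {
        if (cmd != Command::My) return false;
        if (!simMy || myPos < 0.0f) return false;  // motor handles My itself
        if (moving) return false;                  // My while moving is a stop, not a move
        moveToTarget(myPos);
        return true;
    }
};
```

A shade with simMy enabled, a My position of 40% and a current position of 100% ends up with a target of 40% after a My frame. A shade without simMy is left alone.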

rstrouse commented 2 months ago

You should always install the firmware first. That way if there are any dependencies on the script the supported elements will be available.

I have posted updates to the code to allow for the Simulated my functions to be operated via remote and using groups. There is also a hardware watchdog now that will reboot the ESP32 if any process takes longer than 5 seconds and does not feed the dog.
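The watchdog behaviour described here (reset if nobody feeds the dog within the timeout) can be modelled in a few lines. This is a host-side sketch for illustration, not the firmware's code: the `Watchdog` class is hypothetical, and on real hardware the ESP-IDF task watchdog (for example `esp_task_wdt_reset()`) would play this role.

```cpp
#include <cstdint>

// Minimal software model of a watchdog with a feeding deadline.
class Watchdog {
public:
    explicit Watchdog(uint32_t timeoutMs) : timeoutMs_(timeoutMs) {}

    // Called by long-running code to signal that it is still making progress.
    void feed(uint32_t nowMs) { lastFedMs_ = nowMs; }

    // True once more than timeoutMs has passed since the last feed.
    // Unsigned subtraction keeps this correct across millis() wraparound.
    bool expired(uint32_t nowMs) const { return nowMs - lastFedMs_ > timeoutMs_; }

    // Allows a longer deadline for phases that are known to be slow.
    void setTimeout(uint32_t timeoutMs) { timeoutMs_ = timeoutMs; }

private:
    uint32_t timeoutMs_;
    uint32_t lastFedMs_ = 0;
};
```

With a 5000 ms timeout and a feed at t = 1000 ms, the dog is still satisfied at t = 6000 ms and bites at t = 6001 ms.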

EDIT: And one more thing: I had to move the rolling code for groups to the end of the file. This way those folks that downgrade can still use a newer backup file.
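Why appending helps can be shown with a toy record format. The comma-separated layout and the names `OldGroup` and `parseWithOldReader` are assumptions made for this example, not the project's actual backup format. A reader that only knows the first fields keeps working when a new field is added at the end, but misreads the record when the field is inserted in the middle.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a comma-separated record into its fields.
std::vector<std::string> splitFields(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ',')) fields.push_back(field);
    return fields;
}

// What an older firmware knows about a group: three fields, nothing more.
struct OldGroup {
    int id;
    int bitLength;
    std::string name;
};

// Older reader: consumes the fields it knows and ignores anything after them.
OldGroup parseWithOldReader(const std::string& line) {
    std::vector<std::string> f = splitFields(line);
    OldGroup g{0, 0, ""};
    if (f.size() >= 3) {
        g.id = std::stoi(f[0]);
        g.bitLength = std::stoi(f[1]);
        g.name = f[2];
    }
    return g;
}
```

Given `"4,56,Living room,1234"` (rolling code appended), the old reader still sees the name `Living room`. Given `"4,56,1234,Living room"` (rolling code inserted after the bit length), it reads `1234` as the name.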

alka79 commented 2 months ago

That was fast. I'll look into it and report back (and also look at how you implemented it. I am getting curious now :) )

EDIT: I see what you've done. I tried at first with moveToMyPosition() but I had issues as the shade was not Idle anymore when called. I realize now that is because I called it just a few lines later than you, when the target position was set. It was poor hacking on my side :)

> You should always install the firmware first. That way if there are any dependencies on the script the supported elements will be available.

It could well be reversed. The old app might not expect a change in the new firmware and hang, and then I could not update via OTA anymore. Not knowing, and having the possibility to upload the firmware with Arduino, I felt more secure updating the App first. Now that I know, I'll just do firmware first.

> EDIT: And one more thing: I had to move the rolling code for groups to the end of the file. This way those folks that downgrade can still use a newer backup file.

Clever :)

alka79 commented 1 month ago

I keep reporting here. I installed 2.4.3 Pre second release. My on the Physical remote control works fine with all linked remotes :)

However, some strange behaviour:

  1. The watchdog has reset twice in the night (3:44:33 and 3:45:38 PM) for no apparent reason. Here is the serial output. Also there is a huge number of unexplained onvif/subscription requests starting from 3:46 PM after the second reset. Maybe it is the HA onvif integration? I manually reset the board and got a new IP address. These onvif requests are now gone. I never got anything like that before. Strange.

  2. Issue with the My button of a group: Shade #3 has a My position with simMy. It is a member of group #4 and is the only one in the group with simMy checked. I send the commands from the WebUI. Individually, Shade 3 reacts well. As a member of a group, the commands to move to the My position are sent but the shutter does not move at all. For each try, the initial position of Shade 3 is fully down, i.e. 100% closed.

Related extracts from serial output :

rstrouse commented 1 month ago

> watchdog has reset twice in the night (3:44:33 and 3:45:38 PM) for no apparent reason.

Perhaps 5 seconds is not enough. More likely though there is something that can potentially be long running that is not feeding the dog. I'll look through it to see what I see.

> Individually, Shade 3 reacts well. As a member of a group, the commands to move to the My position are sent but the shutter does not move at all.

Increase the number of repeats to shade 3. I believe some motors miss incoming frames when they are processing the last frame even if it is not for them. During this time it probably misses the next preamble and does not begin decoding the frame. Hence it never sees the command. If you were using a remote you probably would not notice because repeats come very fast when you press the button. It is not uncommon to get 2-3 repeats from pressing the button.

That being said I may try to add additional cooling off period after the last frame to see if this additional silence allows slower motors to keep up.
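The interplay of repeats and a cooling-off period can be pictured as a transmit timeline. Everything in this sketch is hypothetical: `planBurst`, the meaning of `repeats` as extra copies after the first frame, and the frame duration passed in are illustrative assumptions, not values taken from the firmware or the RTS protocol.

```cpp
#include <cstdint>
#include <vector>

// One scheduled transmission: which shade, and when it starts (ms from t = 0).
struct Tx {
    int shadeId;
    uint32_t startMs;
};

// Schedule frames for several shades back to back.
// repeats   : extra copies sent after the first frame (assumed meaning)
// frameMs   : assumed duration of one frame on air
// coolOffMs : silence inserted after the last frame of each shade
std::vector<Tx> planBurst(const std::vector<int>& shadeIds, int repeats,
                          uint32_t frameMs, uint32_t coolOffMs) {
    std::vector<Tx> plan;
    uint32_t t = 0;
    for (int id : shadeIds) {
        for (int i = 0; i <= repeats; ++i) {
            plan.push_back({id, t});
            t += frameMs;
        }
        t += coolOffMs;  // give slower motors time to resynchronise
    }
    return plan;
}
```

With no cooling-off period, the first frame for the next shade starts immediately after the last repeat of the previous one, which is the burst a slow motor may fail to decode. A non-zero `coolOffMs` shifts every following shade later by that amount.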

> Also there is a huge number of unexplained onvif/subscription requests starting from 3:46 PM after the second reset.

I would bet that this is your onvif integration. These probably started with SSDP responses but instead of calling the returned device profile url somewhere in that code it thinks that ESPSomfy RTS has a sensor or camera data that it wants. The call is most likely asking for the UPnP description of a service.

This is really why devices such as these should have an IP reservation so the router does not hand out new IPs when a referencing device has cached it. The udp datagrams for ESPSomfy RTS do not contain the specified url and a 404.3 response is the correct response to the request.

The calling device should have given up or rediscovered the devices. Given the timing there would also have been quite a few ::alive udp packets that should have been picked up between the calls for both ESPSomfy RTS and perhaps the actual device it was processing discovery.

alka79 commented 1 month ago

I forgot to give my new board a fixed DHCP address. It's done now. Maybe the IP address received from DHCP after the nightly reset was previously assigned to my camera, and HA somehow remembers it. I should give fixed addresses to those fixed devices!

Increasing the repeats from 1 to 2 for Shade 3 helped. 'Group My' now works. I see that there is no delay at all when sending to shades in groups. My old shades probably do not like the burst.

No watchdog issue during the day. The serial output is still running. I'll report if there is something.

alka79 commented 1 month ago

Hi. Again tonight, a watchdog restart for no apparent reason. Serial Output

rstrouse commented 1 month ago

After the reboot it appears to perform the correct operation for when the network is not allowing connections. If you look at the two lines indicating disconnected from access point at 01:44:32.022 you will see that it was never issued an IP address from the AP. ESPSomfy RTS will then open the SoftAP.

However, after the SoftAP times out after 3 minutes it thinks that it is still connecting from the previous attempt that aborted unceremoniously in between getting a valid AP and still not getting an IP address. It looks like I did not clear the connecting flag on the first failure condition since I do not see it resetting the station mode. It is simply opening and closing the SoftAP.

rstrouse commented 1 month ago

Yep what I explained above is exactly the condition I did not expect. It is weird that the SSID would partially connect but not issue an IP address and simply reject the connection on the first try. But that could simply be a case where the router was rebooting and its DHCP services were not yet available. I suspect that simply clearing the connecting flag when the SoftAP is opened will capture this condition completely.

The important bits about what triggered the watchdog timer are missing from your log. These would have been the lines just above the watchdog trigger log entry (the first line). Unfortunately the stack trace will always point to the ESP32 library location in the core and will tell us nothing about what triggered it. I have a suspicion that the trigger is a delay in response when checking whether there is internet available.

rstrouse commented 1 month ago

I updated the 2.4.3 release to clear the connecting flag when the soft ap is started. Please update the v2.4.3 firmware on your device.
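The stuck state and the fix can be reproduced with a tiny state model. `NetworkState` and its methods are hypothetical names for this sketch. The real firmware's connection logic is more involved, and only the idea of clearing the connecting flag when the SoftAP starts comes from the discussion.

```cpp
// Minimal model of the station/SoftAP fallback logic.
class NetworkState {
public:
    explicit NetworkState(bool clearFlagOnSoftAP)
        : clearFlagOnSoftAP_(clearFlagOnSoftAP) {}

    // Start a station connection unless one is believed to be in flight.
    bool beginStationConnect() {
        if (connecting_) return false;
        connecting_ = true;
        return true;
    }

    // Normal success path: the attempt is over.
    void onStationConnected() { connecting_ = false; }

    // Fallback when the access point accepted us but no IP address arrived.
    void openSoftAP() {
        softAPOpen_ = true;
        if (clearFlagOnSoftAP_) connecting_ = false;  // the fix described above
    }

    // SoftAP timed out; the caller is expected to retry station mode.
    void closeSoftAP() { softAPOpen_ = false; }

    bool softAPOpen() const { return softAPOpen_; }

private:
    bool clearFlagOnSoftAP_;
    bool connecting_ = false;
    bool softAPOpen_ = false;
};
```

Without the fix, a failed attempt followed by a SoftAP open and close leaves `beginStationConnect()` returning false forever, so the device only cycles the SoftAP. With the flag cleared on `openSoftAP()`, the next station attempt goes ahead.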

rstrouse commented 1 month ago

@alka79 did the v2.4.3 updates perform better overnight for you?

alka79 commented 1 month ago

> The important bits about what triggered the watchdog timer are missing from your log. These would have been the lines just above the watchdog trigger log entry (the first line). Unfortunately the stack trace will always point to the ESP32 library location in the core and will tell us nothing about what triggered it. I have a suspicion that the trigger is a delay in response when checking whether there is internet available.

There is actually nothing on the serial port just before the reset by watchdog

I was out for a few days. I just reinstalled the latest 2.4.3 Pre. For a change, I installed via GitHub. At the end of the process the board resets, which is normal behaviour. Then, just after, there was a second reset due to the watchdog, which seems abnormal to me. Now, each time I try to connect from the web UI, the board resets.

Here is the serial output of the three resets

EDIT: I uploaded my local build with Arduino and the /data with littleFS upload tool. Restored from backup file and it is back on track. The previous github /data upload probably failed.

rstrouse commented 1 month ago

I see that there are places that are taking 4 seconds to download a file from the file system. I’ll look at that streaming function to see if there is a place to feed the dog. Once it is cached it will be ok but each reboot got one more file and the index.js file was just too slow.

rstrouse commented 1 month ago

Looking deeper at that log it does look like it had a corrupt fs. See shade config file invalid.

alka79 commented 1 month ago

> Looking deeper at that log it does look like it had a corrupt fs. See shade config file invalid.

That is what I think as well. The /data upload via OTA went wrong for some unknown reason. Since I uploaded manually via the littlefs upload tool, it is working fine again.

As for the watchdog: can you make its food timing expectation variable? During an OTA or GitHub update, a gap of 5 seconds is not suspicious.

rstrouse commented 1 month ago

I moved the dog bowl so that it resets regardless of whether the data stream is suspended or not. The process was already looking for stream timeouts.
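Feeding the dog on every pass of a streaming loop, whether or not any data moved, can be sketched as follows. `streamWithFeed` and its callback parameters are hypothetical. The separate stall counter stands in for the stream timeout check mentioned above, so a dead stream still ends the transfer instead of spinning forever.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Stream `total` bytes in chunks, feeding the watchdog on every iteration.
// writeChunk(n) tries to send up to n bytes and returns how many were accepted;
// 0 means the stream is currently suspended.
std::size_t streamWithFeed(std::size_t total, std::size_t chunk,
                           const std::function<std::size_t(std::size_t)>& writeChunk,
                           const std::function<void()>& feed,
                           int maxStalls) {
    std::size_t sent = 0;
    int stalls = 0;
    while (sent < total) {
        feed();  // fed regardless of whether the stream is suspended
        std::size_t n = writeChunk(std::min(chunk, total - sent));
        if (n == 0) {
            if (++stalls > maxStalls) break;  // stream timeout: give up
            continue;
        }
        stalls = 0;
        sent += n;
    }
    return sent;
}
```

A slow but live transfer keeps the watchdog quiet because every iteration feeds it, while a stream that never accepts data is abandoned after `maxStalls` consecutive empty writes.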

rstrouse commented 1 month ago

I increased the wdt reset to 7 seconds. If it is bouncing now it will get fixed in the next release. Please report any issues in a new issue. Thanks for the help!