Open dumpfheimer opened 2 years ago
A backup is taken the moment the radio starts up so if the serial port loses connection and zigpy-znp reconnects, a new backup will be taken every time.
I've modified my local setup to take a complete backup over and over in the background, with a 0 second delay between each one. I experience only a tiny delay when sending requests but otherwise no noticeable impact over the past 10 minutes. This is with the same beta firmware, on the same TI CC1352P dev kit with no flow control enabled.
Then I need to find out why my device is seemingly randomly disconnecting.
Any idea what could cause this? These are the last log lines before the port closed (I did not shut down HA):
2022-08-30 20:06:34.467 DEBUG (MainThread) [homeassistant.components.zha.core.gateway] Shutting down ZHA ControllerApplication
2022-08-30 20:06:34.472 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x2462](LCT003): last_seen is 11571.180652618408 seconds ago and ping attempts have been exhausted, marking the device unavailable
2022-08-30 20:06:34.472 DEBUG (MainThread) [homeassistant.components.zha.core.device] [0x2462](LCT003): Update device availability - device available: False - new availability: False - changed: False
2022-08-30 20:06:34.489 DEBUG (MainThread) [zigpy_znp.api] Sending request: SYS.ResetReq.Req(Type=<ResetType.Soft: 1>)
2022-08-30 20:06:34.490 DEBUG (MainThread) [zigpy_znp.api] Request has no response, not waiting for one.
2022-08-30 20:06:34.491 DEBUG (MainThread) [zigpy_znp.uart] Closing serial port
Aah, it's the "Update configuration" button in the UI (Integration page). Is this to be expected?
Yeah. It fully reloads ZHA when adjusting configuration to be safe but we can probably make it less intrusive, eventually.
Is the pyserial .write method thread/concurrency safe?
Almost nothing in asyncio is threadsafe so I wouldn't rely on it.
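For illustration, here is a minimal sketch of the usual way around this: instead of calling write from a second thread, hand the call to the event loop with loop.call_soon_threadsafe so that every write runs on the loop's own thread. FakeTransport is a hypothetical stand-in for a serial transport, not zigpy-znp's or pyserial's actual class.

```python
import asyncio
import threading


class FakeTransport:
    """Hypothetical stand-in for a serial transport; records what was written."""

    def __init__(self):
        self.written = []

    def write(self, data: bytes) -> None:
        self.written.append(data)


async def main() -> list:
    loop = asyncio.get_running_loop()
    transport = FakeTransport()

    def worker():
        # Runs in another thread: never touch the transport directly.
        # Ask the event loop to perform each write on its own thread instead.
        for i in range(3):
            loop.call_soon_threadsafe(transport.write, bytes([i]))

    thread = threading.Thread(target=worker)
    thread.start()
    # Wait for the worker thread without blocking the event loop.
    await loop.run_in_executor(None, thread.join)
    # Yield once so any callbacks queued by the worker get to run.
    await asyncio.sleep(0)
    return transport.written


written = asyncio.run(main())
print(written)
```

All writes end up serialized on the event loop thread, in the order the worker queued them.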
I ordered a CC1352P7. It seems to be pretty much the same as the P2 except that it has more memory.
On a side note: yesterday I tried creating a backup from ZNP and restoring it on my Conbee II, and it worked. This is ridiculously genius! (I reverted back because ZNP seemed to work much more reliably.)
Got the board and built a P7 firmware with some changes relative to koenkk's. It is up and running :-) for now...
@puddly is zigpy-znp purposely suppressing route discovery mechanisms? https://github.com/zigpy/zigpy-znp/blob/824c2b2ade1e2ecfeb55087b9375a1df33eebb34/zigpy_znp/zigbee/application.py#L292
If my quick search was correct the other libraries seem to not use a similar flag?
I now seem to have an extremely well-performing network compared to before, with NONE rather than SUPPRESS_ROUTE_DISC_NETWORK.
If I remember correctly, it was used by other libraries in the past, though incorrectly named: https://github.com/Koenkk/zigbee-herdsman/search?q=DISCV_ROUTE
MTORR are broadcast periodically by the coordinator (check with a Zigbee sniffer), in addition to being explicitly requested by zigpy-znp when a device is unreachable. I believe the original reasoning was to reduce unnecessary runtime network traffic.
Just for completeness, here are the definitions from zigpy, z2m and z-stack:
zigpy:
SUPPRESS_ROUTE_DISC_NETWORK = 0x20 # dec 32
SKIP_ROUTING = 0x80 # dec 128
z2m:
DISCV_ROUTE: 32,
SKIP_ROUTING: 128
Z-Stack Stack/af/af.h:
#define AF_SUPRESS_ROUTE_DISC_NETWORK 0x20 // Supress Route Discovery for intermediate routes
// (route discovery preformed for initiating device)
#define AF_SKIP_ROUTING 0x80 // dec 128
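To make the bit values above easier to compare, here is a small sketch that decodes an options byte into flag names. TxOptions and flag_names are hypothetical names for illustration only; the enum contains just the two values quoted above plus NONE, not the full option set of any of the three projects.

```python
from enum import IntFlag


class TxOptions(IntFlag):
    """Illustrative subset of the transmit options quoted in this thread."""

    NONE = 0x00
    SUPPRESS_ROUTE_DISC_NETWORK = 0x20  # dec 32
    SKIP_ROUTING = 0x80  # dec 128


def flag_names(value: int) -> list:
    """Return the names of the non-zero flags set in an options byte."""
    return [
        flag.name
        for flag in TxOptions
        if flag.value and value & flag.value == flag.value
    ]


print(flag_names(0x20))
print(flag_names(0x20 | 0x80))
print(flag_names(0))
```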
It seems like the search does not find any usages of the option. Could of course be in another project, though.
It seems to me from the comment in af.h that AF_SUPRESS_ROUTE_DISC_NETWORK should be used during joining only?
The only documentation is that single comment and from what I recall, these flags are processed by the closed-source portions of Z-Stack. My understanding is that it disables unnecessary unicast route discovery requests, since Z-Stack will be doing its own route discovery broadcasts.
There are discussions about the different approaches to routing and their use cases within the Z-Stack developer guide: Z-Stack 3.0 Developer's Guide.pdf
from Stack/af/af.c:
if ( options & AF_SUPRESS_ROUTE_DISC_NETWORK )
{
  req.discoverRoute = DISC_ROUTE_INITIATE;
}
else
{
  req.discoverRoute = AF_DataRequestDiscoverRoute;
}
from Stack/nwk/nl_mede.h:
// Route Discovery Options
#define DISC_ROUTE_NONE 0x00 // Don't discover route
#define DISC_ROUTE_NETWORK 0x01 // If a route is needed, the device (also
// intermediate router) will issue a route
// disc request.
#define DISC_ROUTE_INITIATE 0x04 // Only the source router initiates route req.
Also: AF_DataRequestDiscoverRoute seems to always be DISC_ROUTE_NETWORK
So, I would read it this way: if the flag is SET, only the source router initiates the route request. If the flag is NOT SET, any device that needs a route (including an intermediate router) will issue a route discovery request.
Not sure what to do with this information, though 😂
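For what it's worth, that reading of the af.c branch can be modeled in a few lines of Python. This is only a sketch of the logic quoted above, not Z-Stack code: the constants are copied from the excerpts, and AF_DATA_REQUEST_DISCOVER_ROUTE being DISC_ROUTE_NETWORK is an assumption based on the observation in this thread.

```python
# Route discovery options, copied from the nl_mede.h excerpt
DISC_ROUTE_NONE = 0x00
DISC_ROUTE_NETWORK = 0x01
DISC_ROUTE_INITIATE = 0x04

# Transmit option, copied from the af.h excerpt (TI's spelling kept)
AF_SUPRESS_ROUTE_DISC_NETWORK = 0x20

# Assumption: per the thread, this global appears to always hold DISC_ROUTE_NETWORK
AF_DATA_REQUEST_DISCOVER_ROUTE = DISC_ROUTE_NETWORK


def discover_route(options: int) -> int:
    """Mirror the af.c branch: choose the discoverRoute value for a request."""
    if options & AF_SUPRESS_ROUTE_DISC_NETWORK:
        # Flag set: only the source router initiates the route request
        return DISC_ROUTE_INITIATE
    # Flag not set: intermediate routers may also issue route discovery
    return AF_DATA_REQUEST_DISCOVER_ROUTE


print(discover_route(0x00))
print(discover_route(AF_SUPRESS_ROUTE_DISC_NETWORK))
```

Note that under this reading neither branch produces DISC_ROUTE_NONE, so the flag changes who may start route discovery rather than turning it off.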
When using koenkk's development firmware built on SimpleLink SDK version 6.10 or 6.20, the coordinator seems to freeze under certain conditions.
The lock-ups look like memory issues, which might be triggered by high load or simply by chance after some time.
I believe this has only been an issue for a month or two and, while I do believe the root cause is somewhere in the coordinator firmware, I think fairly recent changes in zigpy started triggering the bug. This is why:
I have had the development firmware from Feb running since Feb without issues. About a month ago the issues started (I was very likely on zigpy dev) when I upgraded the firmware to the latest dev build from koenkk. Downgrading to the Feb firmware did not fix the issue. I had to downgrade all the way to an SDK 5.x.x version to have a stable environment again.
Have there been noticeable changes in July (+/-)?
@puddly you commented on an issue I created here and mentioned RAM usage here.
Is there something that could be done within zigpy to reduce memory usage on the controller (without the loss of function mentioned here)? Or do you, zigpy devs, believe this must be fixed in SimpleLink?
Thanks for your work, it's very much appreciated.