fairecasoimeme / ZiGate

Zigate is an Universal Zigbee Gateway
http://zigate.fr
171 stars 59 forks source link

Implement message in JN-AN-1216 Appendix B.4 #280

Closed schrodingersket closed 4 years ago

schrodingersket commented 4 years ago

This issue is to track work done in porting messages specified in JN-AN-1216 Appendix B.4 "Exporting Persistent Data to Host" from the JN-AN-1223 source to JN-AN-1216. This allows the JN5168 coordinator to delegate PDM capabilities to the coordinator host OS such that the current EEPROM memory restrictions posed by the JN5168 can be overcome by storing EEPROM data externally.

schrodingersket commented 4 years ago

What's the best way to document the new commands? They unfortunately do not correlate exactly to those specified in JN-AN-1216, as the message specification from that app note was written for the implementation in JN-AN-1223, which contains a different set of APIs for working with the PDM.

I have them all written down, but I suspect my chicken scratch notes probably aren't what we want (:

schrodingersket commented 4 years ago

I've now fully tested externally-managed EEPROM against this pull request - seems to be functioning nominally. I'm probably going to test out the device counts within the week to try to push up to more around ~200.

pipiche38 commented 4 years ago

Yesterday I was looking after OTA and I found that OTA migh use the EEPROM as well

schrodingersket commented 4 years ago

Yesterday I was looking after OTA and I found that OTA migh use the EEPROM as well

That wouldn't surprise me. This may help with that, since it opens up a theoretical memory limit of some ~4GB if I'm not mistaken - the main limit with this functionality will be RAM.

fairecasoimeme commented 4 years ago

Hi, Great ! I'll test as soon as possible. Thanks for all

KiwiHC16 commented 4 years ago

That s very good. One limitation that I identified Is related to the TC. I never found how the MAC addresses and associated tables are stored and if we could put them into PDMs. This could limit the capacity.

schrodingersket commented 4 years ago

@KiwiHC16 Here's what I used to update the Trust Center count, as well as the address table size for other MAC address storage locations:

image

image

image

Does that address your concern about where those values might be set? I haven't found any other locations, though the field type used to keep track of device count is a uint8, so I suspect that unless that's mutable (e.g., not hard-coded into the compiled SDK), 255 may be the ZiGate's ultimate device limit. I'll definitely be doing more research on that down the road though... technically, I believe the theoretical limit should be 65535 - 1 (since the coordinator is always 0x0000 devices, since the device short addres is a uint16. Feel free to correct me if I'm wrong on any of those counts though (:

fairecasoimeme commented 4 years ago

Hi, On the host side, what did you use ? the SerialLink.py script with pdm.db ? or other things ?

KiwiHC16 commented 4 years ago

I agree with your comments. By changing those, I understand that the table will be increased but will remain in the SRAM/EEPROM of the JN5168. What I was looking for in the past was to move these tables into external EEPROM or PC Client using PDM, which I never managed. But if you managed to move a bug trunck of tables to the external EEPROM then keeping TC's tables in the JN5168 is not an issue anymore.

fairecasoimeme commented 4 years ago

Currently, we are also working to move some tables to an external flash memeory rather than the EEPROM. with all that, we should be able to really increase the number of devices and also add new functions.

pipiche38 commented 4 years ago

Do you plab also to add a specific API to allow backup/restore ?

schrodingersket commented 4 years ago

Hi, On the host side, what did you use ? the SerialLink.py script with pdm.db ? or other things ?

@fairecasoimeme I have a separate Python library I wrote a while ago against the original JN-AN-1216 that I tested this with.

@KiwiHC16 - this pull request allows us to hold all of the tables managed by the PDM (including TC tables) on the host, instead of the 5168 EEPROM. In my test implementation, the entire EEPROM is a JSON file, so to answer @pipiche38 , I'm backing up and restoring by managing that JSON file.

The main thing I'm working on now is handling the restoration of RAM from the external PDM in large networks, which takes longer now that it's over a serial link. If the network is of sufficient size, I believe the 5168 has a watchdog that triggers a chip reset if a particular function isn't reached within a certain period of time (for instance, if the RAM load fails for whatever reason).

Finally, if anyone is interested in migrating from internal EEPROM to host-managed EEPROM, it should be a fairly trivial matter to simply call the EEPROM PDM API for the read functions listed here, and the functions implemented in this pull request for PDM write calls; that would essentially allow you to export the internal EEPROM to the host in whatever format you wish instead of the binary format forced by the Production Flash Programmer.

fairecasoimeme commented 4 years ago

Could you share your python library that I can test rapidly in aim to validate the update? Thanks

schrodingersket commented 4 years ago

Unfortunately, I can't due to contractual restrictions on my end (we developed the library in-house, so it's not already-existing open source code, but I'm looking to try to open source it down the road). I can probably drum up something via SerialLink, create an Excel sheet (or modifying the one in this repo) with the message format, or modify an existing library if you have an open source one you could point me to (:

schrodingersket commented 4 years ago

@KiwiHC16 Just to follow up, from JN-UG-3116, the following information is what is stored in EEPROM, and thus is what can be now managed externally:

The stack context data which is stored in EEPROM includes the following:

  • Application layer data:

    • AIB members, such as the EPID and ZDO state
    • Group Address table
    • Binding table
    • Application key-pair descriptor
    • Trust Centre device table
  • Network layer data:

    • NIB members, such as PAN ID and radio channel
    • Neighbour table
    • Network keys
    • Address Map table
schrodingersket commented 4 years ago

Watchdog timer reset has been implemented in 44fc764. Successfully tested with a load of 75 devices into EEPROM. Took a while to load over serial (understandably) at 115200 baud, so it's worth considering that support for a larger network via external EEPROM will incur a longer startup time.

fairecasoimeme commented 4 years ago

@schrodingersket no problem, it was to earn times. I will reuse SerialLink.py to adapt the protocol to the firmware. I'm not a Python expert but I think that will be ok. Thanks

schrodingersket commented 4 years ago

@fairecasoimeme For sure! I work in Python nearly every day, so I'm happy to help where I can. I don't know the current C ecosystem terribly well since my day-to-day work usually requires higher-level languages (though I'm fairly comfortable with theory and such), else I'd add some abstraction and unit testing to this PR. If you have suggestions for a test suite/harness, I'd be happy to add that.

fairecasoimeme commented 4 years ago

@schrodingersket could you send to me (contact[at]zigate.fr) a doc of your new commands ? even if it's not uptodate. So, in the doc, could you precise the different protocol exchanges etc ...

PS: I'll share the doc on github

Thanks for all.

fairecasoimeme commented 4 years ago

Hi all.

I integrated the update and i tested with a modified version of ZWGUI. That's work pretty good but I can't go over 90 child table and 140 max devices. If I put over, the JN5168 crashes. I think the limit is the RAM memory. I'll continue to investigate and prepare the code to add for plugins

Fred

pipiche38 commented 4 years ago

Tryi,ng to document a bit the API in order to get it more easy for integration. Please let me/us know if that is wrong ?

Screenshot 2020-03-28 at 15 18 35
KiwiHC16 commented 4 years ago

Looks very very promising. Unfortunately not so much time on my side to look at it ;-(

schrodingersket commented 4 years ago

@pipiche38 - that looks correct to me. The only things missing would be the create/increment /delete bitmap functions, which do exactly as they sound and just map a memory address to a 32-bit field. Apologies for the delay; thing have been pretty crazy over here. Hope you are all staying safe!

pipiche38 commented 4 years ago

@schrodingersket thanks for the feedbacks. In order to get something more manageable I have put it under GitHub as well. https://github.com/pipiche38/Domoticz-Zigate-Wiki/blob/master/en-eng/Technical/ZiGate-PDMonHost.md

As you seems experienced in the NXP stack, I have a question in regards to the Get and Increment Bitmap.

While IncrementBitmap stands for

Retreive, Increment, Store Finaly Send the new value,

GetBitmap seems to be:

Retreive, Increment Finaly Send the new value

(the value doesn't change on the PDM). Correct ?

schrodingersket commented 4 years ago

That's correct on IncrementBitmap - GetBitmap should not increment though. It should just retrieve and send the value.

pipiche38 commented 4 years ago

Here is what in the NXP documentation for GetBitMap: The function returns the counter’s start value (from the counter’s header) and incremental value (from the counter’s bitmap).

So for me it does return an Incriment but do not store it (which is the difference with IncrementBitmpa

schrodingersket commented 4 years ago

Right - here's how I read that, based on the way that the counter works: when a bitmap is stored for the first time, it's initialized with a possibly non-zero value (the counter's header/start value). A record of that initial value is stored on the host separate from the current value (the "incremented" value from the NXP doc), which is the initial value plus however many times the counter has been incremented. When GetBitmap is called, it returns both the initial value and the current value (i.e., the counter's header, and the incremented value, respectively).

schrodingersket commented 4 years ago

Just wanted to follow up on this - looks like this capability was added to the v3.1d branch - can this be closed?