zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.54k stars 6.46k forks source link

Proposal to improve the settings subsystem #12331

Closed fnde-ot closed 4 years ago

fnde-ot commented 5 years ago

Here is a proposal to improve the settings subsystem, I would like to make a PR but I'd prefer to know beforehand if it is something that looks acceptable. It is inspired by the proposal mentioned in issue #8854.

Introduction

Problem description

The current settings subsystem uses key-value settings where a key is a “path”-like string (fx. “bt/keys/aabbccddee0/1”) and binary values are stored as base64 encoded strings. Settings modules (fx. bt_settings) register handlers that are responsible for encoding/decoding keys and values. When loading from permanent storage, the key string is parsed to select the appropriate module handlers to set values in memory.

We would like to be able to:

Proposed change

The suggestion would be to make the following changes to the settings subsystem:

  1. Add module-specific storage handlers. These storage functions would be responsible of the encoding/decoding of binary values and perform I/O. The same storage handlers could be used for multiple modules and could be registered from the application or other Zephyr component.

  2. Use module-specific integer keys. Each module (fx. bt_keys, bt_mesh, ...) would register its own handlers to the settings subsystem and manage its own key space. Integer keys would remove the need to do key string parsing and make it easier and more efficient for custom storage backends to manage. Information previously stored in the key (fx. bt address, ...) can be stored in the binary value instead.

Detailed RFC

Proposed change (Detailed)

A setting would be a key-value pair with:

The settings subsystem would contain a list of modules containing the following:

The existing API (settings_load, settings_save_one, ...) could mostly be maintained with minor changes to parameters:

With added functions to register a module and its storage handler independently:

The internal flow could look similar to this:

settings_save(): // for each module:

settings_load(): // for each module:

settings_save_one(module, key):

settings_load_one(module, key):

The existing I/O backends (file, flash) and encodings (base64) can easily be ported to that new setup to maintain backward compatibility of already stored data.

Dependencies

Those API changes would require changes to the existing Zephyr settings modules (bluetooth host layer) and could affect user applications that use the existing settings subsystem for custom settings storage.

Alternatives

An alternative would be to create an alternative settings subsystem to choose from using Kconfig parameters, but this would increase the complexity of the BT host layer that would need to support multiple settings subsystems.

aescolar commented 5 years ago

CC @jhedberg @carlescufi

jhedberg commented 5 years ago

@nvlsianpu FYI

jhedberg commented 5 years ago

Doesn't sound too bad of a proposal to me (admittedly though I didn't spend much time thinking about this beyond reading the proposal). One drawback this has is that it's no-longer possible to store extended information in keys, like we had e.g. with “bt/keys/aabbccddee0/1”: the aabbccddee0 there is a Bluetooth LE address of a remote device, and the last component (1) is the local identity that this relates to. With integer keys we'd need an extra key-value that does mapping from Bluetooth addresses to integer keys. Something similar is needed for the multi-identity support, so the new design would complicate things in this respect.

Laczen commented 5 years ago

Hi @fnde-ot, I agree with the first two points for the improvement, I am unsure about the last one. It would reduce the flexibility.

I have been studying the settings module lately and I think it is quite good. Where I do see improvement possibility is to create a better separation between the settings part and the storage. At the moment there is some intermediate level (a line representation) that is shared by fcb and nffs and this makes things complicated. It might be better to move this line representation to the backend. This is further complicated by the encoding of binary data. This encoding should always be used in case of nffs (otherwise you might find a wrong line end), but is not needed for fcb (as it knows the data length). A different data representation in nffs (linelength:name:value) could eliminate the need for encoding.

Making a better separation would allow easier integration of different backends. The runtime interface could also be considered as a backend and should not require its own set, ... routines.

There is another area where the separation between the settings part and the backends could be improved: when calling the settings_save_one() it is checked if the data in storage is the same as the data being stored to avoid storing the same data again. This would better be done in the backend.

The reworked version has reduced the stack usage a lot by providing the set routines with a method to read the data directly from storage this should definitely be kept.

In #12160 I have been assigned the task to add a nvs backend. I am working on this, but different ideas are always welcome.

fnde-ot commented 5 years ago
      Doesn't sound too bad of a proposal to me (admittedly though I didn't spend much time thinking about this beyond reading the proposal). One drawback this has is that it's no-longer possible to store extended information in keys, like we had e.g. with “bt/keys/aabbccddee0/1”: the aabbccddee0 there is a Bluetooth LE address of a remote device, and the last component (1) is the local identity that this relates to. With integer keys we'd need an extra key-value that does mapping from Bluetooth addresses to integer keys. Something similar is needed for the multi-identity support, so the new design would complicate things in this respect.

Hi @jhedberg,

I agree that the key change is debatable, but I think moving the extended information should be moved from key to value could help simplify the process quite a bit and save a lot of cycles.

For example when loading 'bt/mesh/AppKey/1' we go through many hoops:

  1. [settings] load from backend
  2. [settings] parse first part of key to look for handler
  3. [bluetooth host set()] parse second part of key to look for handler
  4. [bluetooth host mesh_set()] parse third part of key to look for handler
  5. [bluetooth host mesh app_key_set()] parse rest of key to get the address and set in memory

When could just register the mesh app_key module directly and do:

  1. [settings] load from backend and parse module name to look for handler
  2. [bluetooth host mesh app_key_set()] set in memory

The integer keys could be enum values, fx: BT_SETTING_ID, BT_SETTING_IRK, ..., or array index (fx. bt keys), and information stored in the value can always be used by the setting module if additional checks are needed (fx. bluetooth address).

It would also remove dependency with the printf/printk family that are used for key encoding.

fnde-ot commented 5 years ago

Hi @Laczen,

It looks like we fully agree on the need to move the encoding and other storage specific matters from setting modules to the storage backends. However this proposal would also allow using different storage ways for different modules.

One example for settings that change often is that one could prefer to buffer changes and store those to flash every few seconds. Those module-specific storage handlers could easily reuse all existing backends (nffs, fcb and nvs) with only minor changes.

Do you have a pull request for the NVS backend work?

nvlsianpu commented 5 years ago

I have to consider what is proposed. But firs some corrections - currently base64 encoding is not in use (it is not fully deprecated - it is working but is not enabled for either fcb and nffs. Regards replacement of string by module handlers + UUID - looks like it is stil complicated. I mean that there will be need to keep some references between UUID and values (a code book?). It will be hard especial for the dynamic one (like @jhedberg mention).

Laczen commented 5 years ago

@nvlsianpu, is encoding disabled for nffs? How does nffs distinguish binary data containing a EOL and a EOL at the end of a line? Have you tried storing data that contains a EOL?

nvlsianpu commented 5 years ago

@Laczen Record format is different for File back-end <u16: line len(without this field)><string: key-name><u8[]: raw data> This len field is enabled thanks to SETTINGS_ENCODE_LEN Kconfig selection.

Laczen commented 5 years ago

@nvlsianpu, OK, I didn't realize this was changed.

fnde-ot commented 5 years ago

@nvlsianpu Thanks for the info, I had missed that base64 is now optional. The important part of this proposal would be to decouple the encoding/storage handling from the in-memory handling, and allow registering custom storage handlers. Those could be module specific (as described above), or there could be a single one if we want to simplify the above proposal.

The modules would simply need to implement get and set functions, that would allow for added features like settings_load_one() or settings_save() (previously requiring an additional "export" function).

In the same way, storage handler would only need to implement a load and save functions, equivalent to the modules get and set, but applying to storage instead of memory.

As for integer keys, it would be a nice change to avoid the use of strings for embedded systems with cycle/memory/storage limitations, as it comes with pretty big overhead, amongst others:

Those keys would be module-specific, so it should be pretty easy to switch to integers in all the potential settings modules:

nvlsianpu commented 5 years ago

@fnde-ot Thank for elaboration.

Question of Sens of Adding module-specific storage handlers: What is the reason behind this - especially when a few modules settings are stored in one storage system. At last for FCB it is required to treat module-unknown (form the module view) records carefully. Of course I can imagine that for File back-end we can migrate to use one file per each module. Do you have a case when you need different back-end simultaneously? Or it is just for open settings for such a possibility.

I understand reason behind idea for replacing strings-key by integer-key. But i think this is not enought. I think it should be integerS-key instead: How to menage integer keys? lets assume we store all key-value pairs in one storage (so like for FCB presumably). I am thinking about what will land in physical record in the storage. So at last i see need for having apart from module-specific key integer, the module ID integer as well. So globally it will be required to menage only module ID integers, module-specific key integer will be in module care. Imrealy want to avoid managing all key globally.

Laczen commented 5 years ago

@fnde-ot, @nvlsianpu Regarding replacing the strings-key by integer(s)-key I agree with @nvlsianpu. It seems the complexity of some kind of key encoding is needed anyhow. In the proposal it is just moved from the settings part to the backend part. Anyhow the settings module already supports a integer(s)-key method: just register the name as "0000" and then you can store as "0000/0000" ... "0000/FFFF", when your keys are not generated by calculation you don't even need the printk() routines. The storage requirement is not so much longer compared to using a real integer key.

fnde-ot commented 5 years ago

@nvlsianpu, @Laczen I actually have a case for module-specific storage handlers, where bluetooth keys have to be stored in a different way from the rest of the bluetooth settings. Having said that, as long as we can easily register a custom storage handler, it could also be global one that can would act differently for each module if needed (given its name, see below).

In this proposal modules still have a "name" string, so that even with a module-specific storage handler one could use the backend to store keys and values in a single place/file by appending the module name as part of the key (as done today with "bt/keys/xxxx", "bt/keys" being the module name). Modules could also be identified with an integer ID instead of a name string, but as you mention it is harder to manage globally compared with is done today unless we reserve a range of IDs for Zephyr's purposes that should not be used for other custom settings. A custom storage handler is also free to encode module names to integer before calling the I/O backend.

nvlsianpu commented 5 years ago

global module ID shouldn't be so hard to be menage I think.

So in your proposition module ID are not a case while storing any bt_key. But for what we already have in zephyr it is still a case.

fnde-ot commented 5 years ago

I agree that if we go with moving keys to integer, the modules should also have an integer ID instead of a name. It should be okay to just reserve a range for Zephyr's internal usage, just as we now reserve strings that cannot be used for custom application settings (fx. "bt", "bt/keys", …).

nvlsianpu commented 5 years ago

We are almost on the same line.

So assuming that string naming would be replaced by integer naming: Currently we can create quiet deep hierarchy (sub-tree) of key - it is expected that a functionality for operation on any level of sub-tree will be added. Example is deleting all key-value pairs for certain configuration part. BLE is using up to 3rd level of sub-tree: main/sub-A/* main/sub-B/*

Then we would call (jet not implemented) delete call for main/sub-A for deleting all data under sub-A or for main/sub-B for deleting all data under sub-B or for main for deleting all data under main so main/sub-A and main/sub-B Idea behind this is that hierarchy is usefully here. It helps to organize things, and also isolate configuration abstractions from each other.

So I think we shouldn't limit hierarchy to level of 1. This mean we need provide flexibility for replacing string hierarchy by integers-ID hierarchy.

When settings would be stored - the 1st - main id integer should be by module. If storage backed would be designed to store only one type of settings it might ignore this id. Rest of sub-tree integers should be transferred by the API call.

@carlescufi @jhedberg @Laczen Any thought on settings hierarchy requirements?

Laczen commented 5 years ago

@fnde-ot, I don't think it is a good idea to go with moving keys to integer as standard. This reduces flexibility. As it is the settings module can already be used with integers as I proposed above. This is just how you use the settings module. @fnde-ot, I don't understand you use case with a module-specific storage handler. Could you elaborate it a bit more ? What exactly are you trying to do ? Please explain what you mean by storage handler, is this the backend or is it a settings handler ? I am working on the settings module to eliminate the intermediate line record format, and to call the set() routine directly from the backend, this might suit your needs perfectly.

Laczen commented 5 years ago

@nvlsianpu, I have been working on the settings module that provides this functionality exactly. The save method is extended to allow a delete of all sub keys. It takes the key "/main/sub-B" and would delete all keys that start with "/main/sub-B" (it also takes "/main/s" and would then delete "/main/sub-A", "/main/sub-B", ...). This is done at the backend level. It already works but I am still performing tests.

fnde-ot commented 5 years ago

Thank you both for the feedback!

@nvlsianpu, Eliminating hierarchy would significantly reduce complexity and memory/cycle usage, but it of course depends on the actual requirements. If we are fine with limited depth, it could also be coded as part of an integer module ID, and have each module handle its own reserved space for potential submodules, for example: 0x8000 - 0xFFFF: Zephyr modules 0x8000 - 0x80FF: Zephyr Bluetooth modules (equivalent to "bt/") 0x8080 - 0x808F: Zephyr Bluetooth Mesh modules (equivalent to "bt/mesh/") ...

@Laczen, I agree that the current hierarchical key system gives additional flexibility that could be needed for future features, but in my opinion it comes at a significant cost in storage space, processing power and memory. Your suggestion of storing integer as strings is a nice improvement, but it takes 2x the space and the module name/id is still stored for every single setting, which is not very optimal.

By storage handler I mean functions that would handle encoding and storage, which could be any of the existing backends, or a middle layer using one or multiple storage backends.

In my use-case for example, bt_keys settings are stored by an external system with very limited storage space, that directly fetches changed settings from memory over a slow bus. When there is a change in a setting our storage handler needs to buffer it and send a signal to the other system at most once every few seconds. Processing power is also limited so I need to avoid any extra step, like encoding binary data to string-keys (usually bigger than the actual value to be stored) and decoding back to binary for storage. I know it might not be the typical setup, but many other embedded systems also have strong storage/memory/cycle constraints that we should take into account.

Laczen commented 5 years ago

@fnde-ot, I think with this kind of scheme you will run into problems quickly. As an example lets look at the replay protection list of btmesh. This list contains the id as well as the last received message number from a node. To store this you could use as an id the entry in the rpl list. What are the id's you are going to reserve, one application might have a rpl list of only 4 elements, another might use 16. For each id in your proposal you will need to carefully select this. If you wish to reuse your settings after updates you will need to stick with them or make some kind of "upgrade" method for the stored data.

All that said, It seems you are not interested in the complete settings but only a part. There is also a runtime interface to get/set variables. Using this interface you should be able to store in whatever format you want.

fnde-ot commented 5 years ago

@Laczen, The hierarchical thing would be for module ID only, and we are talking about less than 10 modules for current Zephyr. Each module such as "bt/mesh/RPL" would still have its own key space for its settings (a u32_t would allow for more than 4 million entries).

The main goal of this proposal is to move encoding of keys and values to the storage part to give more flexibility in storage choices, which would also simplify the way this subsystem works. It looks like we are aligned on some of the simplification objectives, your work with removing the settings_line layer certainly goes in the right direction.

nvlsianpu commented 5 years ago

@nvlsianpu, Eliminating hierarchy would significantly reduce complexity and memory/cycle usage, but it of course depends on the actual requirements. If we are fine with limited depth, it could also be coded as part of an integer module ID, and have each module handle its own reserved space for potential submodules, for example: 0x8000 - 0xFFFF: Zephyr modules 0x8000 - 0x80FF: Zephyr Bluetooth modules (equivalent to "bt/") 0x8080 - 0x808F: Zephyr Bluetooth Mesh modules (equivalent to "bt/mesh/") ...

^^ What is described her is exactly what I want to avoid. I was thinking that dedicated storage you want to us might impose limitation on hierarchy for stored settings.

fnde-ot commented 5 years ago

If we then maintain module names as strings ("bt", "bt_keys", "bt_mesh", "bt_mesh_rpl", ...) as initialy proposed, we can use the feature that @Laczen is working on to be able to remove settings from all modules with a name starting by a specific string (fx. "bt_mesh*" would remove all settings from modules with names starting with "bt_mesh"). It also does not produce significant overhead, as we only have very few modules.

aescolar commented 4 years ago

Closing. Things have changed so much that it is not relevant to keep this issue open as written.