[RFC] Add a way to say that certain data is stored in system state only and not in the clixon database

This adds two things:

An option to not store any database on disk except the runtime database and startup database.

A way to tell clixon that certain data is not to be stored in the database, and instead will be fetched from the system state when required, and removed before storage to disk.

This should make handling security-sensitive data and data that can change dynamically under clixon easier.

This is just an RFC. I'm not too enamored with the "stateonly" name, but I haven't worked much on a better name. Also, I'm sure some of the things here could be done more efficiently and/or cleanly.

Another note on this. It would be more general, perhaps, to add callbacks in the plugin that would tie in at the same points (where the data is added from the system state and where the data is removed) and pass in the database name and cxobj tree. You would end up with the same code for the most part, just in different places, but that would let you do other things, perhaps. I don't know what other things, though.

Another note on this. It would be more general, perhaps, to add callbacks in the plugin that would tie in at the same points (where the data is added from the system state and where the data is removed) and pass in the database name and cxobj tree. You would end up with the same code for the most part, just in different places, but that would let you do other things, perhaps. I don't know what other things, though.

Actually, this really isn't feasible. You need to know, in the database, if the system state data is loaded into the XML tree. You don't want an operation to come in and operate on a database and then remove the system state data if the system state data was already in the tree for some other reason.

The docker-alpine-test-2 failure should be fixed by https://github.com/clicon/clixon/commit/9ee55441ac124d93dcff07638229428782e5ac6a please rebase

Some input/thoughts in no specific order:

1) I find the usage of "stateonly" confusing. In yang as described in https://datatracker.ietf.org/doc/html/rfc7950#section-4.2.3, state-only could refer to all non-config data:. But here, stateonly is rather config-data that is "sensitive" and should not be saved to persistent storage. Maybe "volatile" or ""hidden" is a better word?

2) A testcase and an example call to xmldb_add_stateonly() would help with understanding the API, or intended usage. I see setting CLIXON_TMPDB_VOLATILE=trueand populating thestateonly` cvec variable as the two mechanisms to configure this.

3) I question the stateonly code for edit-config and get-config for the candidate datastore (unless I misunderstand). It seems to me it gets/sets the state from the stateonly API. But candidate is (should not be) installed system state. That would break the candidate/commit semantics. In other words, setting or retrieving candidate data from system state is not intended usage of validate/commit. If candidate/running semantics is required the stateonly candidate data needs to be stored in-mem (or somewhere else?), but not sync it to disc. In this way the sensitive data is in-mem until a commit is made (later in time). The only alternative as I see it is having "auto-commit", ie, to commit candidate data immediately at every commit, thus disabling the lingering candidates. The existing option for this is CLICON_AUTOCOMMIT. This is default in RESTCONF BTW.

4) It seems to me CLIXON_TMPDB_VOLATILE is orthogonal to the stateonly code? This option ensures that the whole candidate datastore is not saved to disk. That could be a valid usecase, possibly for performance. But to me the state-only code is about ensuring specific sub-trees not being saved to disc which is different. One could change the xmldb_dump() and filter stateonly code there instead: keeping it in-mem but never stroing it to disc.

5) As an alternative to the stateonly cvec variable (which I have multiple issues with), I would recommend an extension on the YANG of the sensitive data. Example could be:

  leaf some_data {
     type string;
     cl:dont-save;
  }

In this way the data could be marked in a declarative way, otherwise it seems to me a more cumbersome function calls (eg in the start plugin call) needs to be made to make to xmldb_add_stateonly(). However, the callback function getstate cannot be declared in the YANG spec, so one would have to add a new plugin callback function (or multiple) to register and handle the stateonly data.

6) Finally, regarding the whole approach: the "sensitive" data while not saved to disc is still in-passing in-mem and in transit on UNIX socket etc. So, while the data is not stored to clixon datastores, it is still (briefly) accessible elsewhere. Is that OK?

Some input/thoughts in no specific order:

1. I find the usage of "stateonly" confusing. In yang as described in https://datatracker.ietf.org/doc/html/rfc7950#section-4.2.3, state-only could refer to all non-config data:. But here, stateonly is rather config-data that is "sensitive" and should not be saved to persistent storage. Maybe "volatile" or ""hidden" is a better word?

This is more than just sensitive data. This is data where the system state is the only database. For instance, the list of users is held in /etc/passwd, and /etc/shadow. You don't store any of that in the clixon database, every time you need it you load it from the system state.

I want this for a number of reasons:

I don't want to have to keep the startup database in sync with what's in /etc/passwd.
/etc/passwd can change if you load new software. I don't want to have keep the running database in sync with that.
The passwords would have to be stored in the database.
The code handling the stateonly data can scrub passwords when the data is fetched.

I agree that stateonly is not the optimal name. I just haven't come up with one. Maybe systemstateonly, but that's too long.

2. A testcase and an example call to  `xmldb_add_stateonly()` would help with understanding the API, or intended usage. I see setting CLIXON_TMPDB_VOLATILE=true`and populating the`stateonly` cvec variable as the two mechanisms to configure this.

I have this working in the python interface now with the ietf-system model. That's a bit complicated, though, as you have the python interface, the transaction framework, and then the implementation on top of the transaction framework. I agree that a simple use case and example would be ideal. I'll see what I can come up with.

3. I question the stateonly code for edit-config and get-config for the candidate datastore (unless I misunderstand). It seems to me it gets/sets the state from the stateonly API. But candidate is (should not be) installed system state. That would break the candidate/commit semantics. In other words, setting or retrieving candidate data from system state is not intended usage of validate/commit. If candidate/running semantics is required the stateonly candidate data needs to be stored in-mem (or somewhere else?), but not sync it to disc. In this way the sensitive data is in-mem until a commit is made (later in time). The only alternative as I see it is having "auto-commit", ie, to commit candidate data immediately at every commit, thus disabling the lingering candidates. The existing option for this is `CLICON_AUTOCOMMIT`. This is default in RESTCONF BTW.

The idea here is the system state is the only database for the portions of the tree that are stateonly. Except for the temporary time when a candidate is being worked on, when the system state is in the candidate, and an even shorter time when the system state is in the running database when doing the validate and commit processing. The sort-of-pseudo code for this is:

   client start a transaction
   backend loads the system state data into the candidate database
   client works on the candidate database
   client asks for a commit
   backend loads the system data data into the running database
   backend diffs the xml and does the backend plugin processing
   if successful
      remove the system state data from the candidate database
      copy the candidate database to the running database and save it
   else
      remove the system state data from the running database

This is exactly what I want. This way, the system state data only gets loaded right when it's needed, and gets removed after it is no longer needed. So if it changes between transactions, everything is still good.

Again, the main philosophy of clixon is that it is the master database and it make the system conform to its database. That's best for most data. However, for some cases, it's better to have the system data be the master database

One problem with the current implementation is the running database is loaded from the system state later than the candidate, so if something happened to change the system state between them, that change would only be in the running database. It might be better to save the system state data and apply it later to the running database. It may not be that important, tough.

4. It seems to me `CLIXON_TMPDB_VOLATILE` is orthogonal to the stateonly code? This option ensures that the whole candidate datastore is not saved to disk. That could be a valid usecase, possibly for performance. But to me the state-only code is about ensuring specific sub-trees not being saved to disc which is different. One could change the `xmldb_dump()` and filter stateonly code there instead: keeping it in-mem but never stroing it to disc.

It's mostly orthogonal, yes, and may be useful for other purposes. It just keeps the candidate database, which could hold keys and such while a client is working on it, off the disk. This is purely a security thing from my point of view.

5. As an alternative to the `stateonly` cvec variable (which I have multiple issues with), I would recommend an extension on the YANG of the sensitive data. Example could be:
  leaf some_data {
     type string;
     cl:dont-save;
  } 
In this way the data could be marked in a declarative way, otherwise it seems to me a more cumbersome function calls (eg in the start plugin call) needs to be made to make to xmldb_add_stateonly(). However, the callback function getstate cannot be declared in the YANG spec, so one would have to add a new plugin callback function (or multiple) to register and handle the stateonly data.

That was my original thought and what you suggested. I had some issues, though:

I don't think you can add the "cl:dont-save" with an augment. If you can't do that, then you have to hack on the original YANG code, which is non-optimal.
This is more implementation than specification. The client user of this model doesn't really care what happens here. The idea of YANG is that it's a contract between the client and the server, but this is not really part of the contract, and it shouldn't be part of the contract.
If I understand correctly, putting it in the YANG would be very inefficient. You would have to scan the entire database every time looking for these yang statements when you read or remove the subtree. Maybe I'm wrong, though.
You still need to register something somehow, though other interfaces register against things also in the YANG spec. To me that does seem like something easy to get wrong.
I would have had to figure out how to do this :). Not much of a reason, but I'll be honest and say this was simple to implement and it could be fixed later.

That was my reasoning, obviously from limited understanding.

6. Finally, regarding the whole approach: the "sensitive" data while not saved to disc is still in-passing in-mem and in transit on UNIX socket etc. So, while the data is not stored to clixon datastores, it is still (briefly) accessible elsewhere. Is that OK?

Yes, the data in transit will be protected with TLS/SSH, and the data will be in memory (though not on disk) temporarily. That's normal. It would be better if the memory was scrubbed after use. That would be relatively easy to add to the removal of the subtree, but then there's the buffers holding the transport data, both in the client interfaces and in the backend. I'm ok with not scrubbing at this point, and it's something that can be fixed later.

I like system-only-state better. But system-only-config would actually align even better to the RFC definition. It is configured data, just only saved in the system, not in the clixon datastore.
I understand that the full ietf-system model is too large as an example. Nevertheless, it needs a few simplified testcases for regression (when implemented).
OK, thanks for the explanation.
```
1.  client start a transaction
2. backend loads the system state data into the candidate database
3. client works on the candidate database # This period may be long
4. client asks for a commit
```
Two questions here. First, I assume then it is acceptable to have the client working with the candidate for an indefinite time in step 3 above. The client can choose (or multiple clients if the candidate is not locked) to send a series of edit-configs before the commit. Second, Step 2 may load info that has been changed in the system (not via clixon/NETCONF). This moves the "source-of-truth" from NETCONF/clixon to the system. I dont think this is wrong but could lead to unexpected side-effects. I know for example there are drafts to introduce a hash/signature of the (running) datastore in order to be able to determine any local changes which is used for remote controller applications. We had some problems with that with Juniper systems that regularly rewrites some fields, such as passwd, userid, etc, which means one needs to poll the complete config to detect any changes for a (remote) diff algorithm to work.
Yes, but an implementation which instead of syncing nothing to disc, could sync everything except the stateonly data would also meet the objectives. Again, I see CLIXON_TMPDB_VOLATILE useful in itself but not necessarily in the context of stateonly data.
I don't think you can add the "cl:dont-save" with an augment. Yes you can. See example here: https://clixon-docs.readthedocs.io/en/latest/yang.html#extensions
```
augment "/if:interfaces/if:interface"{
exl:mymode "my-interface";
```
This is more implementation than specification This is true. However, its used a lot, and its an easier way to annotate the spec. One can bypass it by using augments in a top-level local server-only YANG that imports the standard YANGs. If I understand correctly, putting it in the YANG would be very inefficient. You would have to scan the entire database No, there is a direct pointer xml_spec(x) from XML to YANG-spec, it is not inefficient.

As you say, one still needs to do a (functional) registration and write code to read/write stateonly data. But in my view it is easier to maintain across updates etc, if the dependencies are stated in direct association with the YANG. Suppose the YANG revision changes and with it some of the stateonly fields. In functional code one needs to keep track of revisions with conditionals. In YANG-annotated code this is done inside the new YANG (or an upgraded server YANG). But I agree with your point above that this is (server-side) implementation info.

I like system-only-state better. But system-only-config would actually align even better to the RFC definition. It is configured data, just only saved in the system, not in the clixon datastore.

Ok, system-only-config is better, I agree. Would sysonly_config be good?

2. I understand that the full ietf-system model is too large as an example. Nevertheless, it needs a few simplified testcases for regression (when implemented).

I'll see what I can do on this. All my work has been in python, so I'll have to do something in C.

  1.  client start a transaction
   2. backend loads the system state data into the candidate database
   3. client works on the candidate database # This period may be long
   4. client asks for a commit
Two questions here. First, I assume then it is acceptable to have the client working with the candidate for an indefinite time in step 3 above. The client can choose (or multiple clients if the candidate is not locked) to send a series of edit-configs before the commit.

Correct, that is expected.

Second, Step 2 may load info that has been changed in the system (not via clixon/NETCONF). This moves the "source-of-truth" from NETCONF/clixon to the system. I dont think this is wrong but could lead to unexpected side-effects. I know for example there are drafts to introduce a hash/signature of the (running) datastore in order to be able to determine any local changes which is used for remote controller applications. We had some problems with that with Juniper systems that regularly rewrites some fields, such as passwd, userid, etc, which means one needs to poll the complete config to detect any changes for a (remote) diff algorithm to work.

Yes. You wouldn't want to use this for data that is very large. Essentially, you are polling it every time you do a transaction. And there may be surprising side effects.

4. Yes, but an implementation which instead of syncing nothing to disc, could sync everything _except_ the stateonly data would also meet the objectives. Again, I see `CLIXON_TMPDB_VOLATILE` useful in itself but not necessarily in the context of stateonly data.

I considered this, and I think I tried it, but I ran into issues. In edit-config, you need to know if the current system config is loaded into the candidate database, so you need some flag for that. Since you want to re-read the system config every time you start a transaction, it was easier to remove it when the transaction is done and when the next edit-config happens for a new transaction to read it back in.

I don't think you can add the "cl:dont-save" with an augment. Yes you can. See example here: https://clixon-docs.readthedocs.io/en/latest/yang.html#extensions
augment "/if:interfaces/if:interface"{
    exl:mymode "my-interface";

Ah, good.

This is more implementation than specification This is true. However, its used a lot, and its an easier way to annotate the spec. One can bypass it by using augments in a top-level local server-only YANG that imports the standard YANGs. If I understand correctly, putting it in the YANG would be very inefficient. You would have to scan the entire database No, there is a direct pointer xml_spec(x) from XML to YANG-spec, it is not inefficient.

If you implemented it in db_dump, then it would be efficient on the write side. But if not, then the remove operation would have to scan the entire database every time to find all these items and remove them. And in either case to read the new values back in it has to scan the entire database to find them. On the read side, it has to then locate and call the registered code to read that piece of data in. I don't see another way.

As you say, one still needs to do a (functional) registration and write code to read/write stateonly data. But in my view it is easier to maintain across updates etc, if the dependencies are stated in direct association with the YANG. Suppose the YANG revision changes and with it some of the stateonly fields. In functional code one needs to keep track of revisions with conditionals. In YANG-annotated code this is done inside the new YANG (or an upgraded server YANG). But I agree with your point above that this is (server-side) implementation info.

I am not seeing how annotating it in the YANG makes maintenance any easier. If the YANG version changes the system config only data, you would still need conditionals to know how to handle the data based on the version, or keep the code in sync with the version. If you can somehow change things in the YANG and don't have to change things in the functional side, that would be cool, but I don't see how. And if there's no efficiency issues, that would be fine.

clicon / clixon

[RFC] Add a way to say that certain data is stored in system state only and not in the clixon database #534