CESNET / rousette

RESTCONF server for sysrepo
https://gerrit.cesnet.cz/q/project:CzechLight/rousette
Apache License 2.0

Automatic copying of changes to `startup` #6

Open jktjkt opened 5 months ago

jktjkt commented 5 months ago

The standard (RFC 8040) says that:

If the NETCONF server supports :startup, the RESTCONF server MUST automatically update the non-volatile startup configuration datastore, after the "running" datastore has been altered as a consequence of a RESTCONF edit operation.

This essentially says that changes made to the "default datastore" (that is, when the NMDA URL paths are not used) should be propagated to startup as well. It will be fun to come up with all the rules for resolving potential conflicts when startup and running have diverged from each other.

troglobit commented 4 months ago

I just ran into this, and have been trying to get restconf/operations/ietf-netconf:copy-config to work in the meantime (to no avail so far), but maybe that's the easiest way forward also for this issue?

curl --http2-prior-knowledge -X POST -H "Content-Type: application/yang-data+xml" -H "Accept: application/yang-data+xml" -d '<ietf-netconf:copy-config xmlns:ietf-netconf="urn:ietf:params:xml:ns:netconf:base:1.0">
  <source>
    <running/>
  </source>
  <target>
    <startup/>
  </target>
</ietf-netconf:copy-config>' "http://localhost:10080/restconf/operations/ietf-netconf:copy-config" -u admin:admin

jktjkt commented 4 months ago

I forgot that there's no <copy-config> copying between datastores in RESTCONF. The ietf-netconf:copy-config is a NETCONF-level operation which is handled by Netopeer2. Now, Netopeer2 does create a normal sysrepo-level RPC subscription, which means that one can invoke that RPC over RESTCONF already. However, there's some special filtering in Netopeer2 which means that any such RPC that has not originated through the NETCONF server gets rejected.

@troglobit, the problem is (IMHO) "hard" because it's a tricky question on how to solve potential conflicts. Sure, we could blindly do a copy from running to startup upon any change internally within rousette (that would be a oneliner to session.copyConfig(...)), but then there's a risk of overwriting stuff which was not supposed to be overwritten. Or, one can try to re-apply the same edit operation. However, in that case, what should happen when there's a failure in the second DS? In other words, we're ignoring this problem for now and are focusing on implementing the other features of the protocol.
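To make the "blind copy" option concrete, here is a toy sketch in Python. It is not rousette or sysrepo code: plain dicts stand in for the datastores, and the paths, values, and the function name `restconf_edit_with_blind_copy` are all hypothetical. It only illustrates how copying all of running to startup after an edit can persist an unrelated change.

```python
import copy

# Toy stand-ins for datastores: flat "path -> value" maps (hypothetical data).
startup = {"/system/hostname": "router", "/logging/level": "info"}
running = copy.deepcopy(startup)

# Some other client makes a temporary tweak in running only.
running["/logging/level"] = "debug"


def restconf_edit_with_blind_copy(running, startup, path, value):
    """Apply one edit to running, then overwrite startup with all of running."""
    running[path] = value
    startup.clear()
    startup.update(copy.deepcopy(running))


restconf_edit_with_blind_copy(running, startup, "/system/hostname", "edge-1")

# The requested change was persisted...
print(startup["/system/hostname"])
# ...but so was the unrelated, temporary logging tweak.
print(startup["/logging/level"])
```

In this toy model the second printed value is the temporary logging level, which is the "overwriting stuff which was not supposed to be overwritten" risk.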

@michalvasko: do you feel like relaxing np_ignore_rpc to also accept stuff like <copy-config> over RESTCONF?

michalvasko commented 4 months ago

@jktjkt Not really, and what is more, the plan is to change netopeer2 RPCs to work as they did originally, outside of sysrepo. The reason it was changed is so that other applications can subscribe to them and at least be notified when the RPC is executed. A notification can be generated in this case instead, which keeps the exact same functionality and will make netopeer2 simpler and faster.

troglobit commented 4 months ago

> I forgot that there's no <copy-config> copying between datastores in RESTCONF. The ietf-netconf:copy-config is a NETCONF-level operation which is handled by Netopeer2. Now, Netopeer2 does create a normal sysrepo-level RPC subscription, which means that one can invoke that RPC over RESTCONF already. However, there's some special filtering in Netopeer2 which means that any such RPC that has not originated through the NETCONF server gets rejected.

Aha, I see. Did not know about that filtering thing, would've taken me a while to figure out. Thanks!

> @troglobit, the problem is (IMHO) "hard" because it's a tricky question on how to solve potential conflicts. Sure, we could blindly do a copy from running to startup upon any change internally within rousette (that would be a oneliner to session.copyConfig(...)), but then there's a risk of overwriting stuff which was not supposed to be overwritten. Or, one can try to re-apply the same edit operation. However, in that case, what should happen when there's a failure in the second DS? In other words, we're ignoring this problem for now and are focusing on implementing the other features of the protocol.

Well, I don't claim to understand the complexities involved since I have very little insight into the protocols in play here. But I've worked a bit with other systems with similar concepts and from there the only real active configuration is the running-config, so on those systems, saving to startup-config is just a way of making sure the state is kept for next reboot. But I probably don't understand the problem fully yet, and I appreciate you all working on other more important parts of rousette! <3 We can do a local patch here in the meantime.

Cheers!

jktjkt commented 4 months ago

Under the NMDA, the startup and running are two "unrelated" datastores. Now, suppose that there's a YANG model like this:

container transmission {
  leaf mode {
    type enumeration {
      enum "auto";
      enum "manual";
    }
  }
  leaf current-gear {
    when "../mode = 'manual'";
    type int8 {
      range "1 .. 6";
    }
  }
}

What happens if startup has transmission/mode set to auto, and the running DS has transmission/mode set to manual, and the RESTCONF server is handling an edit which sets transmission/current-gear to 3? Sending "just this edit" to startup as well will surely fail. One could simply copy the config, but how much of the config? Just this module? That might break data integrity. So do we copy the entire DS content? We could, but then is it expected that copying, say, this change of gear also copies all unrelated changes which have been done recently by other clients, like, say, a tweak in a logging level somewhere, or this temporary SSH pubkey for root? And is it expected that this happens even if the server advertises full NMDA support, where the entire point is better control over how things are made persistent?
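The "re-apply the same edit" case can be sketched as a toy Python model. This is an illustration only, not libyang or sysrepo behavior: the `validate` and `apply_edit` helpers are hypothetical, and the sketch assumes that a datastore containing current-gear while mode is not manual is rejected outright.

```python
class ValidationError(Exception):
    """Raised when the toy datastore violates the model's `when` condition."""


def validate(ds):
    """Enforce the toy rule: current-gear may exist only if mode is 'manual'."""
    t = ds.get("transmission", {})
    if "current-gear" in t and t.get("mode") != "manual":
        raise ValidationError("current-gear present but mode is not 'manual'")


def apply_edit(ds, leaf, value):
    """Return a new datastore with transmission/<leaf> set, or raise."""
    candidate = {"transmission": dict(ds.get("transmission", {}))}
    candidate["transmission"][leaf] = value
    validate(candidate)
    return candidate


# The two datastores have diverged, as in the scenario above.
startup = {"transmission": {"mode": "auto"}}
running = {"transmission": {"mode": "manual"}}

# The RESTCONF edit succeeds against running.
running = apply_edit(running, "current-gear", 3)

# Replaying the very same edit against startup is rejected.
try:
    startup = apply_edit(startup, "current-gear", 3)
    replay_ok = True
except ValidationError as exc:
    replay_ok = False
    print("replay on startup failed:", exc)
```

After this runs, running holds the new gear while startup is unchanged, so the server would have to decide whether to report an error for an edit that already succeeded in running.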

Honestly, I don't know what a good fix is.

jktjkt commented 4 months ago

Also, as per https://github.com/netconf-wg/restconf-next/issues/16 it seems to me that this behavior might have been considered obsolete, or at least it might change in a future version of the standard. Maybe it's time to document our deviating behavior and call it a day (after a proper discussion on a mailing list, of course).