Tracking 'uses' for Entries

robshakir commented 5 years ago

Originally posted by @andaru in https://github.com/openconfig/goyang/pull/95#issuecomment-480939133

all: What was the original problem which led to these uses statements being tracked? Was the problem related to identifier namespaces or other identifier resolution issues?

If so, it should be that we can avoid back-propagating the pointers to the uses statements by tracking those in the "forward" direction as we traverse the populated model.

robshakir commented 5 years ago

Taken from #95:

The issue is supporting the uses .... { when ... } and augment ... { when ... } statements. When we convert to entries today, we lose any context as to which grouping or augment statement contained any entry to start with.

An alternative to tracking this per parent Entry (what was implemented here) is for us to rather track these in the child entries -- however, this would then be separate to where we store the contents of the when statement. We essentially would need some new way to describe these conditions - and then subsequently validate them.

LeonGWang commented 5 years ago

Sorry I don't quite understand what is meant by tracking in the "forward" direction.

As for the tracking Uses statement in the child entries, what would be the advantage for this approach?

robshakir commented 5 years ago

On the latter, the reason to do this on the child entries is we know whether these are subject to when conditions -- we could use this information to track grouping or augment provenance only in the case that the Entry is subject to a when clause. This would mean that most entries wouldn't have anything to do with their provenance stored - reducing this issue.

andaru commented 5 years ago

Thanks Rob, yeah that is what I was referring to. "forward" is a poor choice of words sorry Leon.

To achieve what Rob is describing, I believe we'd need some form of callback function prototype or interface definition to support this, as the when statement argument (an XPath expression) refers to data node instances, something which goyang isn't itself concerned with. This callback or interface would allow goyang to query a datastore implementation to determine whether the node was constrained by the when statement or not, providing the decision as to whether the provenance is recorded along with the uses itself being applied, or not (the validation step mentioned).

The alternative to this approach, would be something more like the approach used by libyang and pyang, which traverse the schema tree (including delving into uses statements) for each query. To do this, they keep track of the current "text" module and the "top" module seen, providing the namespace/prefix/other information, in a "schema context" object. This means they need not record provenance data, because each time you ask the library about some schema node, it follows the path from the root (or some parent schema node identifier).

robshakir commented 5 years ago

I don't think we should make goyang aware of the datatree. AFAICS, the approach that we've generally taken up to now is such that we make the Entry something that a downstream tool (e.g., ygot) can work with to implement validations. The ytypes library in ygot does this, and actually implements the latter type of traversal [0] that @andaru mentions.

In my view, the intent of the original change was (as above) to allow us to store the uses and augments such that when conditions could be implemented. Essentially we said:

let $e_schema be a yang.Entry that we are examining
let $e_data be a datatree node corresponding to an instance of the yang.Entry described by $e_schema
for $t in $e_schema.Uses or $e_schema.Augments:
  if $t.When() != nil:
    check_condition($e_schema, $t.When(), $e_data)

where check_condition would check $e_data against $e_schema using the contents of $t.When() as the condition. This seems a reasonable design to be in the downstream tool, and goyang can fill the requirements of providing $e_schema to be traversed over. The inefficiency that we've introduced, as I see it, is that we're storing Uses and Augments even when there are no when conditions on the downstream nodes. The optimisation we can likely make is simply to only store it when When would be non-nil for the Uses or Augment. We don't actually need it for when statements that are on nodes that themselves are schema tree nodes.

Is there an argument to make goyang at all aware of the implementation of $e_data? This seems like it would be required for a callback, and would rather introduce the idea of data tree validation at all into goyang which seems blurry to me.

Cheers, r.

[0]: This is really expensive, and not a great idea on any large data tree instance. We have some prototypes and TODOs to fix this implementation.

andaru commented 5 years ago

The callback I refer to is the check_condition function. I believe this has to be provided by the data store implementation if goyang is not involved in data tree considerations.

I agree there's no need for goyang to consider data itself.

---- On Mon, 15 Apr 2019 18:10:11 -0700 notifications@github.com wrote ----

I don't think we should make goyang aware of the datatree. AFAICS, the approach that we've generally taken up to now is such that we make the Entry something that a downstream tool (e.g., ygot) can work with to implement validations. The ytypes library in ygot does this, and actually implements the latter type of traversal [0] that @andaru mentions.

In my view, the intent of the original change was (as above) to allow us to store the uses and augments such that when conditions could be implemented. Essentially we said:

let $e_schema be a yang.Entry that we are examining let $e_data be a datatree node corresponding to an instance of the yang.Entry described by $e_schema for $t in $e_schema.Uses or $e_schema.Augments: if $t.When() != nil: check_condition($e_schema, $t.When(), $e_data)

where check_condition would check $e_data against $e_schema using the contents of $t.When() as the condition. This seems a reasonable design to be in the downstream tool, and goyang can fill the requirements of providing $e_schema to be traversed over. The inefficiency that we've introduced, as I see it, is that we're storing Uses and Augments even when there are no when conditions on the downstream nodes. The optimisation we can likely make is simply to only store it when When would be non-nil for the Uses or Augment. We don't actually need it for when statements that are on nodes that themselves are schema tree nodes.

Is there an argument to make goyang at all aware of the implementation of $e_data? This seems like it would be required for a callback, and would rather introduce the idea of data tree validation at all into goyang which seems blurry to me.

Cheers, r.

[0]: This is really expensive, and not a great idea on any large data tree instance. We have some prototypes and TODOs to fix this implementation.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

andaru commented 5 years ago

@robshakir One part that's not clear to me, is if there's no when statement, what would the alternative for the datastore implementation be? Presently, goyang extends all uses statements presently (by calling ToEntry and merging the results into the tree), and a similar situation occurs for augment statements. To me, your suggestion implies that the datastore implementation would need to take care of more of the work (including namespace changes and so on). I agree that's probably the right approach, but it appears to be a bit of a departure from today. Am I understanding that correctly?

robshakir commented 5 years ago

Absolutely, I agree that the check_condition function needs to be supplied by the datastore implementation. My argument was just that as long as we expose, as public functions or struct fields, the different elements that we need, we need not even define anything further here (i.e., we don't necessarily need to provide an interface that this function much satisfy).

robshakir commented 5 years ago

I think this depends on what we decide to store in the contents of the when statement. Today, you're right that the data tree implementation needs to do some additional work to figure out what the contents of the when statement actually meant. We might want to break this down, or provide some additional data (like the mapping of prefix to module name) which makes things easier for the data tree implementation to work this out.

It wasn't quite clear to me what you were asking w.r.t there being no when statement. In this case, this is essentially what we have today, where a data tree implementation doesn't have visibility into when statements when they do exist.

openconfig / goyang

Tracking 'uses' for Entries #96