core-wg / yang-cbor

Internet-Draft for CBOR representation of YANG data
5 stars 10 forks source link

validate example SID file #130

Closed abierman closed 2 years ago

abierman commented 2 years ago

ietf-system@2014-08-06.sid.txt

The ietf-system SID file was extracted by hand from the latest draft. All the broken path strings (for line wrap) were fixed.

The file was loaded into the yumapro internal JSON parser using the sid-file container template. Passed. No errors.

However, the SID files produced by pyang 2.5.0 (latest version) are incorrect and unusable. We will probably write our own SID file generator and not use pyang. The program is not really supported very well, and the SID generation is not at all ready for general use

lemikev commented 2 years ago

Would you please share the list of issues found with pyang 2.5.0.

abierman commented 2 years ago

I will verify on another linux PC to make sure my pyang install is correct. One issue is described in #79 I will have more time this weekend to look at it.

abierman commented 2 years ago

I updated pyang on another machine to 2.5.2. The problem is the same as I saw on 2.5.0

  pyang --sid-generate-file=10000:200 t60.yang

The SID file generated starts with the assignment-ranges, not the full sid-file that is defined in the draft t60.yang.txt t60@2013-10-01.sid.txt

Another issue: If --sid-generate-file has no value, pyang will hang and require control-C to quit

andy@andy-office-pc:~/swdev/ypwork/netconf/modules/test/pass$ pyang --sid-generate-file t60.yang 
^CTraceback (most recent call last):
  File "/usr/local/bin/pyang", line 4, in <module>
    __import__('pkg_resources').run_script('pyang==2.5.2', 'pyang')
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 658, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1438, in run_script
    exec(code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/pyang-2.5.2-py2.7.egg/EGG-INFO/scripts/pyang", line 545, in <module>
    run()
  File "/usr/local/lib/python2.7/dist-packages/pyang-2.5.2-py2.7.egg/EGG-INFO/scripts/pyang", line 335, in run
    text = sys.stdin.read()
KeyboardInterrupt
andy@andy-office-pc:~/swdev/ypwork/netconf/modules/test/pass$ 
abierman commented 2 years ago

Another issue: missing input and output nodes for RPC

lemikev commented 2 years ago

Thanks for the info.

How stable is the .sid file format? Since the initial introduction, this file format changed 7 times (draft r00, r03, r10, r11, r12, r16, r18). It will be helpful for developers to settle down on a stable format.

abierman commented 2 years ago

Valid concern. There are new features like encoding metadata that will change the SID file format again to number the "annotation" statements in RFC 7952

I suggest we add a new leaf to the SID file (very first leaf)

leaf sid-version {
   type string;
   description 
       "Identifies the 'ietf-sid-file' module revision date used to produce this SID file";
}
lemikev commented 2 years ago

About "There are no "item" entries for the /t60:rpc1/input and /t60:rpc1/output schema nodes"

If you look at https://datatracker.ietf.org/doc/html/draft-ietf-core-comi-11#section-4.6.1 The encoding the the RPC don't require a SID for the input or output, a single SID assigned to the RPC is used instead.

 REQ:  POST </c/Opq?k=myserver>
       (Content-Format: application/yang-data+cbor; id=sid)
 {
   60002 : {
     1 : "2016-02-08T14:10:08Z09:00" / reset-at (SID 60003) /
   }
 }

 RES:  2.05 Content (Content-Format: application/yang-data+cbor; id=sid)
 {
   60002 : {
     2 : "2016-02-08T14:10:08Z09:18" / reset-finished-at (SID 60004)/
   }
 }

In https://datatracker.ietf.org/doc/html/draft-ietf-core-yang-cbor-17#section-4.2.1

   *  In the case of an 'RPC input' or 'RPC output', deltas are equal to
      the SID of the current schema node minus the SID of the 'RPC'.
lemikev commented 2 years ago

About the issue "If --sid-generate-file has no value, pyang will hang and require control-C to quit"

As a standard feature, pyang accept a .yang file in stdin. In your example, if you paste the content of t60.yang and than do a CTRL D, you should get in stdout:

ERROR, invalid range in argument, must be '<entry-point>:<size>'.

abierman commented 2 years ago

OK, understand pyang wait-for-input issue

RESTCONF uses both input and output in protocol messages. Our internal server uses choice and case nodes in internal subagent messages.

IMO the SID file should not be CoMI-specific because this will cause different variants to be created. The size increase in SID files in not significant due to recording nodes for input, output, choice, and case.

The SID file algorithm should only ignore YANG pre-processor nodes (e.g. augment, uses) and number all post-expansion data-def-stmts.

abierman commented 2 years ago

I just re-read draft-ietf-core-sid-latest.

I cannot find any normative text that says which specific data-def-stmts in the RFC 7950 ABNF cause "item" entries to be created in a SID file.

I cannot find any normative text that would allow 2 implementations of a SID file generator to produce the same SID file for the same set of YANG module inputs. Even the non-normative Appendix B is not clear does not specify what nodes are saved.

           enum data {
             value 3;
             description
               "The namespace for all data nodes, as defined in
               YANG.";
           }

This text is not accurate since 'rpc', 'notification' and 'action' are not data nodes.

From RFC 7950:

    o  data node: A node in the schema tree that can be instantiated in a
        data tree.  One of container, leaf, leaf-list, list, anydata, and
        anyxml.

Suggested new text:

       The namespace for all schema nodes represented by data definition statements,
       after conceptual expansion and removal of all 'augment' and 'uses' statements has been done.
lemikev commented 2 years ago

If I understand currently, you propose to assign SIDs to schema nodes as defined by RFC 7950.

schema node: A node in the schema tree.  One of action, container,
      leaf, leaf-list, list, choice, case, rpc, input, output,
      notification, anydata, and anyxml.

If no meaningful use cases are demonstrated for SIDs assigned to choice, case, input, output, I recommend to apply the rule 'If it ain't broke, don't fix it'.

For the "data" namespace, I agree that the text need to be fixed.

             enum data {
               value 3;
               description
                 "The namespace for all data nodes, rpc and action, as defined in
                 YANG.";
             }
abierman commented 2 years ago

The current draft does not say whether a node is encoded or not.

IMO this draft should map YANG schema items in a consistent manner. There are use-cases for mapping all schema nodes in the YANG schema item mapping. An implementation does not have to store the SIDs for schema nodes it does not care about.

IMO to do otherwise will insure there may be different SID file contents for each generator and each protocol that uses YANG/CBOR.

abierman commented 2 years ago

It is quite consistent with IETF procedures to limit the scope of the SID draft to CoMI. Now I think a 1 size fits all SID numbering scheme is not realistic. The unofficial YANG Push requirements should not cause the CoMI solution to be changed.

It the NETCONF and NETMOD WGs ever get far enough along to standardize a YANG Push encoding using CBOR then the SID file RFC update or a new type of file will be created then.

The current list of SID assignments is schema-node without choice, case, input, output. It also includes rc:yang-data (RFC 8040) and will include sx:structure (RFC 8791)

Perhaps change "data node" to "schema node" for the data namespace description.

It could be argued that these drafts refer to conceptual YANG schema nodes and they are not coupled to YANG statements. Then any mention of extensions is not needed.

abierman commented 2 years ago

There is 1 path string in the ietf-system example that is wrong. It appears that pyang is generating these paths correctly.

YANG SID failed: No object '/ietf-system:set-current-datetime/current-datetime' found in module 'ietf-system' Apply 'data' SID item failed (definition segment not found)

OLD

     {
       "namespace": "data",
       "identifier": "/ietf-system:set-current-datetime/
                     current-datetime",
       "sid": 1716
     },

NEW

     {
       "namespace": "data",
       "identifier": "/ietf-system:set-current-datetime/
                     input/current-datetime",
       "sid": 1716
     },
lemikev commented 2 years ago

The .sid file in https://comi.space/list-sid/ietf/public/ietf-system@2014-08-06.sid generated a couple of years ago with pyang is aligned with the NEW version. Where did you find this OLD version?

The proper path is for sure the NEW. The path must include input or output to avoid a path collision when the same leaf name is used in both directions.

abierman commented 2 years ago

I got the draft from this page https://core-wg.github.io/yang-cbor/

The example is wrong in that draft. It is also wrong in the code/ietf-system.sid file

  {
    "namespace": "data",
    "identifier": "/ietf-system:set-current-datetime/
                  current-datetime",
    "sid": 1716
  },
lemikev commented 2 years ago

I will update the pyang sid.py plugin for the latest file format including the option to generate SIDs for all schema nodes. This way, we will be able to generate a new ietf-system.sid file example.

abierman commented 2 years ago

OK -- looks like the line wrap was done by hand. That is unfortunate because the extracted SID file will not be usable until the path string is correct. Many path strings will overflow 72 chars, making RFC extracted SID files useless.

When you examine a SID file for an IETF module (even ietf-system) you cannot help notice the enormous number of long duplicate strings for data items. The alphabetical sorting is good. I wish the SID file contents could be more terse (e.g. nested so each segment is printed once)

cabo commented 2 years ago

We now provide an incantation that can be used to convert the text in the draft into the actual sid file. For an RFC that wants to include a SID file, use RFC 8792; we could add this here.

I agree that SID-files are not very well-compressed, but fixing that would be a larger project.

abierman commented 2 years ago

Benoit is willing to add SID file support to the Yang Catalog tools. It will really help to extract and store valid SID files.

Schema-based may be too rigid and verbose for schema nodes. Simple trees somehow...

  ietf-system:system-state 1720 {
       clock 1721 {
          boot-datetime 1722
          current-datetime 1723
       }
       platform 1724 {
           machine 1725
           os-name 1726
           os-release 1727
           os-version 1728
        }
        # etc...
   }
cabo commented 2 years ago

This is missing YANG's semicolons :-) I like that file format, but it is missing namespace. Can we define something like this in a new document, leaving the current Sid-file in place? (like yang tree and yang...)

abierman commented 2 years ago

Yes of courseit is missing semicolons. :-( The namespace is in the first schema node name. Just like JSON YANG, except no keywords. Just [schema-node, sid] tuples.

ietf-system:system-state 1720 {
   clock 1721 {
      boot-datetime 1722;
      current-datetime 1723;
   }
   platform 1724 {
       machine 1725;
       os-name 1726;
       os-release 1727;
       os-version 1728;
    }
    # etc...

}

mcr commented 2 years ago

It's a long thread that I didn't quite follow. What I'm hearing you say, Andy, is that the document does not provide a deterministic way to assign SIDs to nodes. Currently, we have only pyang, and so whatever algorithm it uses might not be reproduced clearly by another implementation. (This is bad, but maybe not fatal, if we are willing to accept changes/fixes from PS -> IS)

abierman commented 2 years ago

Correct. There is a recommended algorithm. There is only 1 SID file generator implementation right now -- pyang. It seems the official SID file is simply whatever somebody says . E.g., the SID file included in an RFC.

lemikev commented 2 years ago

Hi Michael

The list of YANG items assigned to SIDs is deterministic. However, at least one sentence in the draft need to be fixed to be fully aligned with the intent, the 'data' namespace includea rcp, action and notification, not just data nodes.

Which SID is assigned to which YANG item is not deterministic, this is why we have .sid files. SID assignment are dependent of the version history, if generated manually or automatically, and might be impacted by the specific generator(s) used.

.sid files are the final authority to known which SID need to be used.

Note: It's urgent to stabilize the .sid file format and to avoid alternative formats if we want to create an ecosystem. As mentioned above, this file format changed 7 times since its introduction.

abierman commented 2 years ago

I agree with the urgency. Only corrections should be made.

Only 7 times? Much better than the NETMOD WG with the client/server/ssh/tls/keystore modules. The over/under for combined draft totals at publication is 217 :-)

It would be wise to add a sid-version leaf (ala yang-version) as the first leaf. The other option is to just use a different top-element name later, like 'sid2-file' and foo@date.sid2

mcr commented 2 years ago

While the SID file is authoritative, what merits a SID sounded looser.

cabo commented 2 years ago

The namespace is in the first schema node name.

As usual, I was talking about the other namespace (module/identity/feature/data).

abierman commented 2 years ago

Sorry in advance for the long post...

The size of the I-D or RFC is going to about 2X if a SID file wrapped at 72 chars is added as an appendix. Not looking forward to an extra 25 - 50 pages of meaningless JSON not intended for humans to read. Extracting a useable SID file from such a text file will require special tools.

IMO this is a bad idea. Much better to make real SID files available online and put a link in the I-D. Maybe include the SID file in RFC only (plus the link).

Phase 2?

What if pyang could rewrite the YANG file to inject yang-sid extension statements? This would require more tools work so not ready to propose now.

Add to ietf-yang-sid:

 extension yang-sid {
     description
         "Identifies the YANG SID assignment for a YANG schema item.
          The 'sidnum' argument MUST be a valid non-zero SID number
          conforming to the 'sid' data type. The YANG schema item represented
          by the statement containing this external statement is assigned
          the specified SID number.";
      argument sidnum;
 }

pyang would inject the yang-sid extensions or that could be done by hand. This is much more robust way to ensure SID assignments do not change even if the YANG statements are refactored.

 module ietf-system {
      ...
      import ietf-yang-sid { prefix sid; }
      ...

      // module SID
      sid:yang-sid 1700;

      // feature SID
      feature foo {
          sid:yang-sid 1775;  // e.g. added in 2nd release
      }

       // identity SID
      identity authentication-method {
          sid:yang-sid 1701;
       }

      // data SID
      container system-state {
          sid:yang-sid 1720;
          container clock {
               sid:yang-sid 1721;
                ....
           }
       }
 }
core-bot commented 2 years ago

In 2021-12-04, at 20:06, Andy Bierman @.***> wrote:

Sorry in advance for the long post...

The size of the I-D or RFC is going to about 2X if a SID file wrapped at 72 chars is added as an appendix. Not looking forward to an extra 25 - 50 pages of meaningless JSON not intended for humans to read. Extracting a useable SID file from such a text file will require special tools.

Right. That’s why the current text of core-sid says:

For a YANG module approved for publication as an RFC, a ".sid" file SHOULD be included in the Internet-Draft as a source code block. This ".sid" file is to be extracted by IANA/the expert reviewer and put into the YANG SID Registry (Section 7.5) along with the YANG module. The ".sid" file MUST NOT be published as part of the RFC: the IANA Registry is authoritative and a link is to be inserted in the RFC.

So as long as the registry doesn’t have it, it can be in the I-D; I would expect drafts these days to point to the github repository where the SID file can be obtained as a separate file (possibly even negating the SHOULD). It never goes into the RFC but lives with IANA.

IMO this is a bad idea. Much better to make real SID files available online and put a link in the I-D.

Yes.

Maybe include the SID file in RFC only (plus the link).

The IANA reference, I’d say.

Phase 2?

What if pyang could rewrite the YANG file to inject yang-sid extension statements? This would require more tools work so not ready to propose now.

It also would add more noise to the YANG. (YANG is super-noisy already, so I don’t know how much of a concern that is, but I see a definite downside.)

I sure like what I see below. Let’s write that extension up...

Grüße, Carsten

Add to ietf-yang-sid:

extension yang-sid { description "Identifies the YANG SID assignment for a YANG schema item. The 'sidnum' argument MUST be a valid non-zero SID number conforming to the 'sid' data type. The YANG schema item represented by the statement containing this external statement is assigned the specified SID number."; argument sidnum; }

pyang would inject the yang-sid extensions or that could be done by hand. This is much more robust way to ensure SID assignments do not change even if the YANG statements are refactored.

module ietf-system { ... import ietf-yang-sid { prefix sid; } ...

  // module SID
  sid:yang-sid 1700;

  // feature SID
  feature foo {
      sid:yang-sid 1775;  // e.g. added in 2nd release
  }

   // identity SID
  identity authentication-method {
      sid:yang-sid 1701;
   }

  // data SID
  container system-state {
      sid:yang-sid 1720;
      container clock {
           sid:yang-sid 1721;
            ....
       }
   }

}

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android.

abierman commented 2 years ago

Glad it is solved for RFC publication. That only leaves the first 5 years of the YANG modules's existence to worry about ;-) The I-D submission WEB page should have a way to upload a SID file for an I-D but github is good enough

The YANG syntax is very powerful, yet simple. You can use YANG to extend YANG. YANG files can have comments and annotations, something JSON is not so good at. The schema node noise can be moved to a separate file. YANG supports "patch".

 module istf-system-sid {
      import ietf-system { prefix sys; }
      import ietf-yang-sid { prefix sid; }

      deviation /sys:system-state {
          deviate add {
              sid:yang-sid 1720;
          }
      }

      deviation /sys:system-state/sys:clock {
          deviate add {
              sid:yang-sid 1721;
          }
      }

     ....
 }

This is all standard and defined in RFC 7950. Unfortunately this patch cannot apply to modules, identities or features. Only schema nodes. New extensions would be required for that.

lemikev commented 2 years ago

Adding SIDs in yang files or using an data node tree representation was part of the options we look at. Those have been rejected for multiple reasons such as support of grouping, augment, identity, module, feature or other namespaces in the future.

The .sid files is a simple lookup database with used to map (namepace + identifier <-> SID). The JSON formal provide a simple text based representation for this lookup table, a single line of python is required to read or white a .sid file.

abierman commented 2 years ago

good point about groupings. the deviation approach still works even if groupings are used, but it introduces another file so does not solve the original problem.

I guess SID file size was not an important consideration compared to writing simple code to use it. The path approach at least numbers a parent before its children, which helps delta-SID.

abierman commented 2 years ago

One more point on the SID file and then I will shut up (The JSON format is fine)

The NETMOD WG is standardizing a YANG Instance File for YANG data that conforms to some YANG schema node. The dependency modules/revisions are handled there . It is not ready in time for SID anyway, but I like general solutions if possible.

abierman commented 2 years ago

pyang issues should be tracked on github. No other issues in this thread.