openconfig / public

Repository for publishing OpenConfig models, documentation, and other material for the community.
Apache License 2.0

openconfig-yang-types.yang: Incorrect definition of pattern? #175

Open guangyingzheng opened 5 years ago

guangyingzheng commented 5 years ago

In openconfig-yang-types.yang, the string patterns all begin with "^" and end with "$". However, RFC 6020 defers to [XSD-TYPES] for the regular expression syntax, and [XSD-TYPES] states that "the regular expression language defined here implicitly anchors all regular expressions at the head and tail." That means "^" and "$" should not be there. When we use third-party commercial software to validate the RPC XML, it always reports an error, for example:

Error: character content of element "system-id-mac" invalid; must be a string matching the reguler expression "^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$"

In this example I use openconfig-lacp.yang, which uses the type defined in openconfig-yang-types.yang:

leaf system-id {
  type oc-yang:mac-address;
  description
    "MAC address that defines the local system ID for the aggregate interface";
}

FYI: [RFC 6020] 9.4.6. The pattern Statement

The "pattern" statement, which is an optional substatement to the "type" statement, takes as an argument a regular expression string, as defined in [XSD-TYPES]. It is used to restrict the built-in type "string", or types derived from "string", to values that match the pattern.

[XSD-TYPES] http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#regexs

A ·regular expression· R is a sequence of characters that denote a set of strings L(R). When used to constrain a ·lexical space·, a regular expression R asserts that only strings in L(R) are valid literals for values of that type.

Note: Unlike some popular regular expression languages (including those defined by Perl and standard Unix utilities), the regular expression language defined here implicitly anchors all regular expressions at the head and tail, as the most common use of regular expressions in ·pattern· is to match entire literals. For example, a datatype ·derived· from string such that all values must begin with the character A (#x41) and end with the character Z (#x5a) would be defined as follows:

<simpleType name='myString'>
 <restriction base='string'>
  <pattern value='A.*Z'/>
 </restriction>
</simpleType>

In regular expression languages that are not implicitly anchored at the head and tail, it is customary to write the equivalent regular expression as: ^A.*Z$

where "^" anchors the pattern at the head and "$" anchors at the tail.
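
To make the mismatch concrete, here is a minimal Python sketch of the two readings of the mac-address pattern quoted above. It rests on some assumptions: Python's `re` module stands in for both regex dialects (it is neither POSIX nor XSD), the helper names `posix_style_match` and `xsd_style_match` are hypothetical, and the XSD reading is approximated by matching the whole string while treating a leading "^" and a trailing "$" as ordinary characters, which is how a strict [XSD-TYPES] validator would appear to interpret them.

```python
import re

# Pattern as it appears in the validator error message above.
OC_PATTERN = r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$"


def posix_style_match(pattern: str, value: str) -> bool:
    """Unanchored search: '^' and '$' in the pattern act as anchors."""
    return re.search(pattern, value) is not None


def xsd_style_match(pattern: str, value: str) -> bool:
    """Approximate XSD reading: the whole value must match, and a leading
    '^' or trailing '$' is an ordinary character, not an anchor."""
    prefix = suffix = ""
    body = pattern
    if body.startswith("^"):
        prefix, body = re.escape("^"), body[1:]
    # Simplification: a '$' preceded by a backslash is assumed to be escaped.
    if body.endswith("$") and not body.endswith("\\$"):
        suffix, body = re.escape("$"), body[:-1]
    return re.fullmatch(prefix + body + suffix, value) is not None


mac = "00:11:22:33:44:55"
print(posix_style_match(OC_PATTERN, mac))            # anchors honoured
print(xsd_style_match(OC_PATTERN, mac))              # anchors read as literals
print(xsd_style_match(OC_PATTERN, "^" + mac + "$"))  # literal '^' and '$' present
```

Under the XSD-style reading, a valid MAC address is rejected unless it literally starts with "^" and ends with "$", which is consistent with the validator error reported above.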

guangyingzheng commented 5 years ago

Hi @robshakir, could you please confirm this issue?

robshakir commented 5 years ago

Hi,

We have staged a change that adds an extension indicating that POSIX regexps are used in the file. Per the OpenConfig style guide, POSIX-compatible regexps are always used in OpenConfig models.

Cheers, r.
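
For anyone who needs to feed these models to a validator that follows [XSD-TYPES] strictly, one possible workaround is to rewrite the POSIX-style patterns before validation. The sketch below is only an illustration under stated assumptions: `strip_posix_anchors` and `validate_whole_string` are hypothetical helpers, not part of any OpenConfig tooling, and Python's `re.fullmatch` stands in for an implicitly anchored matcher. Stripping is only safe for patterns without top-level alternation; for example "^a|b$" does not mean the same thing once the anchors are removed.

```python
import re


def strip_posix_anchors(pattern: str) -> str:
    """Drop a leading '^' and an unescaped trailing '$' from a pattern.

    Simplification: a '$' preceded by a backslash is assumed to be escaped.
    """
    if pattern.startswith("^"):
        pattern = pattern[1:]
    if pattern.endswith("$") and not pattern.endswith("\\$"):
        pattern = pattern[:-1]
    return pattern


def validate_whole_string(pattern: str, value: str) -> bool:
    """Match the entire value, as an implicitly anchored dialect would."""
    return re.fullmatch(strip_posix_anchors(pattern), value) is not None


posix_pattern = r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$"
print(strip_posix_anchors(posix_pattern))
print(validate_whole_string(posix_pattern, "00:11:22:33:44:55"))
print(validate_whole_string(posix_pattern, "00:11:22:33:44"))
```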

rinhomdf commented 4 years ago

Hi, has this issue been addressed?

github-actions[bot] commented 1 month ago

This issue is stale because it has been open 180 days with no activity. If you wish to keep this issue active, please remove the stale label or add a comment, otherwise will be closed in 14 days.