asam-ev / qc-framework

Mozilla Public License 2.0

A Proposal for a UID for rules across different domains in Quality Checker #5

Closed MatteoRagni closed 7 months ago

MatteoRagni commented 9 months ago

Introduction

Is your feature request related to a problem? Please describe.

This feature request discusses the possibility of introducing a unique identifier (UID) for rules in the quality checker framework. The feature request has a related pull request.

Discussion

Current Candidate Solution

Describe the solution you'd like

The design of the rule UID has been discussed in the OpenDRIVE subgroup, and the design described here has been introduced in the Framework workgroup. Even though it is still a work in progress, we have reached an initial design stage that is good enough to open a PR for the framework documentation.

The current design of the UID is an (unfortunately quite long) string that encapsulates a sequence of concepts, which together identify the rule across the different domains. The concepts are ordered and separated by a separation character.

In general, rules will be queried, and querying should allow moving from generic to specific information. A query that can be performed directly on the UID is advisable, for example using UNIX pattern matching.

The proposal considers a notation in which concepts are separated by a well-defined character, like :

ASAM Concepts

All the other sub-concepts for ASAM are subject to discussion during actual definition of rules.

Example

Let's start with an example of a very complete UID and break down its content. The reference example uses a rule listed in the Rules subsection of one of the ASAM standards for OpenDRIVE. These subsections were first introduced in revision 1.6 of the standard.

ASAM OpenDRIVE Standard 1.6.0, Chapter 7.2 - Road Reference Line, subsection Rule, first point:

Each road shall have a reference line

Starting from the initial concept of the emanating entity, this notation can be used to give a UID to the quoted rule:

The complete UID would be something like:

asam.net:xodr:1.6.0:road.planview.geometry.rl_exists:1
-------- ---- ----- ---------------------- --------- -
 |        |    |     |                      |        |
 entity   | version  |                    name     rule-version (optional, simplified)
        standard    rule-set (name is always last in rule-set dot-notation)

Regarding the validation schema itself, only the first concept is strictly required. Entities other than ASAM are not required to declare rules with the concepts stated above. The schema is a formalism that is required only for rules under asam.net:*. The proposed schema implicitly suggests a hierarchy of concepts, from the broadest one (entity) to the most specific one (rule version).

Example: if my company has defined its own rules, they may be something like antemotion.com:custom_rule:1 and antemotion.com:custom_rule:2, and these UIDs should be considered valid. However, custom rules should be encouraged to include all the concepts.
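The breakdown above can be sketched as a small parser. This is only an illustration, not part of the proposal: the function name `parse_uid` and the returned dictionary keys are hypothetical, and the sketch assumes that asam.net UIDs carry entity, standard, version and rule-set/name, with the trailing rule version being optional.

```python
SEPARATOR = ":"  # concept separator used in the examples of this issue


def parse_uid(uid):
    """Split a UID into its ordered concepts (hypothetical helper)."""
    entity, *rest = uid.split(SEPARATOR)
    if not entity:
        raise ValueError("the entity concept is required")
    if entity != "asam.net":
        # Other entities are not bound to the schema: keep the remaining
        # concepts as they are, without interpreting them.
        return {"entity": entity, "rest": rest}
    # Assumption: standard, version and rule are mandatory under asam.net,
    # while the rule version is optional.
    if len(rest) not in (3, 4):
        raise ValueError(f"unexpected number of concepts in {uid!r}")
    standard, version, rule = rest[:3]
    return {
        "entity": entity,
        "standard": standard,
        "version": version,
        "rule": rule,
        "rule_version": rest[3] if len(rest) == 4 else None,
    }


full = parse_uid("asam.net:xodr:1.6.0:road.planview.geometry.rl_exists:1")
custom = parse_uid("antemotion.com:custom_rule:1")
```

Here `full["rule"]` holds the dot-notation part (rule-set plus name), while `custom` only exposes the entity and the uninterpreted remainder.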

Query rules

The proposed formalism allows querying rules (or sets of rules) using UNIX-style wildcard notation (python fnmatch). In the implementation of the checker framework, it is advisable that the end user can enable or disable checks based on the UID of the rules that each check verifies. This encourages a framework-agnostic approach to check selection (i.e., one that also works for checker frameworks other than the one offered by the ASAM OS project).
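A minimal sketch of such a query with Python's `fnmatch` module follows. The rule list and the `select` helper are made up for illustration; only the first and last UIDs appear in this issue, the others are hypothetical.

```python
from fnmatch import fnmatchcase

# Illustrative rule registry; UIDs marked "hypothetical" are invented here.
RULES = [
    "asam.net:xodr:1.6.0:road.planview.geometry.rl_exists:1",
    "asam.net:xodr:1.6.0:road.lane.hypothetical_rule:1",  # hypothetical
    "asam.net:xosc:1.0.0:hypothetical.rule:1",  # hypothetical
    "antemotion.com:custom_rule:1",
]


def select(pattern, rules=RULES):
    """Return the UIDs matching a UNIX-style wildcard pattern."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.
    return [uid for uid in rules if fnmatchcase(uid, pattern)]


all_xodr = select("asam.net:xodr:*")
planview = select("asam.net:xodr:1.6.0:road.planview.*")
custom = select("*:custom_rule:?")
```

Because the concepts are ordered from generic to specific, a trailing `*` is enough to narrow a selection step by step: entity, then standard, then version, then rule-set.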

Examples:

OpenDRIVE QC Sub group

The main task inside the Standard QC subgroups is to define the best practice for naming their domain rules, taking for granted the root concepts asam.net:(qc|xodr|xosc|osc):?.?.? (entity, standard, version). The concepts that should be defined are:

It is still open to discussion whether rule-set and name should be separated by the concept separation character : or whether the name should be included as the very last element of the rule-set.

The actual format of the rule-set and name may use a Python-like notation (since Python plays an important role in the project).

We can consider imposing a rule-set with the following pattern:

^(?P<RULESET>(([a-zA-Z][\w_]*)\.)*)(?P<NAME>[a-zA-Z][\w_]*)$

Which can be explained more or less as follows:

Examples:

road.planview.geometry.rl_exists
# RULESET: road.planview.geometry.
# NAME: rl_exists

only_name_no_ruleset
# RULESET: <empty string>
# NAME: only_name_no_ruleset

r.u.l.e.s.e.t.name
# RULESET: r.u.l.e.s.e.t.
# NAME: name

Playground: https://regex101.com/r/c9MQWE/1
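The same split can be reproduced with Python's `re` module, which supports the `(?P<NAME>...)` named-group syntax used in the pattern. The helper name `split_rule` is an assumption made for this sketch.

```python
import re

# Pattern copied from the proposal above.
RULE_PATTERN = re.compile(
    r"^(?P<RULESET>(([a-zA-Z][\w_]*)\.)*)(?P<NAME>[a-zA-Z][\w_]*)$"
)


def split_rule(rule):
    """Return (rule-set, name); the rule-set keeps its trailing dot."""
    match = RULE_PATTERN.match(rule)
    if match is None:
        raise ValueError(f"not a valid rule-set/name string: {rule!r}")
    return match.group("RULESET"), match.group("NAME")


with_ruleset = split_rule("road.planview.geometry.rl_exists")
name_only = split_rule("only_name_no_ruleset")
```

Strings whose elements start with a digit or contain an empty element (for example `road..rl_exists`) do not match and raise `ValueError` in this sketch.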

Other Solutions Discussed

Alternatives considered:

Pain Points and Weaknesses

  1. The UID is very long. It can be a problem to search for it efficiently in databases (it is not an integral type, nor does it have a constant size), and it may raise a Path Too Long error (I'm looking at you, Windows).

    The length of the UID comes with its advantages. It is possible to understand much of the information about a rule just by looking at its UID, even without reading the rule description. Furthermore, UIDs can be navigated much like a file system. It is possible to change base location using either an absolute or a relative reference (a relative reference may start with a . character), or to list all the sub-UIDs from that location.

    The total number of rules is low enough (probably in the thousands, considering all domains) that using this string in database queries is a non-issue in terms of efficiency. In specific implementation scenarios, if a faster query is needed, a checksum of the original UID can be used as primary key (integral type with constant size, precalculated).

    Using a directory structure that mimics the structure of the UID, or using the entire UID for filenames, can rapidly reach lengths that trigger the Path Too Long error on Windows. This must be taken into consideration in all implementations (even when documenting the project).
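One possible way to derive such a fixed-size integral key is sketched below. The issue does not specify a checksum algorithm; truncating a SHA-256 digest to a signed 64-bit integer is an assumption of this sketch, and `uid_key` is a hypothetical name.

```python
import hashlib


def uid_key(uid):
    """Derive a constant-size integer key from a UID (illustrative only)."""
    digest = hashlib.sha256(uid.encode("utf-8")).digest()
    # Keep the first 8 bytes and read them as a signed 64-bit integer,
    # a range that common database integer columns can store.
    return int.from_bytes(digest[:8], "big", signed=True)


key = uid_key("asam.net:xodr:1.6.0:road.planview.geometry.rl_exists:1")
```

The key is deterministic, so it can be precalculated once per rule. Collisions are possible in principle with any truncated hash, so an implementation would still want to store the full UID next to the key.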

  2. The version of the standard in the UID may create confusion

    This is the biggest weak point in the proposal. There are two main confusing situations:

    • from an end-user perspective, requesting or removing the check for a certain rule requires knowing the first version of the standard in which the rule appeared. It must also be taken into consideration that the UID alone is not sufficient to actually validate a rule: inside the definition of the rule there will be a range of versions for which the rule can be validated, and the current file should fall inside that range.
    • from a developer perspective, it may be difficult to understand which rule ID to insert in a checker.

    Alongside this confusion there is a great advantage: if something changes dramatically between two versions of a standard, there can be two different rule IDs (with the same rule-set and rule name) for two different rules. The part that differs is the version of the standard in the UID (asam.net:xodr:1.6.0:road.planview.geometry.rl_exists:1 != asam.net:xodr:1.8.0:road.planview.geometry.rl_exists:1). The probability that this will happen is quite low.
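The version-range check mentioned in the first bullet could look roughly like the sketch below. How a rule definition stores its range is not defined in this issue: the `first` and `last` parameters, the inclusive upper bound and the function names are all assumptions.

```python
def parse_version(text):
    """Turn '1.6.0' into (1, 6, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def rule_applies(file_version, first, last=None):
    """Check whether a file's standard version falls in a rule's range.

    `first` is the first version in which the rule is valid; `last` is an
    assumed inclusive upper bound, with None meaning "still valid".
    """
    version = parse_version(file_version)
    if version < parse_version(first):
        return False
    return last is None or version <= parse_version(last)


in_range = rule_applies("1.7.0", first="1.6.0")
too_old = rule_applies("1.5.0", first="1.6.0")
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as "1.10.0" sorting before "1.6.0".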

  3. Is a simple counter for versions enough, or do we need semantic versioning?

    Discussion in the original subgroup determined that a version counter is more than enough to define the version. Initial suggestions used semantic versioning, but this made the UID longer and more difficult to read and interpret. Still, please note that the inclusion of the rule version in the UID is still open for discussion.

MatteoRagni commented 8 months ago

The ruleset regex, in order to follow the suggestion in #7 by @andreaskern74, should be as follows:

^(?P<RULESET>(([a-z][\w_]*)\.)*)(?P<NAME>[a-z][\w_]*)$

(removed capitals, forced snake case, dot-separated concepts, name included in the ruleset string as the last dot-separated character group).
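One caveat worth checking against the pattern as posted: in Python, `\w` also matches uppercase letters, so only the first character of each element is forced to lowercase. The `STRICT` variant below is a hypothetical tightening that is not part of the proposal.

```python
import re

# Pattern as posted in the comment above.
POSTED = re.compile(
    r"^(?P<RULESET>(([a-z][\w_]*)\.)*)(?P<NAME>[a-z][\w_]*)$"
)

# Hypothetical stricter variant: lowercase ASCII letters, digits and
# underscores only, which is what snake case usually implies.
STRICT = re.compile(
    r"^(?P<RULESET>(([a-z][a-z0-9_]*)\.)*)(?P<NAME>[a-z][a-z0-9_]*)$"
)

mixed_case = "road.planView.rl_exists"
posted_accepts = POSTED.match(mixed_case) is not None
strict_accepts = STRICT.match(mixed_case) is not None
```

With these definitions the posted pattern still accepts the mixed-case string, while the stricter variant rejects it and keeps accepting `road.planview.geometry.rl_exists`.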

MatteoRagni commented 8 months ago

The rule UIDs used in the documentation should be updated as soon as possible with real UIDs, see #22