COVESA / vehicle_signal_specification

Vehicle Signal Specification - standardized way to describe automotive data
Mozilla Public License 2.0
307 stars 157 forks source link

Enforce "style" for VSS standard catalog #700

Closed erikbosch closed 2 months ago

erikbosch commented 6 months ago

Now and then the topic on what identifiers that shall be allowed in VSS comes up. As of today we are quite flexible in documentation, we sort of say that as long as it is valid Yaml we support it. This makes however tooling more complex if it is to handle identifiers with odd characters or allowed/default values containing blanks. We do state recommendations in some places like for allowed values and node names and we have some support in vss-tools to give warnings or even fail if using strict mode.

But we do not enforce strict mode as a criteria for standard catalog, so as of today we have signals like this that does not follow the style guide (Values like "01" are not recommended):

PidsA:
  datatype: string[]
  type: attribute
  allowed: ["01","02","03","04","05","06","07","08","09","0A","0B","0C","0D","0E","0F","10","11","12","13","14","15","16","17","18","19","1A","1B","1C","1D","1E","1F","20"]
  description: PID 00 - Array of the supported PIDs 01 to 20 in Hexadecimal.

Some ideas:

Example:

A signal name Vehicle.Räksmörgås is valid in VSS as it is a valid Yaml name. VSS-tools generic functionality shall support it. But we should not accept it in standard catalog as it does not follow the rules for the standard catalog. Exporters/Generators does not need to support Vehicle.Räksmörgås, but if so they should give an error.

But if we are to change like this we need to change the PId signals as well, but they may anyway be removed by #635

ppb2020 commented 5 months ago

In my mind, the PidsA approach isn't bad and should be allowed. My reasoning is that the signal is defined to be in the form of a string with restricted values. "01" seems quite reasonable if we view it as a string that can contain any IA5/ASCII (and perhaps UTF-8) characters. Otherwise, our validation tooling would have to start looking into allowed string and impose rules such as "cannot start with a '0'"...

I agree with clarifying what we allow and in what situations. I like the RFC approach of using "shall", "may", etc. and describing what happens when, for example, "may" is used.

Indeed generic VSS tooling must be able to handle any VSS syntax. The set of characters we support for node names should guarantee that generated or exported files of any type can be created, but leave it open for users with specific needs and restricted set of generated or exported files can nonetheless use other characters as necessary.

I agree with the change to CI that enforces strict mode, but allows for this validation to be turned off.

Regarding:

Exporters/Generators does not need to support Vehicle.Räksmörgås, but if so they should give an error.

I would phrase this as:

When not in strict mode, Exporters/Generators should generate an error only when the resulting file would not be valid for that type of file.

Or something to that effect.

erikbosch commented 5 months ago

In my mind, the PidsA approach isn't bad and should be allowed. My reasoning is that the signal is defined to be in the form of a string with restricted values. "01" seems quite reasonable if we view it as a string that can contain any IA5/ASCII (and perhaps UTF-8) characters. Otherwise, our validation tooling would have to start looking into allowed string and impose rules such as "cannot start with a '0'"...

We already do that in other cases, like if you have a boolean variable and it does not start with Is you will get a warning, and in strict mode an error. I see in general two reasons to restrict characters or style. One is to try to keep the same style for all signals in the standard VSS catalog (just because it looks good) , another is to make it easier to create generators/exporters, reduced need for escaping/replacing characters.

The PidsA example concerns the VSS allowed concept. Some exporters/generators generates C-style enums from that and depend on that the allowed value is a valid C enum literal name, but then you get problems with PidsA as something like below is not valid C:

enum PidsA {
  01,
  02,
 ...
}; 

That is reason why we have a recommendation to use [A-Z][A-Z0-9_]* for the standard catalog, to have something that fit enum style guides for languages like Java and C.

My view here is that IF we want to have and enforce a character set for "allowed" for standard catalog, then we should make sure that PidsA follow it. I.e. either adapt PidsA, or change/remove the recommendation

erikbosch commented 5 months ago

Meeting notes:

ppb2020 commented 3 months ago

Further thoughts

If some characters are not supported in some output formats, can we define an encoding that vss-tools could use in those cases? Similar to URIs, where disallowed characters (like '/') can be encoded through the use of a percent sign followed by two numbers (e.g., %2F for '/'), we could define such an encoding in vss-tools output.

As to PidsA, I am fine with either adapting it or changing the recommendation, with a slight preference for changing the recommendation given this would allow the values to match the expected or standard values represented with a significant zero.

erikbosch commented 3 months ago

For "enum" like PidsA we removed the recommendation in a documentation update. I do not think it was ever checked within vss-tools so it was sort of just adapting to reality.

Concerning converting not-allowed characters, it would technically not be a problem to add a "sanitize" option in vss-tools to replace characters, called either always or on request. The problem is likely to agree on which characters to convert and how - the same rule may not fit all use-cases.

erikbosch commented 3 months ago

Fixes #730