oasis-open / tosca-community-contributions

OASIS TC Open Repository: Manages TOSCA profiles, tests, and templates that are maintained by the TOSCA community. They are intended to be used as examples to help developers get started with TOSCA and to test compliance of TOSCA implementations with the standard.
https://github.com/oasis-open/tosca-community-contributions
Apache License 2.0

Generalise scalar-units definitions #145

Open pmjordan opened 1 year ago

pmjordan commented 1 year ago

TOSCA is intended to be applicable to any domain, but at present it defines only a few concrete scalar-units, some of which are domain specific. We could add further scalar-units as users demand them, but this seems a slow and reactive approach. Ideally we would allow users to define the scalar-units useful to them, e.g. in a similar manner to data type definitions. However, I can see that TOSCA processors would need knowledge of the meaning of the unit strings (like the ‘m’, ‘h’ and ‘d’ symbols which we already allow to mean minute, hour and day, as well as ‘s’ for seconds, in scalar-unit.time). An alternative would be to allow any scalar-unit from ISO 80000, since that covers many domains while using a fixed and defined set of prefixes (the familiar SI prefixes of ‘m’ for milli, ‘M’ for Mega etc., and including KiB etc. for bytes). There are units libraries for common languages which can handle such unit strings. ISO 80000 must be paid for, but a non-normative public document is at https://www.bipm.org/documents/20126/41483022/SI-Brochure-9.pdf
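To illustrate the fixed-prefix idea, here is a minimal Python sketch that derives a symbol-to-multiplier table from a base symbol and a fixed set of prefixes. The names `DECIMAL_PREFIXES`, `BINARY_PREFIXES` and `build_unit_symbols` are hypothetical and not part of TOSCA or any existing library, and the prefix set is deliberately cut off at 10^12.

```python
# Hypothetical sketch: derive unit symbols from a fixed set of prefixes.
# Only a handful of prefixes are listed; a real table would cover more.
DECIMAL_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def build_unit_symbols(base_symbol: str, binary: bool = False) -> dict[str, int]:
    """Map each prefixed symbol to its multiplier relative to the base unit."""
    table = {base_symbol: 1}
    for prefix, factor in DECIMAL_PREFIXES.items():
        table[prefix + base_symbol] = factor
    if binary:
        # Binary prefixes only make sense for units such as bytes.
        for prefix, factor in BINARY_PREFIXES.items():
            table[prefix + base_symbol] = factor
    return table


byte_units = build_unit_symbols("B", binary=True)
```

With a fixed prefix set, a processor only needs to be told the base symbol; the prefixed symbols follow mechanically.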

tliron commented 1 year ago

I definitely think this is possible. For Puccini's implementation the minimal requirements for a definition are as follows:

pmjordan commented 1 year ago

As I've said before, I think making the units string case insensitive is, in general, poor practice. Rather than having a parameter to indicate whether or not the units are case sensitive, a user could define unit aliases if this was really needed, e.g.


"B":   1
"b":   1
"kB":  1000
"kb":  1000
"KiB": 1024
"kib": 1024

tliron commented 1 year ago

I tend to agree, just showing that we could still support current behavior if necessary (which Puccini has to for 1.X parsing).

lauwers commented 1 year ago

I second Paul's proposal.

tliron commented 1 year ago

Just pointing out that if you want to use aliases, then for 3-character units, e.g. KiB, you would need 2³=8 aliases, which is annoying to write out. Case insensitivity is easy enough to implement as a feature.

But anyway I'm in favor of enforcing case sensitivity. Of course with user-defined units, users can do whatever they want.
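The alias count can be checked with a short Python sketch. The helper names `case_variants` and `with_case_aliases` are hypothetical, chosen only for illustration. The second helper also shows one way case-insensitive matching can go wrong: two symbols that differ only in case (such as milli ‘m’ and Mega ‘M’) end up claiming the same alias.

```python
from itertools import product


def case_variants(symbol: str) -> list[str]:
    """Return every upper/lower-case spelling of a unit symbol."""
    # Each character contributes its lower and upper form (one form if identical).
    choices = [sorted({ch.lower(), ch.upper()}) for ch in symbol]
    return ["".join(combo) for combo in product(*choices)]


def with_case_aliases(units: dict[str, int]) -> dict[str, int]:
    """Expand a symbol table with all case variants, rejecting conflicts."""
    aliases: dict[str, int] = {}
    for symbol, multiplier in units.items():
        for variant in case_variants(symbol):
            # setdefault keeps the first multiplier seen for this spelling.
            if aliases.setdefault(variant, multiplier) != multiplier:
                raise ValueError(f"alias {variant!r} is ambiguous")
    return aliases


kib_aliases = case_variants("KiB")  # 8 spellings for a 3-letter symbol
```

For example, a table in milliseconds containing both `"ms": 1` and `"Ms": 10**9` cannot be expanded this way, because `ms` and `Ms` would collide.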

pmjordan commented 1 year ago

Here's a syntax suggestion:

<scalar_unit_name>:
    version: <version_number>
    metadata:
        <map of string>
    description: <datatype_description>
    # validation clause is implicit; must be a data_value + white space + unit_symbol_string
    data_value_type: <data_type_name> # only data types derived from either integer or float are permitted
    unit_symbol: <map of unit_symbol_name>

<unit_symbol_name>:
    unit_symbol_string: <string>
    unit_symbol_multiplier: <integer or float>

and an example:

data_types:
    non_negative_integer:
        derived_from: integer
        validation: { $greater_or_equal: [ $value,  0 ] }

scalar_units:
    scalar-unit.bitrate:
        version: 2.0
        description: bitrate as defined as an additional unit in ISO 80000 but not including prefixes above 10^12
        data_value_type: non_negative_integer
        unit_symbol:
            "B":   1
            "kB":  1000
            "KiB": 1024
            "MB":  1000000
            "MiB": 1048576
            "GB":  1000000000
            "GiB": 1073741824
            "TB":  1000000000000
            "TiB": 1099511627776

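As a rough illustration of how a processor might consume such a definition, the Python sketch below normalises a scalar value to the base unit using the symbol table from the example. It assumes values are written as a number, white space, then the unit symbol (as in existing TOSCA scalar-unit values such as `10 GiB`), that matching is case sensitive, and that the value type is a non-negative integer. The names `EXAMPLE_UNITS` and `to_base_units` are hypothetical.

```python
import re

# Symbol table copied from the example definition above.
EXAMPLE_UNITS = {
    "B": 1,
    "kB": 1000,
    "KiB": 1024,
    "MB": 1000000,
    "MiB": 1048576,
    "GB": 1000000000,
    "GiB": 1073741824,
    "TB": 1000000000000,
    "TiB": 1099511627776,
}

# Assumed layout: non-negative integer, white space, unit symbol.
_SCALAR = re.compile(r"^\s*(\d+)\s+(\S+)\s*$")


def to_base_units(text: str, unit_symbols: dict[str, int]) -> int:
    """Convert e.g. '10 GiB' to a count of base units."""
    match = _SCALAR.match(text)
    if match is None:
        raise ValueError(f"not a valid scalar value: {text!r}")
    value, symbol = match.groups()
    if symbol not in unit_symbols:  # case-sensitive lookup
        raise ValueError(f"unknown unit symbol: {symbol!r}")
    return int(value) * unit_symbols[symbol]
```

Because the lookup is case sensitive, `to_base_units("1 kib", EXAMPLE_UNITS)` is rejected unless `kib` has been declared explicitly as an alias.
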
pmjordan commented 1 year ago

While the above syntax is relatively simple and probably adequate it does have some shortcomings:

lauwers commented 1 year ago

Thanks @pmjordan. We'll make this a discussion topic for the Language Ad-Hoc once we finish substitution/requirement mappings.

pmbruun commented 10 months ago

I concur that units should be case-sensitive.

For time-units, TOSCA 1.3 only supports from nanoseconds (ns) up to days (d), and the proposed scheme would not work for weeks, months or years. HPE SD supports scheduling, deadlines and durations longer than days, even taking into account individual business calendars, work-hours, vacations, holidays, and of course time-zones. My conclusion is that these would need some orchestrator built-in interpretation.

So when we open up for custom extensions to the TOSCA 1.3 scalar units, we need to allow built-in units that do not follow the symbol-multiplier scheme.

lauwers commented 10 months ago

> So when we open up for custom extensions to the TOSCA 1.3 scalar units, we need to allow built-in units that do not follow the symbol-multiplier scheme.

I implement these types using a multiplier table that translates the unit specifier into a numerical multiplier. If we made such a table part of the grammar, then we could allow for custom scalar unit types.
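A minimal Python sketch of such a multiplier table for the time symbols mentioned in this thread (ns up to d), with nanoseconds assumed as the base unit. The names `TIME_UNITS_NS`, `duration_ns` and `month_length_ns` are hypothetical and do not come from any TOSCA implementation. The last helper illustrates why calendar units such as months do not fit a fixed multiplier: their length depends on the date.

```python
import calendar

# Multipliers relative to one nanosecond (assumed base unit for this sketch).
TIME_UNITS_NS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
}


def duration_ns(value: int, symbol: str) -> int:
    """Translate a value and unit symbol into nanoseconds via the table."""
    return value * TIME_UNITS_NS[symbol]


def month_length_ns(year: int, month: int) -> int:
    """Length of one specific calendar month; it varies, so no single multiplier exists."""
    days_in_month = calendar.monthrange(year, month)[1]
    return duration_ns(days_in_month, "d")
```

February 2023 and February 2024 give different results from `month_length_ns`, which is the kind of case that would need built-in interpretation rather than a table entry.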