COVESA / vehicle_signal_specification

Vehicle Signal Specification - standardized way to describe automotive data
Mozilla Public License 2.0
326 stars 168 forks source link

Support of structs (in arrays) #326

Closed erikbosch closed 9 months ago

erikbosch commented 3 years ago

When investigating possible VSS-representation for project-specific use-cases I quite often find use-cases where the "source specification" use arrays of structs. A hypothetical example could be a list/array of deliveries to be done. In VSS as of today we support arrays on simple types like uint8[] and we support instances on branches, but we have no real mechanism to define structured types and to define arrays on them. Branches with instances does not really provide any solution if the array can have an arbitrary size.

Have there been any discussion to introduce some form of type definition for structs, possibly like below? Do we see a need for it?

DeliveryInfo:
  type: struct
  description: A struct type containing info for each delivery

DeliveryInfo.Address:
  datatype: string
  type: item
  description: Destination address

DeliveryInfo.Receiver:
  datatype: string
  type: item
  description: Name of receiver

DeliveryList:
  datatype: DeliveryInfo[]
  type: sensor
  description: List of deliveries
UlfBj commented 3 years ago

This interesting issue addresses questions related to the domain definition of the "Vehicle tree", and whether VSS should concern itself with other trees than the Vehicle tree.

Are there any "boundaries" to what data that may be added to the Vehicle tree?

Currently VSS only concerns itself with the "Vehicle tree", but the "VSS model" has no such limitations, and could very well be used to declare trees for other domains. Is that something that VSS should support? If so, should these trees be developed within VSS, or be developed by other communities?

oliveirarleo commented 3 years ago

I like this approach very much. This type: struct is similar to a branch with aggregate: true, so I wonder if having datatype: AnyBranch[] could be a simpler and more general solution to this (and many other) problems.

barbieri commented 3 years ago

in a way, part of it relates to https://github.com/GENIVI/vss-tools/issues/89

but in our use case, we also would have use to replace some of instances with actual arrays. Say Row[1,4] that expands to Row1, Row2, Row3 and Row4, in GraphQL we'd love to have that as rows: [RowTypeHere], each with some implicit id (or instance attribute) ranging from 1-4.

cc: @oliveirarleo, @g7fernandes

gunnarx commented 3 years ago

addresses questions related to the domain definition of the "Vehicle tree", and whether VSS should concern itself with other trees than the Vehicle tree.

I feel the original question is probably valid also if we stay within the Vehicle domain, but agree that other domains might add additional pressure to increase the feature set of the model.

So this is veering off topic a bit, but I'd like to give my opinion/answers:

... but the "VSS model" has no such limitations, and could very well be used to declare trees for other domains. Is that something that VSS should support?

Yes. I've brought this up before and we have been discussing around this issue a few times. In my opinion, the (meta)/model used for VSS can be recommended for other domains, in combination with VSSo/ontology approaches where they make sense.

But I think the logical thing is to name the underlying model behind that to something else, like Extensible Data Model, or similar, and let the VSS name remain for Vehicle related information (it's in the name...). Other domains might possibly need a little different metadata, and we might then say it is formally a different model (I guess depending on definition of the word and at which abstraction level you are thinking). But if this theoretical ExDM gets defined, then VSS model would then be a kind of "instance" of that higher level data modeling principle. (That sounds very formal but of course I realize we are still talking about a very simple modelling principle here. Part of its power is its simplicity...)

With input from you and others, this is taking shape and I have something brewing in my head. I feel now is the time to write up a proposal...

If so, should these trees be developed within VSS, or be developed by other communities?

All we can, and should do, IMHO is to propose this ability and help to organize the work. Communities get formed around these issues as they are needed, and they may include some overlap of people/companies. It's already happening, for example we already have ongoing cooperation with the insurance domain, which is going in this direction.

Final point

Are there any "boundaries" to what data that may be added to the Vehicle tree?

In combination with the above, it would then not hurt to discuss and see if we can write down a few guidelines about what belongs in the "standard VSS tree".

UlfBj commented 3 years ago

So this is veering off topic a bit

Yes it does, so maybe we should make it a separate issue? If we do not address this, I fear there will be a continuous scope creep that will not be to the advantage of VSS.

SebastianSchildt commented 2 years ago

Discussed again in call. Currently not "elegant" or easy to use/generate it with existing data model. Questions: How could VSS support? Do we want it? Or else: How does VSC fit.

kkoppolu1 commented 2 years ago

Structured data modeling is an area for which we are exploring VSS. We encounter structured data in the ADAS domain running ROS2 as an example. Having the ability to model structured data will enable us to model ROS2 .msg "messages" within the VSS tree.

For example, a hypothetical model of an obstacle list detected by the ADAS system could be,

ObstacleClass
{
    uint64 class
}

Obstacle 
{
    uint64 id
    ObstacleClass class
    float64 probability
}

ObstacleList 
{
    Obstacle[] obstacles
}

The requirement of modeling ROS2 messages is mentioned in https://github.com/COVESA/vehicle_signal_specification/issues/414 as well.

SebastianSchildt commented 2 years ago

It is obvious to me, that we really do need an answer to the question of more complex data structures, that is better/more specific than just “use VSC someday”.

I feel is valuable to keep (large part of) the standard catalogue consisting of mainly simple signals, but I think in usage it will be common to just want a little more.

I have the following asusmptions

I see two basic ways forward:

1. Extend /(mis)use the existing specification

We already have the (not really used) aggregate keyword. That is a start but not enough As a toy example consider the one given by @kkoppolu1 (my examples might not be syntactically correct nor complete, but you get the idea)

ADAS.Lidar.Front
    type: branch

ADAS.Lidar.Front.ObstacleList
    type: complexsensor[]   #this is like a branch but a sensor

#include ADAS/obstacle.vspec ADAS.Lidar.Front.ObstacleList

in ADAS/obstacle.vspec:

id:
    datatype: uint64

probability:
    datatype: float

Advantages:

ADAS.Lidar.Front
    type: branch

ADAS.Lidar.Front.ObstacleList
    type: sensor  
    datatype: jsonchema://datatypes/ObstacleList.json

or

ADAS.Lidar.Front.ObstacleList
    type: sensor  
    datatype: proto://datatypes/ObstacleList.proto

or

ADAS.Lidar.Front.ObstacleList
    type: sensor  
    datatype: ros2://datatypes/ObstacleList.msg

with e.g.

ObstacleList.json

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "items": [
    {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer"
        },
        "probability": {
          "type": "number"
        }
      },
      "required": [
        "id",
        "probability"
      ]
    },
    {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer"
        },
        "probability": {
          "type": "number"
        }
      },
      "required": [
        "id",
        "probability"
      ]
    }
  ]
}

or ObstacleList.msg

Obstacle[] obstaclelist

and Obstacle.msg

int64 id
float64 double

or Franka, ARXML, etc.

Advantages

Neutral

Any input/opinions?

My 2ct: I would prefer solution 2. For 1 or 2 technologies we should spell out how it is expected to work/look like (e.g. ROS2 because everbody uses it, and jsonschema because it fits W3C stacks, but really anything goes).

Then you still have a strong standard catalogue (only using primitve datatypes), that you can expect to work with "any stack" anywhere, and you have some "endorsed" IDLs that one or the other implementation might be inclined to support. If you add your own IDL, you are probably on your own tool-wise, but at least you have a sueful "extension point" and did not "break" VSS (just like extending private signals / units))

kkoppolu1 commented 2 years ago

As supporting material to Option 1 (Extend the existing VSS specification), I have a proof-of-concept commit that demonstrates how we can use "struct" and "item" keywords suggested by @erikbosch (first comment on the issue ticket) to model complex types in VSS.

https://github.com/kkoppolu1/vss-tools/commit/342c94608e48a43df9d8aaf9bf2160391621b931

The commit includes updates to

Following are my thoughts/inputs on the pros/cons of each approach.

Option 1 (Extend VSS)

Advantages:

  1. VSS defines its own primitive types. Extending VSS to support complex/aggregate types is a natural extension of the specification
  2. The specification is self-contained and does not rely on an external specification integration
  3. We have vss-tools that convert vspec to standard specifications such as JSON, protobuf etc. In a similar fashion, we can build tools/parsers to translate from standard IDL formats to vspec.

Drawbacks:

  1. The specification may not be able to model all types possible in other IDLs.
    But as long as we are able to model complex types with the following characteristics, we will solve most use cases:

    Every item in a struct can be

    1. A primitve data type
    2. Another struct whose data type is defined
    3. An array of primitive data type
    4. An array of another struct whose data type is defined

Option 2 (Reference existing IDLs)

Advantages:

  1. Extensibility in terms of supporting standard IDL specifications

Drawbacks:

  1. There is a chance that the primitive data types in external IDL specifications will mismatch with the primitive data types within VSS. For example, protobuf has some fixed length types and bytes type that are not available in VSS. Similarly, JSON schema type system is different from that of VSS.
  2. Subsequently, VSS primitives will have to maintain parity with the primitives of whatever IDLs we want to support, adding to the effort and complexity of supporting additional IDLs
  3. Would we allow more than one IDL specification in a tree? For example, can a user use protobuf data type for one sensor, jsonschema for another, etc. ? If not, we would need additional validation to ensure only IDL specification is used across the VSS tree
  4. Relying on external IDL specifications to completely parse the VSS tree. (VSS will no longer be self-sufficient). vss-tools and any custom VSS parsers out there will now have to rely on these external specification tools as well.
SebastianSchildt commented 2 years ago

@kkoppolu1 thanks for making a working example. Have not yet played with code, but in the checked in example .vspec

https://github.com/kkoppolu1/vss-tools/blob/342c94608e48a43df9d8aaf9bf2160391621b931/Struct.vspec

/* [...]*/

Vehicle.ADAS.Lidar.ObstacleList:
    type: struct

Vehicle.ADAS.Lidar.ObstacleList.Data:
    type: item
    datatype: Vehicle.ADAS.Lidar.Obstacle[]

Vehicle.ADAS.Lidar.Front:
    type: branch

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: sensor
    datatype: Vehicle.ADAS.Lidar.ObstacleList

Is there a technical reason why it is not just

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: sensor
    datatype: Vehicle.ADAS.Lidar.Obstacle[]

the indirection over ObstacleList seems redundant to me in this case?

kkoppolu1 commented 2 years ago

It can very well be modeled in the way you described @SebastianSchildt. I created it this way for demonstrating defining a data type that contains an array of another data type. Such indirection may be useful when you want to extend the struct data type later by adding more items.

Both cases are valid:

  1. A sensor whose data type is an array of user-defined struct data type
  2. A user-defined struct data type containing an array of another user-defined struct data type.
SebastianSchildt commented 2 years ago

Discussion: Do we really want to inter-mingle datatype definition into the same tree (at least in output) as the VSS data model itself?

SebastianSchildt commented 2 years ago

Everybody who has stakes should look into this and comment. Open questions are

Maybe we shall have a seperate meeting?

felix-loesch commented 2 years ago

I think we need to differentiate the following two Topics:

  1. Adding entity (struct) datatypes allows the new entities that group properties together. This makes sense in my opinion if those entities are vehicle signal related.
  2. Misusing VSS as a general purpose modeling language by allowing the definition of arbitrary entities and properties outside the vehicle domain. This will lead to the definition of yet another modeling language. This we should not do for the following reasons:
    • There already exist good genera purpose modeling languages such as W3C RDF, RDFS and OWL which provide a large feature set and can be used. No need to reinvent the wheel
    • By allowing the definition of arbitrary domains the purpose of having a vehicle signal spec is defeated.
UlfBj commented 2 years ago

In the current model "type" represents what I think can be referred to as "data sources", except for "branch" which is a construct to support tree structures. In the proposal above "type" is extended with "struct" and "item", which I think is rather "datatype" metadata. The leads in my view to loss of clarity of the model. The parsing of the tree becomes more complex as there will be dependencies between nodes, and it is unclear what will be the result of accessing some of the nodes.

I would like to propose an alternative design that extends the "datatype" with the values "struct" and "struct[]", which when used MUST then be accompanied in the same node with the "struct" definition.

An attempt to apply it to the example from above could look something like below.

Vehicle.ADAS.Lidar.ObstacleList:
    type: sensor
    datatype: struct[]
    struct:
        id: int64
        probability: float
    ....

The datatype definition of a node is contained within that node. The "type" metadata stays unchanged. It should be possible to extend the model above so that nested structs would be supported. Syntax proposals are welcome. VISSv2 accessing of node data would not need any extensions, and payload encoding would be straightforward, {....,"data":{"path":"Vehicle.ADAS.Lidar.ObstacleList","dp":{"value":[{"id":"xx","probability":"yy"},..],"ts":"zz"},..}

adobekan commented 2 years ago

I guess we are again in the discussion what goes in the interface and what is the part of the model only. Or what do we have to model from a vehicle system and how at the end interface is realized.

If we go this way, next discussion will be how do we combine e.g. speed, position, wheel rotation and ambient temperature into custom message "myMessage" and can we model that in VSS?

I guess main motivation of VSS is how to have common setup around vehicle taxonomy/model and not messages or interfaces.

Something like this i would suggest


myLidar.vspec

Will be used later in ADAS Lidar

Obstacle:
    type: branch
     aggregate: true
## aggregate can be translated to struct or whatever you prefer in the tooling even custom made tooling

Id:
    type: sensor
    datatype: int64

Probability:
    type: sensor
    datatype: float

myMain.vspec

Vehicle.ADAS.Lidar:
    type: branch

Vehicle.ADAS.Lidar.Front:
    type: branch

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: branch
    #include myLidar.vspec Obstacles

Then you introduce for your deployment setup specific config file or use layers, basically something that can help you with your specific IDL challenge for your deployment and usage of vss.

This might help you to generate some additional code automatically.

mycustom.depl

vehicle.adas.lidar.front.obstacles
    cardinality: many -> goes into repeated e.g
    type: struct, class, list whatever you find useful 
    ....

or you just generate granulated protobuf messages from VSS and rest of the code you write in your main protofile

-> message myCustomMessage( repeated VehicleAdasLidarObstacles myCustomData = 1)

SebastianSchildt commented 2 years ago

I took another stab at the "reference IDL" variant, that does not require hacking of tools

This takes advantage of the fact, that VSS already supports uint8[] as datatype and all existing tech stacks are expected to work with it.

The example would look like this

ADAS.Lidar.Front
    type: branch

ADAS.Lidar.Front.ObstacleList
    type: sensor  
    datatype: uint8[]

Then we may have an overlay myprotostack-overlay.vspec

ADAS.Lidar.Front
    type: branch

ADAS.Lidar.Front.ObstacleList
    type: sensor  
    protomsg: ObstacleListMessage

You might generate

python vspec2json.py ../spec/VehicleSignalSpecification.vspec -e protomsg -o myprotostack-overlay.vspec.vspec --json-pretty out.json

which will generate a JSON as before, and if you load it in e.g. a VISS stack such as KUKSA it will just deal with a uint8[] datapoint, no modifications needed.

You also might imagine a .protpo tooling, that is called the same way

python vspec2proto.py ../spec/VehicleSignalSpecification.vspec -e protomsg -o myprotostack-overlay.vspec.vspec --json-pretty out.json

That will express the VSS model in terms of protobuf and rolling the messages referenced by protomsg directly into the model

You could build similar things for other stacks/IDLs of course

Advantages

Disadvantages

Neutral

SebastianSchildt commented 2 years ago

I took the time to summarise/collect all proposals in our wiki

https://github.com/COVESA/vehicle_signal_specification/wiki/Complex-data-structures-in-VSS-based-systems

Please modify/add if I forgot something or did not get it right (if you have n Wiki access rights, just post here, maybe laso do it if you do change the wiki, so people notice).

This is not to say we should choose one of the approaches as is (as also not all are detailed the same way, and there might be corner cases), but I found concrete suggestions most helpful for this discussion,

danielwilms commented 2 years ago

Basic question is how to define interface definitions and where to place modifications. The following image argues, that the standard catalogue is free of custom needs of potential use of the standard catalog. It is the basic contract, and allows structural flexibility, while sticking to the definitions. The flexibility is defined outside the standard catalog. structs drawio

kkoppolu1 commented 2 years ago

I updated the wiki sub page with a note comparing branch-based modeling and IDL based modeling. https://github.com/COVESA/vehicle_signal_specification/wiki/Proposal-%22struct%22:-%22Weak%22-layer-reference-to-external-IDLs#comparison-with-modeling-using-branch

kkoppolu1 commented 2 years ago

I updated the wiki sub page with disadvantages and neutral stance points - https://github.com/COVESA/vehicle_signal_specification/wiki/Proposal-%22struct%22:-Just-model-as-branches,-the-rest-is-deployment

UlfBj commented 2 years ago

With the assumption that most cases of need for complex data types would be covered by support of struct and struct[] data types, where the struct fields are of simple data types, then I think a combination of the solution I proposed above, without support for nested structs, together with Sebastians proposal using uint8[] and overlay for the cases where nested structs are needed, or of abundant definition reuse, would be a balanced solution.

kkoppolu1 commented 2 years ago

Created a proof-of-concept commit for the approach outlined in https://github.com/COVESA/vehicle_signal_specification/wiki/Proposal-%22struct%22:-%22Weak%22-layer-reference-to-external-IDLs

kkoppolu1 commented 2 years ago

Created a proof-of-concept commit that utilizes branches to define types. The array support for branches is added via the already available arraysize property.

erikbosch commented 2 years ago

Meeting notes:

Discussion to be continued on next meeting. Please check linked commits above.

erikbosch commented 2 years ago

Meeting notes:

@kkoppolu1 to come up with examples on how to use it.

erikbosch commented 2 years ago

To make sure that I remember the old comments above; was the intention to use the syntax as described below.

Example without fixed size, there can be 0..* obstacles.

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: branch[]
#include ObstacleData.vspec Vehicle.ADAS.Lidar.Front.Obstacles

Example with fixed size, there must be 10 obstacles.

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: branch
    arraysize: 10

#include ObstacleData.vspec Vehicle.ADAS.Lidar.Front.Obstacles

Did we (or do we need to) say how individual instances in the array can be fetched (if supported by the stack/API), E.g. that the first instance shall be considered to have index 1, i.e. if arraysize is 10 indexes 1..10 are used?

kkoppolu1 commented 2 years ago

I think we wanted to use [] in both cases and specify arraySize for fixed size arrays.

Like so -

Dynamic:

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: branch[]
#include ObstacleData.vspec Vehicle.ADAS.Lidar.Front.Obstacles

Fixed:

Vehicle.ADAS.Lidar.Front.Obstacles:
    type: branch[]
    arraysize: 10
#include ObstacleData.vspec Vehicle.ADAS.Lidar.Front.Obstacles
ramnanib commented 2 years ago

I’m not finding it intuitive to use the branch[] syntax to refer to an array of objects under the node. VSS is essentially a tree data structure, which is a graph. When we specify a node to be a branch, we’re saying this node has multiple edges, with a node at the end of each edge. Then what does branch[] exactly mean ? To me, the earlier suggestion of using the type struct is more intuitive. Adding one constraint that a struct type can not also be a branch. It’s a leaf node, just like sensors and actuators. I understand it turns VSS into an IDL, but I think it already partially is one, because it has datatypes.

Vehicle.ADAS.Lidar.Front.ObstacleList:
    type: struct[] // or message[] ?
Vehicle.ADAS.Lidar.Front.Obstacles.Obstacle:
    type: struct // or message ?
Vehicle.ADAS.Lidar.Front.Obstacles.Obstacle.Id:
    type: sensor
    datatype: uint32
Vehicle.ADAS.Lidar.Front.Obstacles.Obstacle.Probability:
    type: sensor
    datatype: float
erikbosch commented 2 years ago

Meeting notes: Bhushan presented his comment above. Important to get consensus so we can move forward. To be discussed in a meeting, date for meeting to be agreed on Slack

erikbosch commented 2 years ago

Same example as on top, but in form of an anonymous struct (I.e. struct content defined inline) Could be an alternative notation if you do not see any need to reuse the struct definition.


DeliveryList:
  datatype: struct[] /* If "struct" or "struct[]" is mentioned, it expects items to be defined inline below */
  type: sensor
  description: List of deliveries

DeliveryList.Address:
  datatype: string
  type: item
  description: Destination address

DeliveryList.Receiver:
  datatype: string
  type: item
  description: Name of receiver
kkoppolu1 commented 2 years ago

Meeting Notes:

  1. We discussed the pros/cons of using branch vs struct
  2. The group converged on the initial struct approach laid out by @erikbosch (see comment)

For the reasons:

  1. More explicit without leaving complex types open to interpretation by deployment tooling
  2. Allows sensor/actuator nodes to take on a complex data type in a more readable manner
  3. Does not mix various node types under a complex type (as with the branch approach)

As mentioned before, the PR for introducing the change should include

  1. the spec extension proposal
  2. standard tools update
  3. documentation
  4. examples
ramnanib commented 2 years ago

We also need to discuss and document best practices on how to model data as a branch versus a struct, as part of this development effort.

jdacoello commented 1 year ago

We also need to discuss and document best practices on how to model data as a branch versus a struct, as part of this development effort.

From my perspective, the definition of vehicle-related data points in the specification must be atomic (as it is now). The specification is responsible for the definition and description of what each property means. The proposal of a struct would enable the grouping of multiple properties. This is partially done with the branches already. I see a potential risk of users trying to use the struct construct to group already defined VSS data points, and this intention must be avoided. The scope of a struct should be limited to define group of properties that belong together and that are not already defined in the VSS tree. In other words, no cross-reference to already-existing VSS data points should be possible in the struct.

With that consideration, I think we will soon face the limitation of a struct. In summary, I see the following implications (correct me if I am wrong):

erikbosch commented 1 year ago

A typical example that often comes up is GNSS data - today specified in Vehicle.CurrentLocation branch. There have been some proposals to move (at least some of) those signals to a new signal of struct type. Rationale is that you always want to treat lat/lon together, if you read/write them individually you might end up with a totally irrelevant location. But if we agree to that, then of course the old individual signals shall be removed, possibly after a deprecation period.

erikbosch commented 9 months ago

We now have struct support so I think this issue can be closed