gazebosim / sdformat

Simulation Description Format (SDFormat) parser and description files.
http://sdformat.org
Apache License 2.0
163 stars 92 forks source link

naming conflicts in generated schema files (xsd) #632

Open FirefoxMetzger opened 3 years ago

FirefoxMetzger commented 3 years ago

The file world.sdf includes both model.sdf and state.sdf

https://github.com/ignitionrobotics/sdformat/blob/722b92c92e2b19c408f3d4846ea7fcb5dfa7ed58/sdf/1.8/world.sdf#L62-L68

state.sdf in turn includes model_state.sdf

https://github.com/ignitionrobotics/sdformat/blob/722b92c92e2b19c408f3d4846ea7fcb5dfa7ed58/sdf/1.8/state.sdf#L37

Both model_state.sdf and model.sdf define a <model> tag:

https://github.com/ignitionrobotics/sdformat/blob/722b92c92e2b19c408f3d4846ea7fcb5dfa7ed58/sdf/1.8/model.sdf#L2

https://github.com/ignitionrobotics/sdformat/blob/722b92c92e2b19c408f3d4846ea7fcb5dfa7ed58/sdf/1.8/model_state.sdf#L2

which in itself shouldn't be a problem, because they exist in different scopes, one inside <world> and the other inside <state>.

This include chain is reflected in the generated .xsd files; however, the includes are all placed at the top of their respective document, which makes both <xsd:element name='model'> tags appear on the same level inside world.xsd, i.e., the nesting of model_state's model is not preserved. This - naturally - leads to a conflict and in the current version model_state's <model> overwrites model's <model>, which is (very) undesirable.

I'm not 100% sure how to fix this, since I am not familiar enough with the XMLSchema spec, but I assume the solution is something along the lines of renaming state tags, introducing namespaces, or ensuring that model_state's model is included inside the state element (preferred, but not sure if feasible).

FirefoxMetzger commented 3 years ago

This appears to affect all elements that have an accompanying *_state.xsd, so model, light, and link,

FirefoxMetzger commented 3 years ago

This also affects <pose> defined in /sdf/v1.X/schema/types.xsd and <pose> defined in /sdf/v1.X/pose.sdf. This is a slightly different path than the X_state.sdf vs X.sdf files, but appears to have the same root cause and a similar result.

In this case (<pose>) I guess the solution is to remove the entry in types.xsd, since pose.sdf is meant to supersede it?

azeey commented 3 years ago

Yes, we have been trying to solve this issue in https://github.com/ignitionrobotics/sdformat/pull/535, but we got stuck and put it in the backlog for a while. In that PR, we tried to paste the included xsd element instead of referencing it to avoid the name collision, but that doesn't work because we have recursive relationships in models, i.e., a model can contain a nested model.

FirefoxMetzger commented 3 years ago

Thanks for the response @azeey . In my local clone, I implemented unique namespaces for each file with respective referencing. This produces a valid schema that is very similar to sdf, but it no longer describes existing sdf files, because tags in it should be namespaced as well (which they aren't). This kind of rules out the namespace avenue.

Judging from this and my half-knowledge of XMLSchema I am beginning to fear that the current SDFormat is no longer valid XML. At the same time, it should be possible to write a correct schema, as aliased elements can never occur within the same tag, which means that a valid schema should exist. More thinking is needed here.

that doesn't work because we have recursive relationships in models, i.e., a model can contain a nested model

Does this mean it would be okay to just have a single SDFormatSchema.xsd in the end?

azeey commented 3 years ago

Judging from this and my half-knowledge of XMLSchema I am beginning to fear that the current SDFormat is no longer valid XML. At the same time, it should be possible to write a correct schema, as aliased elements can never occur within the same tag, which means that a valid schema should exist. More thinking is needed here.

I still think there should be a way to generate the .xsd files from .sdf description files, but it will require more time and effort.

Does this mean it would be okay to just have a single SDFormatSchema.xsd in the end?

Yeah, that's what we were thinking. I think the only benefit of having multiple *.xsd files was that they can reference each other making the root.xsd a small file.

\cc @jennuine

FirefoxMetzger commented 3 years ago

A night of sleep does indeed do wonders.

Here is a proposal for a solution: Transpile each file individually and introduce a unique namespace for each file. During transpiling generate a complexType and an Element for each tag in the respective SDFormat. Replace <include>s with a tag for that element (in the parent namespace) using as type the complexType within the respective file's namespace. I haven't tested this solution yet, but if it works it should work for the general case and make SDFormat quite flexible/extensible.

For my specific situation (generating python bindings) I have a second solution, which is to again use namespaces but to not introduce complexTypes. Instead <include> tags are replaced with a <xsd:element ref="..." /> tag. This generates correct bindings; however, expects SDF tags to use the correct namespace, e.g. <model> would become <model:model> assuming that this is how the namespace is referred to inside root.xsd. What I can do (which is dirty, but works) is to then simply remove all the namespace checks from the stubs, and voila, I can parse SDFormat 🤷 . (With the exception of <pose> inside of <include>, see #633 )

FirefoxMetzger commented 3 years ago

Here is a proposal for a solution [...]

An update: I tried this and it seems to work. I (re-)wrote the ruby generation script in python - I'm more familiar with that language and hence faster - and I can now generate xsd which validates and which allows me to generate bindings from it. Are there internal unit tests for the generated xsd I can check against? It also addresses some of the other issues I raised.

If there is interest in this solution, I can contribute this generator upstream.

jennuine commented 3 years ago

Here is an integration test.

The solution also needs to work with nested models (e.g., nested_canonical_link.sdf) as well as a saved world from Gazebo classic. You can run the following command to check your generated xsd with another sdf model/world:

xmllint --noout --schema /path/to/generated.xsd /path/to/model.sdf
FirefoxMetzger commented 3 years ago

@jennuine thanks for that. Ill give it a shot once I am back from my vacation on Wednesday.

I checked it with the simulation world I am using and I didnt find any issues yet. It doesnt have nested models iirc, but I dont see them as a big issue for xsd ... maybe I am missing something.

Edit: my worlds dont have nested models, the xsd should support nested models.

jennuine commented 3 years ago

@FirefoxMetzger sounds good, hope you have a great vacation! I've updated the branch from https://github.com/ignitionrobotics/sdformat/pull/535 to include the two other cases that the solution needs to work with (in test/integration/schema_test.cc). Feel free to reach out if you have any issues or questions :smiley:

FirefoxMetzger commented 3 years ago

@jennuine I actually do have a question regarding this: I've noticed that the entire CI is failing because I am using dataclasses which have been introduced in python 3.7; however, the toolchain seems to use an older version of python.

Since Python 3.6 will stop receiving security updates in about 5 months (full end-of-life), would it make sense to upgrade the toolchain? How involved would this be?

Alternatively, I can remove the dataclasses dependency and use normal classes in the build script instead. Please advise :)

jennuine commented 3 years ago

@azeey is more familiar with this and can provide some guidance.

azeey commented 3 years ago

I believe our plan is to support Edifice on Ubuntu Bionic, which comes with python 3.6. Changing to a different version would require significant changes to how we distribute libsdformat, so I don't think it's feasible. I was even leery of switching to python thinking it would introduce a new dependency, but it turns out we already depend on python.

FirefoxMetzger commented 3 years ago

Okay, then I can remove the dataclasses dependency.

I was even leery of switching to python thinking it would introduce a new dependency, but it turns out we already depend on python.

I was thinking about that as well when I started this. Initially, I was refactoring the ruby script, but then I realized that the changes are so major that starting from scratch would be faster than continuing the refactor. I then noticed the python scripts in the repo and figured I might as well use the language I know better.

If you are worried about maintainability and think it would be better to keep things in ruby that's fine with me as well. My motivation for having working and up-to-date XSD was because I wanted to use the XSD to generate python bindings (and get IntelliSense suggestions + static type checking in my scripts). The bindings will live in ropy and I am happy to place the XSD generation script there, too. It makes no difference to me as long as I get valid/correct SDFormat XSD 😄 . At the same time, it shouldn't be too hard to translate the python script back to ruby; although this is something that I won't be willing to contribute.

@azeey Just let me know what you think is the best way forward and then we can take the correct next steps :)

azeey commented 3 years ago

I don't think maintainability would be an issue, so please go ahead with python. And thanks for the contribution!!

scpeters commented 6 months ago

I've started work on renaming the state tags (//state/model to //state/model_state, link to link_state, joint to joint_state, etc) in https://github.com/gazebosim/sdformat/pull/1377. It's a work-in-progress as the Converter isn't fully handling conversion of elements from 1.11 to 1.12, but it's a good chunk of the way there.