rsmp-nordic / rsmp_schema

JSON Schema for automatically validating RSMP messages
MIT License

Simplify spec/doc flow? #24

Open emiltin opened 1 year ago

emiltin commented 1 year ago

We currently have four repos related to rsmp specs and documentation:

rsmp_schema and rsmp_specifications include all versions as folders, while rsmp_sxl_traffic_lights only includes the latest (current) version, and older versions are kept as git tags.

rsmp_schema is also a Ruby gem, which can be used to validate messages from Ruby code.

We should consider simplifying this repo structure and workflow. We could use a single repo, which would include all specs for both core and sxl's in all versions:

The repo would also be used for publishing to GitHub Pages.

The repo would also be a Ruby gem, so you can use it for validating messages from Ruby code. When publishing a Ruby gem, you can choose which files in the repo are included, so for example you can ignore all rst and html files and only include the relevant json schema and Ruby files.
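As a rough sketch of the file selection described above: a gemspec can set its file list explicitly, so documentation files never end up in the package. The gem name, paths and the `packaged_files` helper are placeholders made up for illustration, not the actual rsmp_schema gemspec.

```ruby
require "rubygems"

# Hypothetical helper: keep only what validation needs,
# i.e. JSON schema files and Ruby code.
def packaged_files(paths)
  paths.select { |path| path.end_with?(".json", ".rb") }
end

# Build an example specification from a list of repo files.
# A real gemspec would usually get the list from Dir.glob or `git ls-files`.
def example_spec(repo_files)
  Gem::Specification.new do |spec|
    spec.name    = "example_schema_gem" # placeholder name
    spec.version = "0.0.1"
    spec.summary = "Example: package only schema and Ruby files"
    spec.authors = ["example"]
    spec.files   = packaged_files(repo_files)
  end
end

# Illustrative repo contents (assumed layout).
repo_files = [
  "lib/example_schema_gem.rb",
  "schemas/core/3.1.5/rsmp.json",
  "docs/core/3.1.5/index.rst",
  "docs/core/3.1.5/index.html"
]

puts example_spec(repo_files).files
```

The rst and html files stay in the repo for publishing to the web, but they are simply not listed in the gem.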

All the rsmp spec converters from sxl_tools could also be included, as well as the yaml->json schema converter which is currently in the rsmp gem. (We can add a Ruby-based CLI if we like.)

I think this could simplify things and keep everything related to the rsmp specs and validation together. We would also avoid the need for triggering github workflow actions between repos.

Other points:

otterdahl commented 1 year ago

A few steps are needed before we can automate the yaml-rst convert and publish on the homepage.

emiltin commented 1 year ago

All the boxes above are checked. What are the next steps? I came back to this issue because I'm looking at how to set up repos for SXLs for VMSes and sensors, and now I wonder if I should just create folders in this repo.

otterdahl commented 1 year ago
* The rsmp simulator currently needs a special extended yaml format. It would be nice if it could use the original yaml source, with extra info in a separate file.

This has been done using a workflow in rsmp_simulator. The extra yaml file is currently kept in the sxl-tools repo, but it might be better to move it directly to the rsmp_simulator repo.

All the boxes above are checked. What are the next steps?

Sharing a single repo would simplify some parts of the flow, no doubt. But I wonder how some things are handled, for instance releases. GitHub supports creating a release based on a tag. The release can contain assets, for instance any binary file. This has been useful for publishing pdf, excel or executable files (in the case of the simulator). In contrast to artifacts created by workflows, release assets don't seem to expire after a while, and it's easy to share links to those files from the home page. If we share a single repo for core and tlc sxl, how should we think about releases?

emiltin commented 1 year ago

Good point to consider. Here's an attempt at an overview of how we're publishing via releases, pages and rubygems:

GitHub Releases (versioned)

GitHub Pages (not versioned):

  • rsmp-nordic.github.io: static rsmp nordic website

Gem (versioned):

The ability to validate rsmp messages is important to our rsmp gem and rsmp validator, and can also be a useful tool for others. You need access to both core and sxl schemas, in all versions, so that you can validate any message according to the version in use. Previously this was done by using git submodules to check out multiple versions of the core and sxl repos. But this was cumbersome to maintain and work with, so instead rsmp_schema now has the different versions as folders, for both core and sxl. This means the schema files and the gem can be in one repo and you don't need git submodules. You just fetch the gem and it has all the files needed for validation.
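To make the folder model concrete, here is a minimal sketch of how a validator could pick the schema that matches the version in use. The `<root>/<kind>/<version>/schema.json` layout and the helper names `index_schemas` and `schema_for` are assumptions for illustration; they are not the rsmp_schema gem's actual layout or API.

```ruby
require "json"
require "pathname"

# Index every schema.json under root by [kind, version], assuming a
# hypothetical layout of <root>/<kind>/<version>/schema.json.
def index_schemas(root)
  root = Pathname.new(root)
  pattern = File.join(root.to_s, "*", "*", "schema.json")
  Dir.glob(pattern).each_with_object({}) do |file, index|
    path = Pathname.new(file)
    kind, version = path.relative_path_from(root).each_filename.first(2)
    index[[kind, version]] = JSON.parse(path.read)
  end
end

# Look up the schema for the kind and version a message claims to use.
def schema_for(index, kind, version)
  index.fetch([kind, version]) do
    raise ArgumentError, "no #{kind} schema for version #{version}"
  end
end
```

Because every version is an ordinary folder shipped inside the gem, the lookup needs no git operations at runtime. The returned hash would then be handed to whatever JSON Schema validator is in use.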

We need to think about this a bit. Perhaps it is better to go back to keeping schema files in separate repos, but finding a smarter way to import them into the schema gem.

It makes a lot of sense to keep repos simple and focused on one thing: Core, TLC SXL, etc. But validation and documentation will often straddle all specs. The things that need access to all specs/schemas:

emiltin commented 1 year ago

An alternative to git submodules is git subtrees: https://github.com/git/git/blob/master/contrib/subtree/git-subtree.txt

The main difference is that a git submodule is a git construct that relies on special .git files. A subtree is just a wrapper that adds files from another repo directly to your repo, with a helper to update them when needed. But they are normal files in your repo and you can modify them if you like.

Another important difference is that when you clone the main repo, you get all the subtree files immediately (because they're just normal files), whereas with submodules, you need to run git submodule init and git submodule update to fetch the files from the other repos, which can be a problem for the rsmp_schema gem that needs to use the SXL schema files from other repos (Core, TLC, ...).

Perhaps subtrees would be a better way to pull the JSON schema files into the schema gem if we store the yaml and json schema files in separate core and TLC repos.

At the moment, we keep the different versions of a spec in separate folders, not branches. Newer versions include files from the older versions using json directives. It makes it a bit easier to keep an overview of all versions and fix issues that exist in all versions. If we use subtrees we could either keep that model, or split versions into branches.

otterdahl commented 9 months ago

I suggest a first step to simplify the spec/doc flow would be to integrate all versions of core and sxl into their respective 'master' branches (rather than relying on tags/branches). This makes management easier and it matches rsmp_schema.

We can also create a workflow to publish core/sxl directly from their repos rather than use a separate repo (rsmp_specification). That repo currently contains a workflow for each version, which is overly complex.

emiltin commented 9 months ago

The usual way to maintain versions is to use branches. I think we should try to follow that.

I think a basic question is whether we want to use the same versioning strategy for a spec, and the schema files for that spec. Say we release an sxl version 1.2. Later we find that we need to fix something in the json schema files for that version. Where does that update live, and how is it versioned?

One option would be to say that the schema files are part of the sxl release. So we do a minor version update and release 1.2.1 where the json files are updated.

Another option would be to say that schema files are separate and live somewhere else with their own versioning.

I prefer the first option.

As you state, the rsmp_schema repo currently keeps the different core and sxl versions in folders, instead of using branches. Instead of duplicating everything in each folder, later versions include/reference schema files from earlier versions. This makes it easier to fix things that relate to all versions, but also complicates things in other ways. Changing one version can have unintended consequences for other versions if you're not careful. So there are some downsides as well.
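The risk of cross-version side effects can be made visible with a small check. The sketch below assumes the include/reference mechanism is JSON Schema `$ref` with relative file paths (an assumption, the thread does not spell this out), and reports which schema files directly reference a given file. The file names and the helpers `collect_refs` and `direct_dependents` are hypothetical.

```ruby
require "pathname"

# Collect every "$ref" string found anywhere inside a parsed schema.
def collect_refs(node)
  case node
  when Hash
    node.flat_map do |key, value|
      key == "$ref" && value.is_a?(String) ? [value] : collect_refs(value)
    end
  when Array
    node.flat_map { |item| collect_refs(item) }
  else
    []
  end
end

# Return the files whose "$ref" entries resolve to `target`.
# `schemas` maps a repo-relative path to its parsed JSON content.
def direct_dependents(schemas, target)
  schemas.select do |file, schema|
    collect_refs(schema).any? do |ref|
      path = ref.split("#").first.to_s # drop any "#/fragment" part
      next false if path.empty?        # purely internal reference
      Pathname.new(file).dirname.join(path).cleanpath.to_s == target
    end
  end.keys.sort
end

# Illustrative data: 1.1 reuses a file from 1.0, and 1.2 reuses 1.1.
schemas = {
  "1.0/status.json" => { "type" => "object" },
  "1.1/status.json" => { "allOf" => [{ "$ref" => "../1.0/status.json" }] },
  "1.2/status.json" => { "$ref" => "../1.1/status.json" }
}
puts direct_dependents(schemas, "1.0/status.json")
```

Running a check like this before editing an older version shows which newer versions would silently change with it. With one branch per version there are no such references to track, at the cost of repeating fixes on each branch.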

The main reason I changed to the current model was that git submodules were hard to maintain. But perhaps git subtrees can solve this.

The current model also means that the schema files and the schema gem code can live in the same repo, which is a bit easier, but debatable in the long term. I don't like that an sxl is currently split between several repos: the schema stuff is in rsmp_schema, the rst and issues are in an sxl repo, and publishing to web is in a third. I would prefer to consolidate everything related to a specific sxl in its own repo, including yaml, json schema files, rst source and generated html versions. Same for core, except it has no yaml.

The rsmp_schema gem could then pull versions from the core and sxl repos using git subtrees, instead of git submodules.

It would mean going back to keeping versions of an sxl in branches instead of folders. Each branch would include the full json schema, instead of relying on includes from older versions. This also implies that if you find something that needs to be fixed in all versions, you need to modify each branch, which is a bit cumbersome, but arguably simpler than managing includes between versions.

To summarize it would look like this:

core

tlc_sxl

schema gem

When you change the schema files in a repo, you need to update the relevant rsmp_schema subtree.
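The update step could be scripted roughly as follows. This is only a sketch: the repo URLs, branch names and the `schemas/<name>/<branch>` prefix layout are invented for illustration. It builds one `git subtree` command per version branch, using `add` the first time and `pull` for later updates, and does not run anything itself.

```ruby
# Build the git command that adds or updates one subtree.
# action is "add" the first time and "pull" for later updates.
def subtree_command(action, prefix:, repo_url:, branch:)
  unless %w[add pull].include?(action)
    raise ArgumentError, "action must be add or pull"
  end
  ["git", "subtree", action, "--prefix=#{prefix}", "--squash", repo_url, branch]
end

# One subtree per spec version, each tracking a version branch.
def subtree_commands(action, sources)
  sources.flat_map do |source|
    source[:branches].map do |branch|
      subtree_command(action,
                      prefix: File.join("schemas", source[:name], branch),
                      repo_url: source[:url],
                      branch: branch)
    end
  end
end

# Hypothetical sources; URLs and branch names are placeholders.
sources = [
  { name: "core", url: "https://example.com/core.git", branches: ["3.1.5"] },
  { name: "tlc", url: "https://example.com/tlc_sxl.git", branches: ["1.1", "1.2"] }
]

# Print the commands; a real script would run each with system(*cmd).
subtree_commands("pull", sources).each { |cmd| puts cmd.join(" ") }
```

Each version branch lands in its own folder of the schema gem repo, so the gem still ships all versions as plain files even though the source repos use branches.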

Another option would be to go back to my original suggestion of keeping everything related to both core and sxls in a single repo, and find a way to publish artifacts. But I think versioning is an issue and it just seems to be too many things in the same repo; issues, artifacts, versions.

otterdahl commented 6 months ago

I think it is a good solution to use your first suggestion, i.e. keep the schema files as part of core/sxl, and the yaml source as part of the sxl repo.

It is also a good idea to use branches to manage versions instead of folders in the main branch. It is not unusual that we need to update something in an older spec (for instance broken links).

If we can generate the online version of core/sxl in their respective repos, then I think we might be able to remove the rsmp_specification repo.

A good first step might be to remove the github actions for generating the documentation from the rsmp_specification repo and move them to the sxl/core repos instead.

After that we can move the json schemas/yaml sources. Is that ok?

tlc_sxl

  • rst (or can we rely on yaml?)

We can't rely completely on yaml, since there are parts of the sxl which are not defined in the yaml file. But one of the rst files is generated from the yaml file: the one that contains the actual sxl. The other rst files contain supporting documentation such as definitions, coordination spec, examples, and so forth. The conversion is done using the yaml2rst script in sxl-tools. After that, the rst files are in turn used to generate the html and pdf versions.
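To illustrate the yaml-to-rst step in the simplest possible terms, here is a toy converter. It is not the yaml2rst script from sxl-tools, and the YAML shape (a top-level `statuses` map with a `description` per code) is a simplified assumption rather than the real SXL format.

```ruby
require "yaml"

# Toy conversion: turn each status into an rst section with a title,
# an underline of matching length, and the description as body text.
def statuses_to_rst(yaml_text)
  data = YAML.safe_load(yaml_text)
  sections = data.fetch("statuses").map do |code, item|
    [code, "-" * code.length, "", item.fetch("description"), ""].join("\n")
  end
  sections.join("\n")
end

# Illustrative input in the assumed, simplified shape.
sample = <<~YAML
  statuses:
    S0001:
      description: Signal group status
    S0002:
      description: Detector logic status
YAML

puts statuses_to_rst(sample)
```

The generated rst would sit next to the hand-written rst files (definitions, examples and so on), and the whole set is then built into html and pdf.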

emiltin commented 6 months ago

Alright, let's use that approach. Converting the schema files to a branch model will require a bit of work. Also the schema gem must be converted to fetch the schema files from the core/sxl repos via git subtrees.

emiltin commented 6 months ago

An alternative to using git subtrees (or submodules), is to package the core and sxl repos as ruby gems. This way they are versioned and packaged in a standardised way, and it's very easy to import/update them from other repos, e.g. the schema gem.

There are good arguments for this route, see e.g. https://medium.com/@porteneuve/mastering-git-submodules-34c65e940407#24d3.

Turning e.g. the TLC SXL repo into a Ruby gem is not very intrusive. You just add a few specific files, and might need to rename/move a few folders, and can then publish it as a gem. The current rsmp_schema repo is already published as a gem, so it's easy to import into e.g. the rsmp gem.

emiltin commented 6 months ago

An alternative to using git subtrees (or submodules), is to package the core and sxl repos as ruby gems.

We need different versions of the schema files to be present for validation, but you cannot fetch different versions of a gem at the same time using bundler (the gem package tool).

We could still go the gem route if we keep the different versions in folders instead of branches, as we do now, but I think I would prefer using branches. So that would indicate we should use git subtrees.

otterdahl commented 6 months ago

I've moved the github actions for generating the documentation from the rsmp_specification repo to the sxl/core repos. The rsmp_specification repo is kept for a short while to show a 404 with a link.