Automate creation and publication of domain model documentation

scp93ch commented 2 weeks ago

When a new release is made, the CI pipeline should cause the documentation to be generated and published.

Documentation is created using csv2doc. It is published at e.g. https://spyderisk.org/documentation/knowledgebase/network/v6a3-1-4 (though that path can be changed as necessary).

One approach would be to have the CI run csv2doc and generate a documentation package alongside the domain model package, then trigger a webhook on the server hosting the docs. The webhook would download and deploy the documentation package and update the /latest symlink as necessary.

panositi commented 6 days ago

First steps: update the workflows to:

check out the csv2doc repository
install graphviz
set up a Python venv
run generate_and_show to build the HTML pages

The generated pages are written to the build directory, and generation takes a few minutes to complete. The generated pages total ~3.5GB; a compressed archive of that directory is ~0.5GB.

Before archiving the documentation, the build folder should be renamed to a versioned name, preferably one without spaces.
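A minimal sketch of that rename-and-archive step, assuming the CI can run a small Python script after csv2doc finishes. The function names, the `domain-version` naming scheme and the choice of a `.tar.gz` archive are illustrative assumptions, not anything csv2doc provides:

```python
import re
import shutil
from pathlib import Path


def versioned_name(domain: str, version: str) -> str:
    """Build a folder name without spaces, e.g. 'network-v6a3-1-4' (assumed scheme)."""
    raw = f"{domain}-{version}"
    # Collapse whitespace and any other unusual characters into single hyphens.
    return re.sub(r"[^A-Za-z0-9._-]+", "-", raw).strip("-")


def package_docs(build_dir: Path, domain: str, version: str) -> Path:
    """Rename build_dir to the versioned name and create a .tar.gz next to it."""
    target = build_dir.with_name(versioned_name(domain, version))
    build_dir.rename(target)
    archive = shutil.make_archive(
        base_name=str(target),        # archive path without the extension
        format="gztar",
        root_dir=str(target.parent),
        base_dir=target.name,         # keep the versioned folder as the archive root
    )
    return Path(archive)
```

Keeping the versioned folder as the single root of the archive means whatever unpacks it on the server does not need to be told the version separately.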

Publishing the documentation should be done via a webhook: a POST request carrying a secret token, so that unauthorized callers cannot use the endpoint. The webhook endpoint should be able to unpack the documentation archive and put it in the right place, preserving the versioned root path.
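As a rough sketch of what the server side of that webhook might do once it has received the archive, assuming a `.tar.gz` with a single versioned root folder. All function names here are hypothetical, the HTTP handling itself is left out, and the `latest` link follows the suggestion in the opening comment:

```python
import hmac
import os
import tarfile
from pathlib import Path


def token_is_valid(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(presented.encode(), expected.encode())


def deploy_archive(archive: Path, docs_root: Path) -> Path:
    """Unpack the archive under docs_root, keeping its versioned root folder."""
    root_resolved = docs_root.resolve()
    with tarfile.open(archive, "r:gz") as tf:
        members = tf.getmembers()
        roots = {m.name.split("/", 1)[0] for m in members}
        if len(roots) != 1:
            raise ValueError("archive must contain exactly one top-level folder")
        for m in members:
            # Only plain files and directories are expected in generated docs.
            if not (m.isfile() or m.isdir()):
                raise ValueError(f"unexpected member type: {m.name}")
            # Refuse entries that would escape docs_root (e.g. '../x').
            if not (docs_root / m.name).resolve().is_relative_to(root_resolved):
                raise ValueError(f"unsafe path in archive: {m.name}")
        tf.extractall(docs_root)
    return docs_root / roots.pop()


def update_latest(docs_root: Path, version_dir: Path) -> None:
    """Point docs_root/latest at version_dir by renaming a temporary symlink."""
    tmp_link = docs_root / "latest.tmp"
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    tmp_link.symlink_to(version_dir.name)   # relative link within docs_root
    os.replace(tmp_link, docs_root / "latest")
```

Replacing the symlink by rename rather than delete-and-recreate avoids a window in which /latest does not exist. Whether /latest should move on every release is a separate decision.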

mike1813 commented 6 days ago

It is important that the generated documentation is published at a URL that incorporates the domain model reference and version number. We don't want to have documentation published only for the latest version.

If we want to provide a link from system-modeller, we would need to ensure it can find the correct URL.

mike1813 commented 6 days ago

Following up a discussion with @wwaites, we should also be using the correct RDFS mechanisms to get the URL. Specifically:

we should be adding an rdfs:isDefinedBy property that specifies a URI that is actually a URL accessible on the Internet
we should be using HTTP headers to request the appropriate content from that URL, e.g., human readable HTML (if accessed by a browser), or machine-readable RDF (if accessed by a reasoner).

To get the first point right, we would need CSV2NQ to add the rdfs:isDefinedBy property, based on the domain URI and domain model version string. The first is included in a given version (CSV serialisation) of a domain model, but the second is not - it gets inserted via a command line argument by the CSV2NQ program.
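To make the first point concrete, here is a small sketch of how such a statement could be assembled from a base URL, a domain name and a version string. It is not CSV2NQ code: the function names are hypothetical, the URL layout is copied from the example documentation URL earlier in this issue, and which resource should carry the property (the `subject_uri` below) is an open assumption. Only the `rdfs:isDefinedBy` property URI itself is standard.

```python
RDFS_IS_DEFINED_BY = "http://www.w3.org/2000/01/rdf-schema#isDefinedBy"


def doc_url(base_url: str, domain: str, version: str) -> str:
    """Build a versioned documentation URL (layout is an assumption)."""
    return f"{base_url.rstrip('/')}/{domain}/{version}"


def is_defined_by_quad(subject_uri: str, url: str, graph_uri: str) -> str:
    """Format one N-Quads statement: <subject> rdfs:isDefinedBy <url> <graph> ."""
    return f"<{subject_uri}> <{RDFS_IS_DEFINED_BY}> <{url}> <{graph_uri}> ."


if __name__ == "__main__":
    url = doc_url(
        "https://spyderisk.org/documentation/knowledgebase", "network", "v6a3-1-4"
    )
    # example.org URIs are placeholders, not the real domain or graph URIs.
    print(is_defined_by_quad("http://example.org/domain", url, "http://example.org/graph"))
```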

To get the second point right, we need to deploy the domain model RDF (CSV2NQ output) as well as the CSV2DOC output on an online server, and ensure that the server can select between them based on the HTTP GET request headers.
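For illustration, the selection could be driven by the request's `Accept` header. The sketch below is a simplified take on that idea, not a full implementation of HTTP content negotiation; the function names and the particular set of RDF media types are assumptions, and in practice the web server's own negotiation support may be enough.

```python
HTML_TYPES = {"text/html", "application/xhtml+xml"}
RDF_TYPES = {"application/n-quads", "text/turtle", "application/rdf+xml"}


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Split an Accept header into (media type, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0  # q defaults to 1 when not given
        for field in fields[1:]:
            if field.lower().startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        items.append((media, q))
    # sorted() is stable, so equal q-values keep the order the client sent.
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_representation(accept: str) -> str:
    """Return 'rdf' for reasoner-style requests, otherwise 'html'."""
    for media, q in parse_accept(accept):
        if q <= 0:
            continue
        if media in RDF_TYPES:
            return "rdf"
        if media in HTML_TYPES or media in ("*/*", "text/*"):
            return "html"
    return "html"  # browsers and unknown clients get the human-readable pages
```

With this, a typical browser header such as `text/html,application/xhtml+xml,*/*;q=0.8` selects the CSV2DOC pages, while a client sending `application/n-quads` gets the RDF.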

I suspect we may prefer not to bother about publishing a machine understandable representation to begin with. If so, someone should move that to a separate issue before closing this one.

wwaites commented 5 days ago

Whilst it is good practice to use rdfs:isDefinedBy and it is helpful in some circumstances (e.g. when introspecting a triplestore using SPARQL) it is not strictly necessary; it is enough to make sure the URIs themselves resolve. Usually this will be done with a 303 redirect, e.g.

Fetch https://example.org/resource/something/foo --->
<--- 303 redirect to https://example.org/docs/something
Fetch https://example.org/docs/something --->
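A sketch of the server-side decision behind that exchange, using the same example.org paths. The mapping rule (the `/resource/<name>/<term>` to `/docs/<name>` rewrite) and the function name are only assumptions for illustration; the real layout would depend on how the domain model URIs are structured.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_for(resource_url: str):
    """Return (303, headers) for a resource URI, or None if it is not one."""
    parts = urlsplit(resource_url)
    segments = parts.path.strip("/").split("/")
    # Expect /resource/<name>/<term>; anything else is not handled here.
    if len(segments) < 2 or segments[0] != "resource":
        return None
    location = urlunsplit((parts.scheme, parts.netloc, f"/docs/{segments[1]}", "", ""))
    return 303, {"Location": location}
```

The client then follows the `Location` header with an ordinary GET, as in the third line of the exchange above.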

Spyderisk / domain-network

Automate creation and publication of domain model documentation #141