This adds a shell script, usable both locally and on CI, that downloads the external files and processes them together with the input files in this repository to generate the Turtle files to be loaded into the database.
Example execution:
```
$ scripts/generate.sh
== CN_2024.rdf.zip ==
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15.4M 100 15.4M 0 0 1192k 0 0:00:13 0:00:13 --:--:-- 3003k
Archive: CN_2024.rdf.zip
== combined_nomenclature ==
2024-10-09 12:28:09.462 | INFO | __main__:CN2024:17 - Reading input RDF file /home/bbguimaraes/dds/src/cauldron/sentier_vocab/sentier_vocab/CN_2024.rdf
2024-10-09 12:29:08.649 | INFO | __main__:CN2024:20 - Changing labels to remove notation
2024-10-09 12:29:39.224 | INFO | sentier_vocab.open_energy_ontology:__init__:58 - Parsing and creating Open Energy Ontology elements
2024-10-09 12:29:39.631 | INFO | sentier_vocab.utils:streaming_download:70 - Downloading oeo-2.5.0.zip to /home/bbguimaraes/.local/share/sentier.dev/oeo-2.5.0.zip
2024-10-09 12:29:41.234 | INFO | __main__:CN2024:28 - Creating reciprocal relations
2024-10-09 12:29:42.822 | INFO | __main__:CN2024:34 - Writing output TTL file sentier_vocab/CN_2024.ttl
== custom_products ==
2024-10-09 12:30:27.756 | INFO | __main__:<module>:24 - Created custom graph at /home/bbguimaraes/dds/src/cauldron/sentier_vocab/sentier_vocab/data/custom-products.ttl
== envo ==
<frozen runpy>:128: RuntimeWarning: 'sentier_vocab.envo' found in sys.modules after import of package 'sentier_vocab', but prior to execution of 'sentier_vocab.envo'; this may result in unpredictable behaviour
== model_terms ==
2024-10-09 12:30:33.778 | INFO | __main__:ModelTerms:11 - Reading input TTL file /home/bbguimaraes/dds/src/cauldron/sentier_vocab/sentier_vocab/data/model-terms.ttl
2024-10-09 12:30:33.788 | INFO | __main__:ModelTerms:13 - Creating reciprocal relations
2024-10-09 12:30:33.789 | INFO | __main__:ModelTerms:19 - Writing output TTL file /home/bbguimaraes/dds/src/cauldron/sentier_vocab/sentier_vocab/data/model-terms.reciprocal.ttl
== nace ==
2024-10-09 12:30:33.988 | INFO | __main__:CN2024:17 - Reading input RDF file /home/bbguimaraes/dds/src/cauldron/sentier_vocab/sentier_vocab/CN_2024.rdf
2024-10-09 12:31:37.455 | INFO | __main__:CN2024:20 - Changing labels to remove notation
2024-10-09 12:32:06.639 | INFO | sentier_vocab.open_energy_ontology:__init__:58 - Parsing and creating Open Energy Ontology elements
2024-10-09 12:32:07.808 | INFO | sentier_vocab.utils:streaming_download:70 - Downloading oeo-2.5.0.zip to /home/bbguimaraes/.local/share/sentier.dev/oeo-2.5.0.zip
2024-10-09 12:32:09.440 | INFO | __main__:CN2024:28 - Creating reciprocal relations
2024-10-09 12:32:11.052 | INFO | __main__:CN2024:34 - Writing output TTL file sentier_vocab/CN_2024.ttl
== open_energy_ontology ==
2024-10-09 12:32:55.795 | INFO | __main__:__init__:58 - Parsing and creating Open Energy Ontology elements
2024-10-09 12:32:56.211 | INFO | sentier_vocab.utils:streaming_download:70 - Downloading oeo-2.5.0.zip to /home/bbguimaraes/.local/share/sentier.dev/oeo-2.5.0.zip
== qudt ==
<frozen runpy>:128: RuntimeWarning: 'sentier_vocab.qudt' found in sys.modules after import of package 'sentier_vocab', but prior to execution of 'sentier_vocab.qudt'; this may result in unpredictable behaviour
2024-10-09 12:32:58.947 | INFO | sentier_vocab.utils:streaming_download:70 - Downloading qudt-qudt-public-repo-v2.1.43-0-g4d44787.zip to /home/bbguimaraes/.local/share/sentier.dev/qudt-qudt-public-repo-v2.1.43-0-g4d44787.zip
== supplements ==
$ git status --short
?? CN_2024.rdf
?? CN_2024.rdf.zip
?? cookies.txt
?? envo-sentier-dev.ttl
?? qudt.json
?? sentier_vocab/CN_2024.ttl
?? sentier_vocab/data/oeo-product-vocab.ttl
```
One thing I'm not sure about yet: some steps appear to write to the same output file (e.g. `combined_nomenclature` and `nace` both log `Writing output TTL file sentier_vocab/CN_2024.ttl`). Is a specific order of execution required?