saved-models / data-utilities

SAVED project - data processing tools for fish farms

Merged YAML manifest format #11

Open druimalban opened 6 months ago

druimalban commented 6 months ago

The change I merged on the call just now does a few things:

The generated manifest files are now in the LinkML YAML format. This is because we then edit these files directly to specify jobs. In terms of usage of the programs, the only change is to specify a '.yaml' extension instead of '.ttl' or similar (note that these files are converted to TTL when uploading).

The local data model git submodule is no longer necessary now that the data model is hosted at https://marine.gov.scot/metadata/saved/schema/. If you pull in the changes, do remove the directory ./fisdat/data_model/.

Generated YAML manifest files now include an empty/ignored example job. It is in this section that we'd describe real jobs (which I will document elsewhere).
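For the second point above (removing ./fisdat/data_model/ after pulling), here is a minimal Python sketch of the clean-up. The function name `remove_stale_data_model` is hypothetical and not part of fisdat, and the sketch assumes you pass it the root of your checkout. It only deletes the working-tree directory and does not touch anything under .git/.

```python
import shutil
from pathlib import Path


def remove_stale_data_model(repo_root: Path) -> bool:
    """Delete <repo_root>/fisdat/data_model if present.

    Hypothetical helper, not part of fisdat. Returns True if the
    directory existed and was removed, False if there was nothing to do.
    """
    stale = repo_root / "fisdat" / "data_model"
    if not stale.is_dir():
        return False
    # Remove the directory and everything beneath it.
    shutil.rmtree(stale)
    return True
```

Calling `remove_stale_data_model(Path.cwd())` from the root of the checkout should be enough; calling it a second time is harmless and returns False.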

I am tracking this as an issue because the changes aren't necessarily obvious. I still need to update the documentation to reflect them; in the meantime, refer to this issue if anything in the documentation is wrong or incomplete.

druimalban commented 6 months ago

Initial revision to the documentation: https://github.com/wwaites/saved_fisdat/commit/17f650ff6fe0ecab44b3a2bd9a7cfdf3e34d8456

druimalban commented 6 months ago

I would further highlight dealing with missing data: https://github.com/wwaites/saved_fisdat/commit/199d122aaabb904cc52e00ba84e045bc0220d607

This is not a change from before, but it's fairly important for when we start submitting jobs to the pipeline.

druimalban commented 5 months ago

Writing the documentation has involved quite a bit of testing of the functions, and some rewriting so that their behaviour is consistent and clear. So, for now, I've been working in a separate branch: feature/unit_testing