Validator test metadata: github action

SteffenBrinckmann commented 9 months ago

Create an automatic validator (GitHub Action) that tests whether the keys we agreed on during the December 2024 meeting are present in ro-crate-metadata.json.

Spoiler: some .eln files do not fulfill the requirements yet (Kadi4Mat and ElabFTW do). See, for example, this output: https://github.com/SteffenBrinckmann/TheELNFileFormat/actions/runs/7423821534/job/20202098171

Requirements:

ROCRATE_NOTE_MANDATORY = ['version','sdPublisher']
DATASET_MANDATORY = ['name', 'dateCreated', 'dateModified', 'identifier', 'text', 'keywords']
DATASET_INFO = ['author','mentions']
FILE_MANDATORY = ['name']
FILE_INFO = ['sha256', 'encodingFormat', 'contentSize', 'description']

Obviously, we should discuss whether these are the requirements that we want. This is my suggestion, which PASTA fails right now.
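A minimal sketch of how the lists above could be checked against a parsed ro-crate-metadata.json. It rests on assumptions not stated in this thread: that nodes live in the top-level "@graph" array, that the metadata descriptor is the node whose "@id" is ro-crate-metadata.json, that Dataset and File nodes are recognised by "@type", and that a missing *_INFO key only produces a note. check_graph and node_types are hypothetical helpers, not the validator's actual code.

```python
# Key lists copied from the proposal above.
ROCRATE_NOTE_MANDATORY = ['version', 'sdPublisher']
DATASET_MANDATORY = ['name', 'dateCreated', 'dateModified', 'identifier', 'text', 'keywords']
DATASET_INFO = ['author', 'mentions']
FILE_MANDATORY = ['name']
FILE_INFO = ['sha256', 'encodingFormat', 'contentSize', 'description']


def node_types(node):
    """Return the "@type" of a node as a list (it may be a string or a list)."""
    value = node.get('@type', [])
    return value if isinstance(value, list) else [value]


def check_graph(metadata):
    """Return (errors, notes) for a parsed ro-crate-metadata.json dict."""
    errors, notes = [], []
    for node in metadata.get('@graph', []):
        node_id = node.get('@id', '<no @id>')
        # Assumption: pick the key lists by "@id" / "@type" of the node.
        if node_id == 'ro-crate-metadata.json':
            mandatory, info = ROCRATE_NOTE_MANDATORY, []
        elif 'Dataset' in node_types(node):
            mandatory, info = DATASET_MANDATORY, DATASET_INFO
        elif 'File' in node_types(node):
            mandatory, info = FILE_MANDATORY, FILE_INFO
        else:
            continue  # other node types are not covered by the proposal
        errors += [f'{node_id}: missing mandatory key {key!r}'
                   for key in mandatory if key not in node]
        notes += [f'{node_id}: missing optional key {key!r}'
                  for key in info if key not in node]
    return errors, notes
```

Keeping errors and notes in separate lists lets a caller fail only on the mandatory keys while still reporting the informational ones.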

SteffenBrinckmann commented 9 months ago

@NicolasCARPi So would you prefer to have a Python file inside the .github/workflows folder or in the main folder? The workflow would then call it.

NicolasCARPi commented 9 months ago

Yeah, I mean we already have tools/eln_validator.py, so the action should use that and not hardcode something in the YAML. It also avoids having to maintain it in two places! There should be only one validator module.
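To illustrate the "one validator module, called by the workflow" idea, here is a hedged sketch of a script entry point whose return value can serve as the process exit code, so a workflow step that runs it fails when any file fails. It is not the content of tools/eln_validator.py; load_metadata and main are hypothetical names, and the sketch assumes that an .eln file is a zip archive containing ro-crate-metadata.json.

```python
import json
import sys
import zipfile
from pathlib import Path

METADATA_NAME = 'ro-crate-metadata.json'


def load_metadata(eln_path):
    """Read the RO-Crate metadata from an .eln archive (assumed to be a zip file)."""
    with zipfile.ZipFile(eln_path) as archive:
        for member in archive.namelist():
            if member.rsplit('/', 1)[-1] == METADATA_NAME:
                return json.loads(archive.read(member))
    raise FileNotFoundError(f'{METADATA_NAME} not found in {eln_path}')


def main(argv):
    """Check every .eln file below argv[1]; return 0 on success, 1 on any failure."""
    root = Path(argv[1]) if len(argv) > 1 else Path('.')
    failed = 0
    for eln_path in sorted(root.rglob('*.eln')):
        try:
            metadata = load_metadata(eln_path)
        except (FileNotFoundError, zipfile.BadZipFile, json.JSONDecodeError) as exc:
            print(f'ERROR {eln_path}: {exc}')
            failed += 1
            continue
        # A real validator would run its key checks on `metadata` here.
        print(f'OK    {eln_path}: {len(metadata.get("@graph", []))} nodes')
    return 1 if failed else 0

# When run as a script, the last line would be: sys.exit(main(sys.argv))
```

With this layout the workflow YAML only needs a step that runs the script; all rules stay in the Python module.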

NicolasCARPi commented 9 months ago

Regarding https://github.com/TheELNConsortium/TheELNFileFormat/pull/60#discussion_r1445969622: I think DATASET_MANDATORY should only contain "name", and I suggest adding a DATASET_SUGGESTED with the usual ones that you listed. A lack of keywords should not make the validator fail. Keywords given as an array should. A keywords attribute that is present but empty should emit a warning or notice.

SteffenBrinckmann commented 9 months ago

@NicolasCARPi I implemented it as you suggested. The only thing I did not do, because I do not understand it, is "Having keywords as array should." Can you give an example of false usage? I am just confused about keywords, mentions, and key-value pairs.

NicolasCARPi commented 9 months ago

Can you give an example of false usage?

# Must ERROR
keywords:
  - some tag
  - another tag
# OK
keywords: some tag, another tag
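The keywords rules from this thread (missing is fine, an array is an error, present but empty is a warning) could be sketched as follows. check_keywords and its return format are hypothetical, and treating any other non-string value as an error is an assumption that was not discussed here.

```python
def check_keywords(node):
    """Return (level, message) for the "keywords" attribute of a Dataset node."""
    if 'keywords' not in node:
        # A lack of keywords should not make the validator fail.
        return ('ok', 'no keywords given')
    value = node['keywords']
    if isinstance(value, list):
        # Must ERROR: keywords given as an array.
        return ('error', 'keywords must be a single string, not an array')
    if not isinstance(value, str):
        # Assumption: other non-string types are rejected as well.
        return ('error', f'keywords must be a string, got {type(value).__name__}')
    if not value.strip():
        # Attribute is present but empty: warning or notice only.
        return ('warning', 'keywords is present but empty')
    return ('ok', 'keywords is a non-empty string')
```

In JSON terms, "keywords": ["some tag", "another tag"] would return an error, while "keywords": "some tag, another tag" would pass.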

SteffenBrinckmann commented 8 months ago

OK. Implemented a test that raises an error if an array (called a list in Python) is present.

On a related note: I think a keyword should be a single word, so keywords would be "winter,cold" or "winter cold".

I also added output that highlights where we as a consortium have not yet reached agreement on things.

SteffenBrinckmann commented 8 months ago

If there is interest, I can rewrite the code so that it writes the test log to the folder in question. [This is only a temporary log and cannot be included in the repository, as per GitHub's security policies (actions cannot modify the repository).] I would then also add a .gitignore rule to exclude these logs from local git tracking.

SteffenBrinckmann commented 8 months ago

The tests now create a nice overview table in markdown for easy reading: https://github.com/SteffenBrinckmann/TheELNFileFormat/blob/sb_validator_test_metadata/tests/logging.md

@FlorianRhiem @jmanideep Congrats on passing all the tests!
@NicolasCARPi do you think this kind of overview makes sense?

NicolasCARPi commented 8 months ago

The tests now create a nice overview table in markdown for easy reading

Great, but I don't like that it creates actual files in the repo and commits them. What I suggest is to use something like this:

https://github.com/elabftw/elabimg/actions/runs/7550169076#summary-20555357193

You see, the Action itself has a summary and we can look at it easily. So your markdown table must be created and stored with GHA Summary: https://github.blog/2022-05-09-supercharging-github-actions-with-job-summaries/ (seems pretty easy to get working)
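A small sketch of how a test script could send its markdown table to the job summary instead of writing files into the repository. GitHub Actions exposes the path of the summary file in the GITHUB_STEP_SUMMARY environment variable; markdown_table and write_job_summary are hypothetical helper names, and the fallback to stdout for local runs is an assumption.

```python
import os


def markdown_table(header, rows):
    """Build a markdown table from a header and an iterable of rows."""
    lines = ['| ' + ' | '.join(header) + ' |',
             '| ' + ' | '.join('---' for _ in header) + ' |']
    lines += ['| ' + ' | '.join(str(cell) for cell in row) + ' |' for row in rows]
    return '\n'.join(lines) + '\n'


def write_job_summary(markdown):
    """Append markdown to the job summary, or print it outside GitHub Actions."""
    summary_path = os.environ.get('GITHUB_STEP_SUMMARY')
    if summary_path:
        # Append, so several steps or calls can contribute to one summary.
        with open(summary_path, 'a', encoding='utf-8') as handle:
            handle.write(markdown)
    else:
        print(markdown)
```

A call such as write_job_summary(markdown_table(['file', 'result'], [('example.eln', 'pass')])) (placeholder data) would then show the table on the run's summary page without committing anything.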

SteffenBrinckmann commented 8 months ago

Great suggestion, and it proves that people don't know most of the great functions that exist out there. Implemented it that way.

NicolasCARPi commented 8 months ago

So now these .md and .json files can disappear, right?

SteffenBrinckmann commented 8 months ago

Yes, you are correct. Sorry for the oversight.

TheELNConsortium / TheELNFileFormat

Validator test metadata: github action #59