nfdi4plants / arc-validate

Home of all the tools and libraries to create and run validation of ARCs
https://nfdi4plants.github.io/arc-validate/
MIT License
3 stars 2 forks source link

[bug] Collection of (current) bugs with validation pipeline #58

Open Brilator opened 9 months ago

Brilator commented 9 months ago

I'm currently working on a few ARCs that keep running into failing CQC pipelines due to divergent handling of ISA structure / naming / referencing between ARC commander (v0.5), ARCitect (v≤0.07, based on ARCCtrl).

Collecting here. Feel free to move to more relevant repo / discussion.

  1. CQC currently based off latest ARC commander configuration and hence failing for all ARCitect ARCs
  2. ARC commander (obviously) fails on ARCitect-created ARCs (e.g. arc export, arc a list, arc a register)
Brilator commented 9 months ago

@JonasLukasczyk @HLWeil

Brilator commented 9 months ago

Also, we might need a converter / upgrader between ARC specifications

HLWeil commented 9 months ago

Hey, I'm currently working on bringing all the functionality we implemented in https://github.com/nfdi4plants/ARCtrl to the ArcCommander. This should fix most of the problems as we built ARCtrl to be quite backwards compatible (E.g. metadata sheet names). Still thanks a lot for the list! This will be immensely helpful for testing.

And as a heads up: Most important things are those that fail when written by ArcCommander and read by ARCtrl. The other way around wont be much of a problem in the future.

Freymaurer commented 9 months ago

I think this is a huge issue of arc-validate. As it does not support deprecated and modern arcs created by ARCtrl/ARCitect. @kMutagene @omaus validation should absolutly be based on ARCtrl, as we have extensive deprecation support there.

kMutagene commented 9 months ago

validation should absolutly be based on ARCtrl

absolutely not, because that would prevent partial validation of anything that cannot be parsed by ARCtrl. The whole point of this library is validation being independent of any implementation of the ARC data model.

This library is on the brink of a 2.0 release that also supports validating any spec.

See diff between main and V2

omaus commented 9 months ago

I think this is a huge issue of arc-validate. As it does not support deprecated and modern arcs created by ARCtrl/ARCitect. @kMutagene @omaus validation should absolutly be based on ARCtrl, as we have extensive deprecation support there.

Validation library makes no sense at all if it is based on ARCtrl. Vice versa, you would need to restructure everything in ARCtrl to validation needs. Everything must be of option type in this scenario because otherwise the parsing fails.

Freymaurer commented 9 months ago

What about string literals?

kMutagene commented 9 months ago

I think we talked about something similar before. arc validate will do structural validation based on ontologies. We can and should create an ontology that contains valid names for metadata sheets and also tracks which are synonymous and deprecated.

Brilator commented 8 months ago

What's the progress on this? Sorry to be annoying, but it keeps being frustrating for users to upload an ARC and get two "failed" emails right away.

kMutagene commented 8 months ago

it keeps being frustrating for users to upload an ARC and get two "failed" emails right away

Is it the validate arc pipeline step that fails for this specific issue though? IIRC that is a problem of CQC running in an empty repo, which most certainly fails the other pipeline steps because there is no investigation file in a blank repo. One way to tackle this could be providing an arc template repo

Brilator commented 8 months ago

Yes it is the validate ARC pipeline.

  1. It fails because ARCitect does not create a folder .arc
Screenshot 2023-10-17 at 17 05 56
  1. It fails for study and assay folders created with ARCitect (v 0.0.10):
Screenshot 2023-10-17 at 17 04 56 Screenshot 2023-10-17 at 17 03 52
HLWeil commented 8 months ago

Regarding (1.): https://github.com/nfdi4plants/ARC-specification/blob/main/ARC%20specification.md The .arc folder is not part of the ISA-specification. It was initially meant as a config folder for the ArcCommander. I think the validation should not fail because of it being missing.

kMutagene commented 8 months ago

The .arc folder is not part of the ISA-specification. It was initially meant as a config folder for the ArcCommander. I think the validation should not fail because of it being missing.

We can hotfix that from the current pipeline before finishing 2.0, but i would want to keep these hotfixes on deprecated software to a minimum.

It fails for study and assay folders created with ARCitect (v 0.0.10):

this looks like a harder thing to pin down. If i find it immediately, we can also do a hotfix for this. Can you link a repo where this is happening @Brilator ?

What's the progress on this?

Slower than expected due to several external factors. It is being worked on though.

kMutagene commented 8 months ago

@ brillator i have pushed hotfixes for both these issues:

It fails because ARCitect does not create a folder .arc It fails for study and assay folders created with ARCitect (v 0.0.10):

i have tested it in production with an arc created with ARCitect v0.0.10: https://git.nfdi4plants.org/test/another-test

please give it a spin.

Brilator commented 8 months ago

Works like a charm. Much appreciated!!

Screenshot 2023-10-23 at 17 04 41