LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

more flags to validate specific aspects of a corpus (e.g. for a specific process) #283

Closed alecristia closed 2 years ago

alecristia commented 2 years ago

Is your feature request related to a problem? Please describe.

Currently, child-project validate corpus/ runs a series of checks that are well suited to setting up a corpus. We also have one flag that adapts the behavior of validate to the context of importing annotations, by ignoring the absence of recordings.

We have other procedures that also impose requirements on the corpus, and it would be great to be able to validate those requirements before running them.

Describe the solution you'd like

I don't think we should create a flag for each process, but we could create flags that run only a subset of the validate checks. For instance, child-project validate corpus/ --annotations vtc/converted/ could check that those annotations are complete and available locally; child-project validate corpus/ --audio raw/ converted/standard/ could do the same for the listed audio.
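To make the proposal concrete, here is a minimal sketch of how two such multi-valued flags could be parsed. It is a stand-alone argparse example, not the actual child-project command-line interface; the program name validate-sketch and the function build_parser are hypothetical, and only the flag names --annotations and --audio come from the proposal above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-alone parser mirroring the flags proposed in this
    # issue; it is NOT the real child-project CLI.
    parser = argparse.ArgumentParser(prog="validate-sketch")
    parser.add_argument("corpus", help="path to the corpus")
    # Each flag accepts one or more values and restricts validation to them.
    parser.add_argument(
        "--annotations", nargs="+", default=[], metavar="SET",
        help="annotation sets to check",
    )
    parser.add_argument(
        "--audio", nargs="+", default=[], metavar="PATH",
        help="audio locations to check",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["corpus/", "--audio", "raw/", "converted/standard/"]
    )
    print(args.corpus, args.annotations, args.audio)
```

Leaving both flags empty by default would let a plain call keep its current behavior, with the narrower checks applied only when a flag is given.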

I wonder whether this is equivalent to the user running datalad get annotations/vtc/converted/. I suspect it is not, because that command will not alert the user when a given child or recording lacks its corresponding vtc/converted annotation.
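The completeness check described here, which fetching files alone would not provide, can be sketched as a comparison between the list of recordings and an index of annotations. This is only an illustration under stated assumptions: the function find_missing_annotations is hypothetical, the data is made up, and the field names set and recording_filename are assumed for the example rather than taken from this thread.

```python
def find_missing_annotations(recordings, annotations, set_name):
    """Return the recordings that have no annotation row in `set_name`.

    `recordings` is a list of recording filenames; `annotations` is a list
    of dicts with the (assumed) keys "set" and "recording_filename".
    The input order of `recordings` is preserved.
    """
    covered = {
        row["recording_filename"]
        for row in annotations
        if row["set"] == set_name
    }
    return [rec for rec in recordings if rec not in covered]


# Made-up example data, for illustration only.
recordings = ["rec1.wav", "rec2.wav", "rec3.wav"]
annotations = [
    {"set": "vtc/converted", "recording_filename": "rec1.wav"},
    {"set": "vtc/converted", "recording_filename": "rec3.wav"},
    {"set": "other", "recording_filename": "rec2.wav"},
]

missing = find_missing_annotations(recordings, annotations, "vtc/converted")
print(missing)  # ['rec2.wav']
```

A file-level download would succeed here even though rec2.wav has no vtc/converted annotation, which is the gap a dedicated validation flag would close.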

lucasgautheron commented 2 years ago

On it (#284)

lucasgautheron commented 2 years ago

Done! The new implementation is described here:

https://childproject.readthedocs.io/en/latest/tools.html#data-validation