LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

Compatibility-breaking structure upgrades #97

Closed by lucasgautheron 3 years ago

lucasgautheron commented 3 years ago

To make our format more flexible and attractive, we should make some changes to the structure of the datasets. I think we should have one recordings folder with two subfolders: "raw" and "converted" (or "processed", whichever you prefer). And we should do the same for annotations. For annotations, we should even do something like:

Why? Because by nesting folders, we can benefit from DataLad's dataset nesting functionality. So, we could have everything in one dataset, but people could also split the datasets and set different permissions for all the recordings altogether, or for all annotations altogether, or for all sets of annotations except one if one of them might have identifying data (e.g. transcriptions).

I think this would make our solution much more attractive. What do you think? Since this is compatibility-breaking, we should make this move early.
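
To make the proposal concrete, here is a minimal sketch of the nested layout, assuming each annotation set gets its own folder with the same "raw" and "converted" subfolders as the recordings. The function name `create_layout` and the set names `vtc` and `transcriptions` are hypothetical and only serve as illustration; the exact annotation layout is an assumption, not something settled in this thread.

```python
import tempfile
from pathlib import Path

# Assumed subfolder names, taken from the proposal ("raw" and "converted").
STAGES = ("raw", "converted")


def create_layout(root, annotation_sets):
    """Create the nested folders and return their paths relative to root."""
    root = Path(root)
    # One recordings folder with one subfolder per stage.
    folders = [root / "recordings" / stage for stage in STAGES]
    # Assumption: one folder per annotation set, each with the same stages.
    folders += [
        root / "annotations" / name / stage
        for name in annotation_sets
        for stage in STAGES
    ]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    return sorted(folder.relative_to(root).as_posix() for folder in folders)


# Build the layout in a throwaway directory (hypothetical set names).
with tempfile.TemporaryDirectory() as tmp:
    layout = create_layout(tmp, ["vtc", "transcriptions"])

print("\n".join(layout))
```

With this kind of nesting, every folder at a given level is a natural boundary along which the dataset could later be split, for instance to restrict access to a single annotation set.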

alecristia commented 3 years ago

"people could also split the datasets and set different permissions for all the recordings altogether, or for all annotations altogether, or for all sets of annotations except one if one of them might have identifying data (e.g. transcriptions)"

That is a key point. Other than that, the two seem equivalent. But this one argument heavily biases things in the direction of making the switch.

However, I fear this could break things downstream in N's code -- so tagging him here.

lucasgautheron commented 3 years ago

Yep, @alsonicr would definitely have to reflect these upgrades - but it should not take that much work. Once the implementation is set, he could do that in a separate branch of ChildRecordsR. I might be able to help. I'll try to bear most of the burden in any case (including upgrading the current datasets).

alsonicr commented 3 years ago

If the path is properly mentioned in the metadata, I should normally have nothing to do, since file import is based on the metadata :D

lucasgautheron commented 3 years ago

Yes, but the metadata only contains paths relative to a base folder, and that base folder will change :/ But this should be super quick to reflect.
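
A small sketch of the point about relative paths, assuming the metadata stores a path that is joined to a base folder at load time. The helper `resolve` and both base folders are hypothetical; they are not taken from ChildProject or ChildRecordsR.

```python
from pathlib import PurePosixPath


def resolve(base, relative_path):
    """Join a path stored in the metadata with the base folder it is relative to."""
    return (PurePosixPath(base) / relative_path).as_posix()


# Path as it might be stored in the metadata (placeholder file name).
stored = "childID/xxx.csv"

# Hypothetical base folders before and after the restructuring.
old_base = "dataset/old_base/vtc"
new_base = "dataset/annotations/vtc/converted"

old_location = resolve(old_base, stored)
new_location = resolve(new_base, stored)

print(old_location)
print(new_location)
```

The stored value is identical in both cases; only the base changes, so downstream code that hard-codes the base needs a one-line update, while code that receives the base as a parameter needs none.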

alsonicr commented 3 years ago

"vtc/converted/childID/xxx.csv": it should look something like this, no? If so, it should be fine.

lucasgautheron commented 3 years ago

You might very well be right, actually!