LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

improvements to instructions #339

Closed: alecristia closed 1 year ago

alecristia commented 2 years ago

Is your feature request related to a problem? Please describe.

We are reading https://childproject.readthedocs.io/en/latest/vandam.html to create tsimane2018, a dataset that already exists in some form. I'll be documenting thoughts & suggestions here.

Describe the solution you'd like

(Overall, I'm feeling that the best thing may be to have video logs of lots of dataset conversions...)

Other thoughts

The tutorial is great, but we may want to precede or follow it with an FAQ about how to get started from a dataset you already have that is organized differently. I'm really not certain. As we've discussed frequently, each dataset is unique, so this may rather be a guide for an expert user helping someone newer do their first import.

Perhaps the structure would be something like:

1) Preparation: identify all the files you need (raw recordings, raw metadata, raw annotation files). Note what their structure is, but you don't need to make changes yet.
2) Think about the easiest way to proceed: we've found that you probably need to make children.xlsx by hand, and many aspects of recordings.xlsx as well, but you can use ls to list the files that you need to inventory in the recordings metadata. Consider using Excel, where you can have formulas to calculate the start time for recordings that have multiple files that need to be concatenated (and note there is a command for calculating duration, so you can create recordings.csv over several steps).
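As a rough illustration of the "use ls to list the files" part of step 2, the listing could also be scripted. The sketch below is only an assumption about how one might do it, not something from the tutorial: the function names `draft_recording_rows` and `write_draft` are hypothetical, the audio extensions are a guess, and the column names reflect my understanding of what recordings.csv expects, so please check them against the ChildProject documentation. Everything except the experiment name and the filename is left blank to be filled in by hand (or in Excel).

```python
import csv
from pathlib import Path

# Assumption: extensions your raw recordings might use; adjust as needed.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}

# Assumption: columns expected in recordings.csv; verify against the docs.
COLUMNS = [
    "experiment",
    "child_id",
    "date_iso",
    "start_time",
    "recording_device_type",
    "recording_filename",
]


def draft_recording_rows(recordings_dir, experiment):
    """List audio files under recordings_dir, one draft metadata row per file."""
    root = Path(recordings_dir)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            row = {column: "" for column in COLUMNS}  # blanks to fill by hand
            row["experiment"] = experiment
            # Store the path relative to the recordings folder.
            row["recording_filename"] = path.relative_to(root).as_posix()
            rows.append(row)
    return rows


def write_draft(rows, output_csv):
    """Write the draft rows to a CSV file that can be opened in Excel."""
    with open(output_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `write_draft(draft_recording_rows("recordings/raw", "tsimane2018"), "recordings_draft.csv")` would produce a starting file, where `recordings/raw` is a placeholder for wherever the raw audio lives. Durations are deliberately left out, since there is a command for calculating them afterwards.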

LoannPeurey commented 1 year ago

Overall, I think this is more of a first tutorial that gives an idea of the steps involved. For a complete guide to creating a dataset, I think it is better to follow the handbook guide.
