Closed CodyCBakerPhD closed 3 days ago
I agree that (b) is preferable, and I like the idea of having at least one redundant field as a check for this project, especially since this is being done by hand.
As it applies to current projects, I like the idea of iterating starting from 0 (or 1 depending on indexing preference) per each day/session (e.g. first recording of the day is 1, second of the day is 2, etc.). Once the list of recordings to include is finalized, an overall yaml file like this one could be generated, where the global subject # indicates the project-specific ids, and the subject_id info field is the internal tag. Then this info field would act as a double-check that we are pointing to the right pumpprobe folder (the ith folder from the given date). Again in this case the subject_id info field is not useful to anyone outside the lab, but it might allow a lab member to locate and verify a subject's data more quickly.
It is decided - (b) it is then!
Once the file is done on your end, just send it off my way and I'll start incorporating it into the conversion
@emosb Here are some proposals for YAML structure from our meeting today
I think we reached agreement that you should send one initial file with all subject info for the functionality connectivity project, then for current projects have multiple files, one for each day of experiments (exactly like your current logbook but more specifically structured)
a)
This has the advantage that the subject ID is mapped in a dictionary for easy look-up; e.g., if I load in the YAML content in Python to a variable named
subject_info
I can dosubject_info[1]
and it will give me all the information related to subject with ID equal to1
b)
This is the same as (a) except the
subject_id
is also included in the 'info' of that subjectThis has the advantage that the value of
1
is explicitly seen to be thesubject_id
(in case there was any question about that)But it has the disadvantage that both fields could get 'out of sync' with each other, such as a non-sensical entry
c)
This is the one proposed from our meeting; it is the 'simplest' since it is just a container (specifically a list), but has slight disadvantage in that if anyone wants to do a lookup of a specific subject, they would have to iterate over each entry of the list and have a logical condition for the
subject_id
field being the particular value they are looking forBut all information would be kept bundled together as a single dictionary and a single entry in the list
I maybe have a slight personal preference for (b), but which one to choose comes down to what you personally would prefer since you are the one working with it the most
And let me know if you have any other ideas for alternate structures