Open ablot opened 2 years ago
Might make sense to keep the raw data as a separate entry for archiving purposes?
The best might be to not have any flexilims entries for the raw data. Often multiple brains are processed at once and end up compressed and archived together (if the non-stitched data is kept at all).
For stitched data, one option is to not have anything specific in flexiznam but instead manually create an adapted entity when needed. An example solution for both cricksaw and cellfinder datasets is here: https://github.com/znamlab/cricksaw-analysis/blob/dev/cricksaw_analysis/cricksaw_to_flexilims.py
One output entity example is here: https://flexylims.thecrick.org/flexilims/sample/show?sampleId=645e740b7ddb34517470c869
This has the advantage of separating the very specific code parsing log files to get relevant metadata from flexiznam. It makes automatic discovery of cricksaw datasets using "from_folder" impossible but I'm not sure it's a needed feature.
And here is an example cellfinder dataset created by the other function: https://flexylims.thecrick.org/flexilims/sample/show?sampleId=645d0d4c7ddb34517470c7e0
I think that manually adding them once the process has been checked (it did run and the results make sense) is probably more useful than batch detection with a yaml like we do for 2p.
The alternative is to move these functions into new dataset subclasses. That would let them autodetect datasets, but we might end up with too many subclasses, one for every data type.
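For the sake of discussion, here is a rough sketch of what the subclass route could look like. Everything in it is hypothetical: `CricksawDatasetSketch` is a stand-in and not an existing flexiznam class, and using a `recipe*.yml` file as the marker of a cricksaw acquisition folder is only an assumption.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class CricksawDatasetSketch:
    """Hypothetical stand-in for a dataset subclass (not flexiznam's API)."""

    path: Path
    extra_attributes: dict = field(default_factory=dict)

    @classmethod
    def from_folder(cls, folder):
        """Return one dataset per direct subfolder holding a recipe file.

        The `recipe*.yml` pattern is an assumed marker for a cricksaw
        acquisition; adapt it to whatever the real folders contain.
        """
        folder = Path(folder)
        datasets = []
        for recipe in sorted(folder.glob("*/recipe*.yml")):
            datasets.append(
                cls(path=recipe.parent, extra_attributes={"recipe": recipe.name})
            )
        return datasets
```

Calling `CricksawDatasetSketch.from_folder(some_root)` would return a list of candidate datasets, which is the kind of discovery the manual approach gives up. The cost is one more class to maintain per data type.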
Any opinion @znamensk ?
The cricksaw "raw" data, coming out of the microscope can contain:
What do we add to flexilims? A single record for the main brainsaw folder? Or multiple datasets, one for each channel/resolution/downsampled data?