Open kdoroschak opened 4 years ago
Removed milestone since this should happen ASAP, not just in the future
Q: Should we carry over information in /UniqueGlobalKey/ like operating_system? Maybe that should be left in the bulk file?
This would be easy to add back in later if needed, but also pretty annoying.
Draft version 0.1 saved for posterity: FAST5 specification.xlsx
Added segmenter & classification model versioning details
Made some changes while doing the core rewrite for segment.py. This is version 0.2 and should be final (or pretty close to it) for everything up through the segmenter (NOT post-segment filtering or classification).
Version control for this is currently being handled by the Google Doc (+ named versions in the edit history). Not sure if/where to put the spec in this repo.
Rather than having many intermediate files, record data in fast5 files, similar to how it's done for DNA.
Objectives:
Investigate best practices for file format specification documentation. I don't necessarily want/need to follow this to a T, but best to know what's out there.
Would like to have this for 1.0.0, as it greatly simplifies the workflow & data management overhead associated with the tool/pipeline.
Potential follow-up once this is in place: poretitioner could copy the config from an existing fast5 file. Maybe something like --copy-config-from fast5_fname_here.fast5?
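For discussion, here is a rough sketch of how a --copy-config-from flag could be wired up with argparse. Everything except the flag name and the example filename is an assumption: build_parser, the metavar, and the help text are hypothetical, and nothing here reflects how poretitioner's CLI is actually structured. Reading the config back out of the fast5 file is deliberately left out, since the layout depends on the spec.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; poretitioner's real CLI may be organized differently."""
    parser = argparse.ArgumentParser(prog="poretitioner")
    parser.add_argument(
        "--copy-config-from",
        dest="copy_config_from",
        metavar="FAST5",
        default=None,
        help="Reuse the config stored in an existing fast5 file (sketch only).",
    )
    return parser


# Parse the example invocation from this issue.
args = build_parser().parse_args(["--copy-config-from", "fast5_fname_here.fast5"])
print(args.copy_config_from)
```

When the flag is omitted, args.copy_config_from is None, so the tool could fall back to its normal config handling.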