dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0
71 stars 39 forks

step 3 - Fix hackishness #81

Closed isaacovercast closed 8 years ago

isaacovercast commented 8 years ago

First of all, step 3 preview only previews refmapping, not the whole enchilada. Preview chunking should happen at the top level of step 3, not inside refmap.

Also, file management is very hackish and potentially dangerous. Refmap creates a hidden dotfile copy of the edited fastq and then overwrites the original edited fastq with the unmapped reads. A better idea might be to leave the original fastq in place, write the unmapped reads to a new fastq file, and change the file that data.files.edits points to. That way there is less danger of overwriting or losing the original files.

isaacovercast commented 8 years ago

Fixed. Preview chunking now happens for all data at step 3 (inside splitacross). The basic way it works: preview grabs a chunk of the data, writes it to a tmp file, and points sample.files.edits at the tmp chunk, then repoints to the full data in a finally block.
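For anyone reading along, here is a minimal sketch of that repointing pattern. It is not the actual ipyrad code: `run_with_preview`, `process`, and `nreads` are hypothetical names, the sample is assumed to be any object with a `files.edits` path attribute, and the edits file is assumed to be uncompressed fastq (4 lines per read).

```python
import itertools
import os
import tempfile


def run_with_preview(sample, process, nreads=2):
    """Run process(sample) on only the first `nreads` reads (hypothetical helper)."""
    full_path = sample.files.edits
    fd, tmp_path = tempfile.mkstemp(suffix=".preview.fastq")
    try:
        # A fastq record is 4 lines, so copy the first 4 * nreads lines.
        with os.fdopen(fd, "w") as out, open(full_path) as infile:
            out.writelines(itertools.islice(infile, 4 * nreads))
        # Point the sample at the small tmp chunk while the step runs.
        sample.files.edits = tmp_path
        return process(sample)
    finally:
        # Always repoint to the full data, even if process() raised,
        # and remove the tmp chunk.
        sample.files.edits = full_path
        os.remove(tmp_path)
```

Because the repointing lives in the `finally` block, a crash partway through the step still leaves `sample.files.edits` pointing at the full data.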

I rewrote refmap to do something similar for unmapped reads. It writes the unmapped reads to a tmp file and repoints sample.files.edits to it. Then at the end of the run, during cleanup_refmap, it repoints every sample.files.edits back to the original data and removes the tmp fastq of unmapped reads.
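Roughly, the unmapped-reads handling looks like the sketch below. Again, this is an illustration under assumptions, not the code in the repo: `point_at_unmapped`, `restore_originals`, and the `originals` dict are hypothetical, reads are assumed to arrive as `(name, seq, qual)` tuples, and each sample is assumed to have `name` and `files.edits` attributes.

```python
import os
import tempfile


def point_at_unmapped(sample, unmapped_reads, originals):
    """Write unmapped reads to a new tmp fastq and repoint the sample at it.

    The original edits file is left untouched; its path is remembered in
    `originals` (hypothetical dict keyed by sample name).
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".unmapped.fastq")
    with os.fdopen(fd, "w") as out:
        for name, seq, qual in unmapped_reads:
            out.write("@%s\n%s\n+\n%s\n" % (name, seq, qual))
    originals[sample.name] = sample.files.edits
    sample.files.edits = tmp_path


def restore_originals(samples, originals):
    """Cleanup at the end of the run: repoint to original data, drop tmp files."""
    for sample in samples:
        original = originals.pop(sample.name, None)
        if original is None:
            continue  # this sample was never repointed
        tmp_path = sample.files.edits
        sample.files.edits = original
        if tmp_path != original and os.path.exists(tmp_path):
            os.remove(tmp_path)
```

The point of the design is that nothing ever writes to the original edits file, so the worst case after a failed run is a leftover tmp fastq rather than lost data.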