PacificBiosciences / FALCON_unzip

Making diploid assembly becomes common practice for genomic study
BSD 3-Clause Clear License

poor shutdown/restart handling with partially written files #26

Open flowers9 opened 8 years ago

flowers9 commented 8 years ago

There are places in get_read_hcgt_map.py (1), phasing.py (6), run_quiver.py (1), and unzip.py (1) where the code checks whether output files from earlier steps are present before running subsequent steps. However, it only checks that those files exist, not that they were completely written, and the code that writes those files has no way of signaling a partially written file. So it's quite possible, should the run be interrupted for some reason, for a restarted run to read partially written files and attempt to process them as if they were complete.

The straightforward solution is to write each file under a temporary name, and then rename it to its final name once it is complete and closed.
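
For illustration only, a minimal sketch of the write-then-rename idea. The helper name `write_atomically` is hypothetical and is not taken from the FALCON_unzip code. It assumes Python 3.3+ for `os.replace` and that the temporary file lives in the same directory as the final file, so the rename stays on one filesystem.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` to `path` so that `path` only appears once fully written.

    Hypothetical helper for illustration; not part of FALCON_unzip.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the destination so the final rename
    # does not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push the contents to disk first
        # The final name appears only after the file is complete and closed.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, an existence check on the final name is enough for a restarted run, because an interrupted write leaves at most a leftover `.tmp` file.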

pb-cdunn commented 8 years ago

Unnecessary. Use done files. Good point though.
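
For readers unfamiliar with the term, a rough sketch of what a "done file" check could look like. The `.done` suffix and the function names are assumptions made for this example, not the convention actually used in FALCON_unzip.

```python
import os


def done_marker(path):
    # Hypothetical naming scheme: "<output>.done" sits next to the output.
    return path + ".done"


def write_with_done_file(path, data):
    marker = done_marker(path)
    # Remove any stale marker first, so an interrupted rewrite is not
    # mistaken for a finished one.
    if os.path.exists(marker):
        os.remove(marker)
    with open(path, "w") as handle:
        handle.write(data)
    # The marker is created only after the output has been closed.
    open(marker, "w").close()


def is_complete(path):
    # A restarted run trusts the output only if its marker is also present.
    return os.path.exists(path) and os.path.exists(done_marker(path))
```

In this sketch the later steps would call `is_complete(path)` rather than checking for `path` alone.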