COMBINE-lab / cuttlefish

Building the compacted de Bruijn graph efficiently from references or reads.
BSD 3-Clause "New" or "Revised" License

KMC files are written to the current directory even when a separate working directory is given #13

Closed by jnalanko 2 years ago

jnalanko commented 2 years ago

Hello,

I noticed that the KMC temporary files are written to the directory where the program is run, even when a different working directory is given with the -w option. I think those files should go in the specified working directory.

rob-p commented 2 years ago

Hi @jnalanko,

Thanks again for reporting this. I believe the -w option only controls the files that cuttlefish itself considers temporary; cuttlefish is largely agnostic to the internal workings of KMC. Nonetheless, I agree with you that the temporary working directory should apply to all intermediate files that we have the ability to control. @jamshed: how difficult would it be to ask / force KMC to use the intermediate / working directory we specify?

--Rob

jamshed commented 2 years ago

Hi @jnalanko: thanks for using cuttlefish! For performance reasons related to concurrent disk I/O, we place the temporary KMC files (kmc_*.bin) in the output directory (i.e. the directory of the final output file given with -o). The KMC databases that are produced, which constitute the main temporary files for cuttlefish proper, are placed in the working directory (i.e. the one given with -w).
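
To check where the KMC temporaries end up during a run, a small sketch like the one below may help. It assumes, as described above, that the temporary files match kmc_*.bin and sit in the same directory as the file passed to -o. The function name find_kmc_temp_files is hypothetical and is not part of cuttlefish or KMC.

```python
from pathlib import Path
from typing import List


def find_kmc_temp_files(output_file: str) -> List[Path]:
    """List kmc_*.bin files located next to the final output file.

    Assumption: the KMC temporaries are placed in the directory of the
    -o output file and match the pattern kmc_*.bin, as stated in this thread.
    """
    # Directory that contains the final output file (the -o argument).
    out_dir = Path(output_file).resolve().parent
    # Non-recursive search; sorted so the result order is deterministic.
    return sorted(out_dir.glob("kmc_*.bin"))
```

Calling find_kmc_temp_files with the same path given to -o while a run is in progress should list the temporaries if they are placed as described; an empty list means no matching files were found in that directory.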