GATB / leon

Leon - FASTA and FASTQ read compressor
https://gatb.inria.fr/software/leon
GNU Affero General Public License v3.0

Leon uses current directory for temporary files #7

Open KirillKryukov opened 5 years ago

KirillKryukov commented 5 years ago

Leon uses the current directory for storing temporary files, without asking.

For example, leon -file 'data/1.fasta' -c -kmer-size 2 will create a temporary file 1.h5 in the current directory. (Normally it is deleted after compression, but it is left behind if compression crashes.)

Using the current directory as temporary space is problematic because:

  1. It's completely unexpected by the user.
  2. The name may clash with other files already stored there.
  3. The current directory might have insufficient space.
  4. The process may have no write access to the current directory.
  5. When running concurrent leon tasks, these temporary files may clash with each other.
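Points 2 and 5, and the file left behind after a crash, could be addressed by giving each run a uniquely named scratch file that is removed on the way out. Below is a minimal Python sketch of that idea using the standard tempfile module. It is not Leon's code (Leon is written in C++); the helper name scratch_file and the "leon-" prefix are made up for illustration.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def scratch_file(directory, suffix=".h5"):
    """Yield a uniquely named scratch file path inside `directory`.

    Hypothetical helper: the name and the "leon-" prefix are illustrative.
    """
    # mkstemp creates the file atomically under a name that does not exist yet,
    # so concurrent runs and pre-existing files cannot collide.
    fd, path = tempfile.mkstemp(suffix=suffix, prefix="leon-", dir=directory)
    os.close(fd)  # only the reserved name is needed; the caller reopens it
    try:
        yield path
    finally:
        # Runs on normal exit and when an exception propagates,
        # but not if the process is killed outright.
        if os.path.exists(path):
            os.remove(path)
```

Usage would look like `with scratch_file(some_dir) as path: ...`, with the real work done inside the block. A hard kill (for example SIGKILL) would still leave the file behind, so this narrows the problem rather than eliminating it.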

A slightly better solution would be to use the output directory for temporary file storage (since we at least know that we have write access and most probably some free space there).

An even better way would be to use the directory configured in the TMPDIR environment variable (ideally with a command line option to specify another directory).

rchikhi commented 5 years ago

Thanks for the feedback. The output dir is a good idea. I'm not so sure about TMPDIR, as /tmp often has little space. And is $TMPDIR even set by all Linux distros? (It isn't on my Ubuntu.)

KirillKryukov commented 5 years ago

Yes, use of the TMPDIR variable is probably not universal. Also, on Windows the TMP and/or TEMP variables are used instead. So perhaps the safest option is to default to the output directory.

In any case, adding any mechanism for specifying the temporary directory would be an improvement over the current state. Currently I have to chdir to the directory where I'd like to keep temporary files, run leon, and then chdir back.
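The resolution order discussed so far (an explicit setting first, environment variables only on request, the output directory as the default) can be sketched as follows. This is a Python illustration of the policy, not a patch for Leon; the function name choose_temp_dir and its parameters are hypothetical.

```python
import os


def choose_temp_dir(output_path, explicit=None, use_env=False, environ=None):
    """Pick a directory for temporary files (hypothetical helper).

    Order: an explicitly given directory, then (only if use_env is set)
    TMPDIR / TMP / TEMP, then the directory of the output file.
    """
    env = os.environ if environ is None else environ
    if explicit:
        return explicit
    if use_env:
        # TMPDIR is the usual Unix name; TMP and TEMP are the Windows ones.
        for name in ("TMPDIR", "TMP", "TEMP"):
            value = env.get(name)
            if value and os.path.isdir(value):
                return value
    # Default: next to the output file, where write access is already required.
    return os.path.dirname(os.path.abspath(output_path))
```

Keeping the environment lookup opt-in reflects the concern that /tmp is often small and that TMPDIR is not set everywhere.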
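Until such an option exists, the chdir workaround can be scripted so that the calling shell's directory never changes. The Python sketch below starts the child process in the chosen directory via the cwd argument of subprocess.run. The helper names are made up, and only the flags from the example command above (-file, -c, -kmer-size) are used. The input path is made absolute because a relative path would otherwise be resolved against the new working directory.

```python
import os
import subprocess


def leon_compress_command(input_path, kmer_size=None):
    """Build the leon compression command line (hypothetical helper)."""
    cmd = ["leon", "-file", os.path.abspath(input_path), "-c"]
    if kmer_size is not None:
        cmd += ["-kmer-size", str(kmer_size)]
    return cmd


def run_in_directory(cmd, workdir):
    """Run cmd with workdir as its current directory.

    Only the child process sees the changed directory; the caller's
    working directory stays as it was. Raises CalledProcessError on failure.
    """
    return subprocess.run(cmd, cwd=workdir, check=True)
```

For example, `run_in_directory(leon_compress_command("data/1.fasta", 2), "/path/to/scratch")` would make the temporary 1.h5 land in the scratch directory, assuming leon is on the PATH and still writes its temporary file to the current directory.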