max-moser closed this 4 months ago
Thanks for these issues, Max! I think this one may already be possible and just needs a documentation update.
Under the hood, sf just uses the Go standard library to create its temp files. These calls use the os.TempDir function, which lets end users set the directory through environment variables; see the docs here: https://pkg.go.dev/os#TempDir
Setting $TMPDIR works like a charm, thanks!
We're using siegfried to get an overview of the file format landscape regarding the files uploaded into our research data repository. In our setup, we're using a VM that has limited RAM and "primary" disk space, but sufficient amounts of space on a network-based storage where the uploaded files are stored.
In order to speed things up, we're running a few (~4 seems to work well enough) instances of siegfried in parallel. However, it seems like these runs generate temporary files that fill up the entire space of our primary disk for brief periods of time (we have a few archive files that single-handedly exceed the space of our root partition).
Since the large storage volume is network-based, I don't think it'd be a great idea to mount parts of it under /tmp. Instead, I think it would be nice to have a way of telling siegfried to use a different directory for its temporary files.