DataBiosphere / toil

A scalable, efficient, cross-platform (Linux/macOS) and easy-to-use workflow engine in pure Python.
http://toil.ucsc-cgl.org/.
Apache License 2.0

Toil clusters have `/tmp` in memory when it should be on disk #4148

Closed. adamnovak closed this issue 2 years ago

adamnovak commented 2 years ago

As noted in https://github.com/DataBiosphere/toil/issues/4147#issuecomment-1159085124, when you launch a Toil cluster and connect to it, your /tmp is a tmpfs in-memory filesystem.
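One way to confirm what /tmp is mounted as is to look it up in the kernel's mount table. The following is a minimal sketch, not anything Toil ships: `find_mount` and `is_tmpfs` are hypothetical helper names, and the parsing assumes the usual whitespace-separated `/proc/mounts` layout (device, mount point, filesystem type, ...) without handling escaped spaces in paths.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type) for the deepest mount containing path.

    mounts_text is text in /proc/mounts format. Returns None if no
    entry matches. (Hypothetical helper for illustration only.)
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # Skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        # Match the mount point itself or anything strictly beneath it
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on ties, the later entry wins
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def is_tmpfs(path, mounts_text):
    """Return True if path lives on a tmpfs (in-memory) filesystem."""
    mount = find_mount(path, mounts_text)
    return mount is not None and mount[1] == "tmpfs"
```

On a Linux node, passing the contents of `/proc/mounts` as `mounts_text` and calling `is_tmpfs("/tmp", mounts_text)` would report whether /tmp is memory-backed.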

It should be an on-disk filesystem instead. We might need to do something with the Flatcar base image to make this work, because we might also need to share it between the Toil appliance container and the host.

┆Issue is synchronized with this Jira Story ┆friendlyId: TOIL-1190

adamnovak commented 2 years ago

I did some more research, including reading https://systemd.io/TEMPORARY_DIRECTORIES/. It seems that Linux really wants large temporary files (gigabytes or more) to live in /var/tmp, while /tmp itself should only hold small temporary files that would fit in memory anyway.

Several distros (like Arch) are moving to make /tmp a tmpfs mount. If we are going to store files there that could be too big for tmpfs, we'd want /tmp on the instance's large ephemeral disks, if present, anyway. But because we format those disks with a Bash script, that would be a hard place to put /tmp: we'd be trying to move it out from under the rest of the system.

I think the real answer is to make Toil, on a Toil-managed cluster at least, more likely to keep its temporary files in /var/tmp.

Python's tempfile doesn't really let you say whether you want a big or a small temporary file or directory. And I think the user complaint here was about file copies in cwltool code, which might be fetching its own temp directories from tempfile in a way we couldn't really intercept.
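For context, `tempfile` does consult the `TMPDIR` environment variable when it first picks its default directory, so code that asks `tempfile` for a location without passing `dir=` follows whatever `TMPDIR` was set to before that first lookup. The sketch below illustrates this; `tempdir_for` is a hypothetical helper, and it makes no claim about how cwltool actually chooses its directories.

```python
import os
import tempfile


def tempdir_for(env_tmpdir):
    """Return the directory tempfile would choose if TMPDIR were env_tmpdir.

    Temporarily sets TMPDIR and clears tempfile's cached choice, then
    restores both. (Hypothetical helper for illustration only.)
    """
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = env_tmpdir
        # tempfile caches its first answer; reset it to force a fresh lookup
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        # Put the environment and the cache back the way we found them
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
        tempfile.tempdir = old_cached
```

Because the choice is cached in `tempfile.tempdir` after the first call, `TMPDIR` generally has to be set before the process (or any library in it) first asks for a temporary location. If the requested directory is not writable, `tempfile` falls back to other candidates.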

So I'm going to try setting a default TMPDIR of /var/tmp when on the cluster, and make sure to plumb it through when mounting into containers.
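A rough sketch of what that plan could look like follows. This is not Toil's actual implementation: `with_default_tmpdir` and `docker_tmpdir_args` are hypothetical names, the `on_cluster` flag stands in for however cluster membership is really detected, and the only container interface assumed is Docker's standard `-v` (bind mount) and `-e` (environment variable) options.

```python
def with_default_tmpdir(environ, on_cluster, default="/var/tmp"):
    """Return a copy of environ with TMPDIR defaulted when on a cluster.

    An existing, non-empty TMPDIR is left alone so users can override it.
    (Hypothetical helper for illustration only.)
    """
    env = dict(environ)
    if on_cluster and not env.get("TMPDIR"):
        env["TMPDIR"] = default
    return env


def docker_tmpdir_args(env):
    """Build docker-run arguments that carry TMPDIR into a container.

    Bind-mounts the host directory at the same path inside the container
    and sets TMPDIR there too. Returns [] if TMPDIR is unset.
    """
    tmpdir = env.get("TMPDIR")
    if not tmpdir:
        return []
    return ["-v", f"{tmpdir}:{tmpdir}", "-e", f"TMPDIR={tmpdir}"]
```

For example, `docker_tmpdir_args(with_default_tmpdir({}, on_cluster=True))` yields arguments that mount `/var/tmp` into the container and point `TMPDIR` at it, so temporary files written inside the container land on disk rather than in tmpfs.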