stjudecloud / workflows

Bioinformatics workflows developed for and used on the St. Jude Cloud project.
MIT License

investigate appropriate `disk_size`s #88

Open a-frantz opened 1 year ago

a-frantz commented 11 months ago

@adthrasher can you think of a decent way to test how much disk space each task uses? The difficult part is that many of our tasks clean up after themselves (a good thing), but that means we need to snapshot disk usage during runtime. That is a more complicated job than just running `du` after completion.
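One possible way to snapshot usage during runtime is to wrap the task's command in a small poller that records the peak size of the working directory while the command runs. The following is only a rough sketch, not something from this repo: `dir_size_bytes`, `peak_disk_usage`, and the `interval` parameter are hypothetical names, and it assumes the task writes everything under a single working directory. It sums apparent file sizes (`st_size`), so the result can differ from what `du` reports in allocated blocks, and a spike shorter than the polling interval can be missed.

```python
import os
import subprocess
import time


def dir_size_bytes(path):
    """Sum apparent sizes of all files under path, skipping files that vanish mid-scan."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The task deleted the file between listing and stat; ignore it.
                pass
    return total


def peak_disk_usage(cmd, workdir, interval=1.0):
    """Run cmd inside workdir and return (exit_code, peak_bytes) seen while polling."""
    proc = subprocess.Popen(cmd, cwd=workdir)
    peak = dir_size_bytes(workdir)
    while proc.poll() is None:
        peak = max(peak, dir_size_bytes(workdir))
        time.sleep(interval)
    # One last sample in case files were written just before exit.
    peak = max(peak, dir_size_bytes(workdir))
    return proc.returncode, peak
```

For example, `peak_disk_usage(["bash", "task_command.sh"], "/path/to/scratch", interval=5.0)` (both the script name and the path are placeholders) would return the exit code and the largest directory size observed, even if the task removes its intermediates before exiting.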

We could just run everything on the cloud with the current allocations and see what fails. That sounds like the most straightforward option, and it would be "real-world" in that the host OS will be using some of the disk, unlike on the cluster.

Does HPCF have any tools that would help here? I've always been under the impression that their disk space tools have a hefty lag, so maybe not.