Snakemake-Profiles / pbs-torque

MIT License
16 stars 22 forks source link

Move some properties into resources #2

Open johanneskoester opened 6 years ago

johanneskoester commented 6 years ago

Some of the properties (like wallclock time or memory) would rather belong into the resources directive.

Some are redundant and encourage people to put stuff in the wrong place. E.g., I would not consider threads that are set within params instead of via the threads directive. Otherwise such a workflow would be not portable to different profiles or simply plain local execution anymore.

The workdir handling is also very custom and not portable.

ghost commented 6 years ago

I've moved all the various scheduler specific properties into the resources directive and got rid of the workdir for the moment. The only issue here is that resources only allow integers. This forces the user to set the walltime in seconds rather than using a string in the HH:MM:SS format. The second, and more annoying, issue is the memory can now only be allocated in bytes. I would prefer to set the memory request as a string using '1mb' or '1gb' etc.

Is there a particular reason why strings are not valid in the resources directive?

johanneskoester commented 6 years ago

It is advisable to annotate resources with the unit, e.g. mem_mb (see docs). The reason to be restricted to integers is that resources can be used to additionally constrain the snakemake scheduler, e.g. on a compute server. This way, you can give a maximum amount of a memory, and the scheduler ensures that the sum of mem_mb of all jobs does never exceed it.

There is no established standard yet, but I would propose the following resources to be considered:

johanneskoester commented 6 years ago

Thinking about it, it might be indeed a good idea to additionally support strings, and parse them with the humanfriendly python module. This way, we could easily achieve what you have in mind for sizes and times.

johanneskoester commented 6 years ago

Then, we could allow things like mem="5GB", and walltime="5h". A profile like this one would simply assume that they are given as bytes and seconds, respecively. And Snakemake would internally try to convert a string to size or timespan using the humanfriendly module. If that fails, it would return an error.

fpbarthel commented 6 years ago

Hi, I came across this searching for a feature to set walltime in the resources directive (and to be able to modulate it depending on the attempt number), along with queu, mem and threads. Is this feature still planned?

N.B. is there a specific reason that threads has its own directive and is not part of resources?

johanneskoester commented 6 years ago

Yes, this is stll planned. I hope to be able to work on this soon. Threads being separate is mainly for historical reasons, but also because it is the only resource that cannot be learned from previous runs. In the future, snakemake will learn the other resources from runs with previous samples, such that they just can be annotated in order to provide a prior guess.

jdblischak commented 2 years ago

it might be indeed a good idea to additionally support strings

Quick update for any that finds this Issue. Snakemake supports strings as resources as of version 5.15.0 released on 2020-04-21 (changelog)