Closed aryarm closed 6 years ago
This article in the snakemake docs about resource allocation may be useful.
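For reference, that mechanism lets each rule declare how much memory it expects to need, so snakemake can throttle concurrency instead of oversubscribing the cluster. A minimal sketch, assuming hypothetical rule, file, and threshold names (this is not our actual Snakefile):

```python
# Hypothetical rule declaring its memory footprint to snakemake.
rule map_reads:
    input:
        "reads/{sample}.fq"
    output:
        "mapped/{sample}.bam"
    resources:
        mem_mb=32000  # assumed peak usage; measure with your own samples
    shell:
        "STAR --readFilesIn {input} > {output}"
```

Running with, e.g., `snakemake --jobs 8 --resources mem_mb=64000` would then let at most two such rules run at once, regardless of the job count.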
It seems like things are less on fire now, especially after submitting the `get_as_counts` pull request into WASP. I'm going to close the issue for now.
Our pipeline can sometimes use a lot of memory (especially when running the GTEx samples). This often results in jobs getting killed because we exhaust the memory on the cluster. Last I checked, this was happening when I used STAR to map reads (the second time). It also occurred when I was running WASP's `get_as_counts`. We can fix `get_as_counts` by implementing a lighter HDF5 version of it (see my fork of WASP), but I'm not sure what to do about STAR. My guess is that it has something to do with loading all of the VCF data at once, but you should generally look into how much memory each of the rules uses and whether there is anything you can do about it (besides just scaling down the number of jobs that snakemake will run at any given moment).