This is a very tricky and configuration-dependent issue, but I think it is related to the wrapping. I tried to find a config-based fix, but I have not succeeded.
What I have noticed is that when I make a `minute run` call with a cluster profile from an interactive node, like:

```
minute run --profile /path/to/cookiecutter
```

I get this output:
```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 50
Job stats:
job                                      count    min threads    max threads
------------------------------------  -------  -------------  -------------
barcodes                                    2              1              1
bowtie2                                    12              1              1
compute_effective_genome_size               1              1              1
```
And the matching jobs are all submitted with `1` as the cores config value. However, if I bypass the wrapper and make the old Snakemake call directly:
```
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 50
Job stats:
job                                      count    min threads    max threads
------------------------------------  -------  -------------  -------------
barcodes                                    2              1              1
bowtie2                                    12             19             19
compute_effective_genome_size               1              1              1
```
So all the configuration in the CookieCutter profile seems to work, but the number of cores assigned to each job is capped at the number of cores available to the parent process (the one doing the scheduling). In an interactive job I get all thread counts equal to 1, whereas if I submit the scheduling job to the queue, the maximum always matches what I requested from the scheduler. So I guess that when `--cores=all` is passed by the wrapper, Snakemake interprets `all` as the maximum available to the local scheduler process.
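If my reading is right, the resolution of `all` could be mimicked like this (a hypothetical sketch of the behaviour I suspect, not Snakemake's actual code):

```python
import os


def resolve_cores(cores_arg: str) -> int:
    """Hypothetical sketch: 'all' resolves against the CPU count of the
    machine running the scheduling process, which is 1 inside a 1-core
    interactive allocation -- hence every job getting 1 thread.
    """
    if cores_arg == "all":
        return os.cpu_count() or 1
    return int(cores_arg)
```

That would explain why the same profile behaves differently depending on where the `minute run` process itself is started.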
So maybe the `run` CLI command needs to be aware, somehow, that it is running in cluster mode. I also tried removing the `--cores=all` option from the CLI command, with no success.
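One possible direction, sketched here with made-up names and values (this is not the actual minute code), would be for the wrapper to substitute an explicit core count whenever a cluster profile is given, instead of forwarding `all`:

```python
from typing import List, Optional


def build_snakemake_args(profile: Optional[str], cluster_cores: int = 50) -> List[str]:
    """Hypothetical fix sketch: with a cluster profile, pass an explicit
    --cores value sized for the cluster so Snakemake does not resolve
    'all' against the (possibly 1-core) node running the scheduler.
    """
    args = ["snakemake"]
    if profile is not None:
        # Cluster mode: jobs run on their own nodes, so the core limit
        # should reflect the cluster, not the local machine.
        args += ["--profile", profile, "--cores", str(cluster_cores)]
    else:
        # Local run: resolving 'all' against the local machine is fine.
        args += ["--cores", "all"]
    return args
```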
A bit off-topic but related: I have found that a moderate number of cores in the local process (e.g. 6) gives reasonable results, which makes me wonder whether it would also be good to adjust things in two directions: a fatter scheduler that runs local tasks faster (parallelizing a bunch of them), and slimmer requirements for tasks like `bowtie2` or `scaled_bigwig`, which are sometimes difficult to schedule due to the large amount of resources they require. On top of that, snowy and rackham have different numbers of cores per node, which always leads to tricky config issues, so maybe capping thread counts at min(rackham, snowy) would simplify things without slowing processing down.
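The min(rackham, snowy) idea could look roughly like this (the per-node core counts here are my assumptions for illustration, not values taken from this issue):

```python
def clamp_threads(requested: int, rackham_cores: int = 20, snowy_cores: int = 16) -> int:
    """Clamp a rule's thread request to the smaller per-node core count
    of the two clusters, so a single profile schedules cleanly on both.
    (20 and 16 are assumed illustrative values for rackham and snowy.)
    """
    return min(requested, rackham_cores, snowy_cores)
```

With that, a rule asking for 19 threads would be trimmed to fit the smaller cluster, at the cost of slightly lower per-job parallelism on the larger one.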