Closed by BEFH 2 years ago
@mbhall88, I have an uncustomized branch, so I can initiate a new pull request once we've addressed your other concerns.

In terms of threads not being respected: if I didn't specify `threads:` in the rule definition, Snakemake put a default of 1 in the jobscript, and that was never overridden by the cookiecutter default value. In other words, if I set `default_threads` to 2 and didn't specify `threads:`, it would default to 1, not 2.

I still don't understand how the unit testing works, but if removing `default_threads` is the right call, then the only issue is the lack of test coverage for the project specification work.
> @mbhall88, I have an uncustomized branch, so I can initiate a new pull request once we've addressed your other concerns.

Whatever is easiest for you.

> In terms of threads not being respected: if I didn't specify `threads:` in the rule definition, Snakemake put a default of 1 in the jobscript, and that was never overridden by the cookiecutter default value. In other words, if I set `default_threads` to 2 and didn't specify `threads:`, it would default to 1, not 2.

I see. Ok, happy to remove it then.

> I still don't understand how the unit testing works, but if removing `default_threads` is the right call, then the only issue is the lack of test coverage for the project specification work.

The removal of `default_threads` is not wrong; you just need to mock the output of calling the `default_proj()` function, like in the line I permalinked above.
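For what it's worth, a minimal sketch of that kind of mock, assuming the profile exposes the default project through a static method on `CookieCutter` (the names here are illustrative, not necessarily the exact ones in the repo):

```python
from unittest.mock import patch

# Stand-in for the profile's CookieCutter class; in the real profile the
# return value is templated in by cookiecutter, so tests must mock it.
class CookieCutter:
    @staticmethod
    def get_default_project() -> str:
        return "{{cookiecutter.default_project}}"

def build_project_flag() -> str:
    """Build the -P flag from the cookiecutter default project."""
    return "-P {}".format(CookieCutter.get_default_project())

# In a test, patch the default so assertions don't depend on the template:
with patch.object(CookieCutter, "get_default_project", return_value="acc123"):
    assert build_project_flag() == "-P acc123"
```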
@mbhall88, thanks. I'll try to figure out the tests on Friday. Do I also need to write tests for the new project feature and for the change I made after this PR that doesn't duplicate the `-q` flag if queue is specified in `lsf.yaml`?
> @mbhall88, thanks. I'll try to figure out the tests on Friday. Do I also need to write tests for the new project feature and for the change I made after this PR that doesn't duplicate the `-q` flag if queue is specified in `lsf.yaml`?

Updating an existing test with project-related settings would be sufficient. So one test that uses the default, plus adding `-P` to one of the config files and making sure that overrides the default, would be enough.

Regarding the change to the `-q` flag, it would be great if you could add something for that also. It should be as simple as copying another test and adding a mock for a cookiecutter default and a config file with `-q` in it.

If you can't figure any of this out, let me know and I'll try to add some tests when I get time.
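A hedged sketch of what those two tests could assert, using an illustrative helper rather than the profile's actual function names:

```python
def build_queue_args(rule_settings: dict, default_queue: str) -> list:
    """Pick the queue from lsf.yaml if present, else the profile default,
    emitting -q at most once (illustrative stand-in for the real logic)."""
    queue = rule_settings.get("queue", default_queue)
    return ["-q", queue] if queue else []

# The default queue is used when lsf.yaml does not mention one:
assert build_queue_args({}, "research") == ["-q", "research"]
# A queue in lsf.yaml overrides the default, and -q appears only once:
assert build_queue_args({"queue": "bigmem"}, "research") == ["-q", "bigmem"]
```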
@BEFH this is a bit awkward now. While I wait for @leoisl to approve this PR, your recent commit to fix #47 has been pushed onto this PR.

@leoisl, once you are happy with this PR, could you remove that commit from your branch? We might have to close this PR and open a new one that isn't on @BEFH's fork.
First, sorry again for my delay on reviewing this. I am overloaded and needed some time off.
I agree with removing the default threads option. The reasoning is: `snakemake` defaults threads to 1 if the user does not specify the `threads` directive, and there is no way for us to know whether the user specified 1 thread or omitted the directive.
This is the submitted jobscript:

```sh
#!/bin/sh
# properties = {"type": "single", "rule": "concat_fastas", "local": false, "input": [], "output": ["output/all.fasta"], "wildcards": {}, "params": {}, "log": ["logs/concat_fastas.log"], "threads": 123, "resources": {"mem_mb": 1000, "disk_mb": 1000, "tmpdir": "/hps/nobackup/research/zi/leandro/temp"}, "jobid": 1, "cluster": {}}
cd /hps/nobackup/research/zi/leandro/snakemake_test && /hps/nobackup/research/zi/leandro/miniconda3/bin/python -m snakemake --snakefile '/hps/nobackup/research/zi/leandro/snakemake_test/Snakefile' 'output/all.fasta' --allowed-rules 'concat_fastas' --cores 'all' --attempt 1 --force-use-threads --wait-for-files '/hps/nobackup/research/zi/leandro/snakemake_test/.snakemake/tmp.8pvg9a13' --force --keep-target-files --keep-remote --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --skip-script-cleanup --conda-frontend 'mamba' --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --latency-wait 10 --scheduler 'greedy' --scheduler-solver-path '/hps/nobackup/research/zi/leandro/miniconda3/bin' --default-resources 'mem_mb=max(2*input.size_mb, 1000)' 'disk_mb=max(2*input.size_mb, 1000)' 'tmpdir=system_tmpdir' --mode 2 && exit 0 || exit 1
```
and we can get both values for threads with:

```python
self.job_properties.get("threads")    # 123
CookieCutter.get_default_threads()    # 1000
```
This is the case that works.
This is the submitted jobscript:

```sh
#!/bin/sh
# properties = {"type": "single", "rule": "concat_fastas", "local": false, "input": [], "output": ["output/all.fasta"], "wildcards": {}, "params": {}, "log": ["logs/concat_fastas.log"], "threads": 1, "resources": {"mem_mb": 1000, "disk_mb": 1000, "tmpdir": "/hps/nobackup/research/zi/leandro/temp"}, "jobid": 1, "cluster": {}}
cd /hps/nobackup/research/zi/leandro/snakemake_test && /hps/nobackup/research/zi/leandro/miniconda3/bin/python -m snakemake --snakefile '/hps/nobackup/research/zi/leandro/snakemake_test/Snakefile' 'output/all.fasta' --allowed-rules 'concat_fastas' --cores 'all' --attempt 1 --force-use-threads --wait-for-files '/hps/nobackup/research/zi/leandro/snakemake_test/.snakemake/tmp.3c84_mlu' --force --keep-target-files --keep-remote --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --skip-script-cleanup --conda-frontend 'mamba' --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --latency-wait 10 --scheduler 'greedy' --scheduler-solver-path '/hps/nobackup/research/zi/leandro/miniconda3/bin' --default-resources 'mem_mb=max(2*input.size_mb, 1000)' 'disk_mb=max(2*input.size_mb, 1000)' 'tmpdir=system_tmpdir' --mode 2 && exit 0 || exit 1
```
and we can get both values for threads with:

```python
self.job_properties.get("threads")    # 1
CookieCutter.get_default_threads()    # 1000
```
i.e. `snakemake` tells us that the rule must be run with 1 thread, but it does not tell us whether the user said this explicitly (through the `threads` directive) or implicitly (by omitting the directive). As such, we can't decide whether we should run with 1 thread (explicit usage) or with the LSF profile default threads (implicit usage).
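A minimal illustration of why the profile default can never apply, assuming the submit script reads `threads` from the jobscript's properties (names here are illustrative):

```python
def resolve_threads(job_properties: dict, profile_default: int) -> int:
    # Snakemake always writes "threads" into the jobscript properties,
    # defaulting to 1 when the rule omits the directive, so the key is
    # never missing and the fallback below can never trigger.
    return job_properties.get("threads", profile_default)

# Explicit `threads: 123` in the rule comes through as expected:
assert resolve_threads({"threads": 123}, profile_default=1000) == 123
# Rule omitted `threads`: snakemake already wrote 1, so the profile
# default of 1000 is unreachable:
assert resolve_threads({"threads": 1}, profile_default=1000) == 1
```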
There is actually a way to fix this, but I think it is cumbersome, and since in 99.9% of cases the default threads should be 1, I don't think we should go to the trouble. Looking at the jobscripts above, we can see the Snakefile that snakemake should run and the specific rule it should run. We could therefore parse the Snakefile and infer whether the `threads` directive was specified or not. However, this gets complicated with things like included Snakefiles, modules, etc. It would be a lot of work for very little benefit, as I think almost all users would use default threads = 1.
Merging #43 (9469fc0) into master (34c3c4c) will increase coverage by 0.44%. The diff coverage is 100.00%.
```
@@            Coverage Diff             @@
##           master      #43      +/-   ##
==========================================
+ Coverage   88.67%   89.12%   +0.44%
==========================================
  Files           6        6
  Lines         468      478      +10
  Branches       47       64      +17
==========================================
+ Hits          415      426      +11
+ Misses         46       45       -1
  Partials        7        7
```
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 89.12% <100.00%> (+0.44%) | :arrow_up: |
Flags with carried forward coverage won't be shown.
| Impacted Files | Coverage Δ | |
|---|---|---|
| {{cookiecutter.profile_name}}/CookieCutter.py | 76.00% <100.00%> (+4.00%) | :arrow_up: |
| {{cookiecutter.profile_name}}/lsf_submit.py | 89.15% <100.00%> (+0.69%) | :arrow_up: |
| {{cookiecutter.profile_name}}/memory_units.py | 100.00% <100.00%> (ø) | |
This pull request attempts to do two things: add a default project setting, and avoid adding the default `-P` flag when `-P` is in the `lsf.yaml`. The default project works perfectly, but the latest commit is failing the tests because I cannot figure out how to update them.

Linting and Blackening were successful. If you just want to merge the default project, that's fine, but `default_threads` is no longer respected.
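For reference, the intended default-project behaviour can be sketched like this (illustrative names, not the profile's actual functions):

```python
def add_default_project(cluster_args: str, default_project: str) -> str:
    """Append -P <default_project> unless lsf.yaml already supplies -P."""
    if "-P" in cluster_args.split():
        # lsf.yaml already carries a project flag; don't add another.
        return cluster_args
    return "{} -P {}".format(cluster_args, default_project).strip()

# lsf.yaml's own -P wins, and -P never appears twice:
assert add_default_project("-q research -P special", "myproj") == "-q research -P special"
# Otherwise the cookiecutter default project is appended:
assert add_default_project("-q research", "myproj") == "-q research -P myproj"
```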