Snakemake version
8.11.6
7.32.4
main (74627d3f07f549600dc8ad226a545f44191141e0)
Describe the bug
When a workflow imports files as modules in a diamond shape, the rules loaded in the intermediate files seem to interfere with each other. Consider a workflow built from four files: common.smk, which holds general rules; first.smk and second.smk, two intermediate files for two different processes/studies, each of which imports common.smk; and all.smk, a final file that collects the main results of the two previous ones and renames their rules to avoid clashes. If we run snakemake with all.smk as the Snakefile, the rules imported from common.smk in first.smk and second.smk seem to interfere, even though all.smk renames them.
With the syntax use rule * from common in the intermediate files, Snakemake appears to keep only the rules of common.smk from the last file that imports them, although the rules can still be accessed in each file through the rules object. With use rule * from common as *, on the other hand, the execution works fine. This looks more like a bug than a feature, or at least a design error.
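To make the suspected behaviour concrete, here is a toy Python model of "the last import wins". It is only an illustration of the hypothesis, not Snakemake's actual implementation: the helpers write_pattern, load_common, needed_input and can_produce are hypothetical, and the output pattern is an assumption inferred from the path data/first/input_value_1.txt in the error message.

```python
import os

def write_pattern(config):
    # Assumed output pattern of rule `write` (inferred from the error message).
    return os.path.join("data", config["analysis"], "input_value_{value}.txt")

def load_common(registry, config):
    # Hypothetical stand-in for `use rule * from common`:
    # the rule is stored under its bare name, `write`.
    registry["write"] = write_pattern(config)

def needed_input(config):
    # File that the `result` rule of a study asks `write` to produce.
    return write_pattern(config).format(value=config["value"])

def can_produce(pattern, path):
    # Crude wildcard match: compare the text before and after `{value}`.
    prefix, suffix = pattern.split("{value}")
    return path.startswith(prefix) and path.endswith(suffix)

first = {"analysis": "first", "value": 1}
second = {"analysis": "second", "value": 2}

# One registry shared by both studies: the later import overwrites the earlier one.
shared = {}
load_common(shared, first)
load_common(shared, second)
shared_first_ok = can_produce(shared["write"], needed_input(first))
shared_second_ok = can_produce(shared["write"], needed_input(second))

# One registry per study: each study keeps its own copy of `write`.
isolated = {"first": {}, "second": {}}
load_common(isolated["first"], first)
load_common(isolated["second"], second)
isolated_first_ok = can_produce(isolated["first"]["write"], needed_input(first))
isolated_second_ok = can_produce(isolated["second"]["write"], needed_input(second))

print(shared_first_ok, shared_second_ok, isolated_first_ok, isolated_second_ok)
```

In the shared case only the second study can resolve its input, which is the same symptom as the MissingInputException reported for first_result.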
Probably related to #1872, #2729, #2838
Minimal example
Define the general file containing a rule that creates a file, where part of the path depends on the provided configuration:
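The contents of common.smk are not shown here. A minimal sketch that is consistent with the rest of the example could look as follows. The rule name write and the output pattern are inferred from rules.write.output and from the path data/first/input_value_1.txt in the error message; the shell command is a placeholder assumption, not the original one.

```python
# common.smk (hypothetical reconstruction, not the original file)
import os

rule write:
    # Part of the path depends on the configuration passed to the module.
    output: os.path.join('data', config['analysis'], 'input_value_{value}.txt')
    # Placeholder command: any command that creates {output} would do.
    shell: 'echo {wildcards.value} > {output}'
```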
Then make two separate files corresponding to two different studies, which simply make an alias to the file created with the rule in common.smk:
first.smk
import os

module common:
    snakefile: './common.smk'
    config: {"analysis": "first", "value": 1}

#use rule * from common as * # <--- works
use rule * from common # <--- fails if using the file all.smk

rule result:
    input: expand(rules.write.output, value=1)
    output: os.path.join('data', 'first', 'result.txt')
    shell: 'ln -srf {input} {output}'
second.smk
import os

module common:
    snakefile: './common.smk'
    config: {"analysis": "second", "value": 2}

#use rule * from common as * # <--- works
use rule * from common # <--- fails if using the file all.smk

rule result:
    input: expand(rules.write.output, value=2)
    output: os.path.join('data', 'second', 'result.txt')
    shell: 'ln -srf {input} {output}'
Finally, declare the file that collects all the results:
all.smk
module first:
    snakefile: './first.smk'
    config: config

module second:
    snakefile: './second.smk'
    config: config

use rule * from first as first_*
use rule * from second as second_*

assert(rules.first_write is not rules.second_write) # they always exist and they are always different, as expected

# if using "use rule * from common" in "first.smk" and "second.smk", then the way
# to obtain the files for the first result cannot be resolved correctly
rule all:
    input: rules.first_result.output, rules.second_result.output
In this case we also propagate some configuration values to verify that they remain different when we run the rules from all.smk. The expectation is that the commands
snakemake -s first.smk result -j1
snakemake -s second.smk result -j1
provide the same output as
snakemake -s all.smk all -j1
but this is only true if we load the rules with use rule * from common as * instead of use rule * from common inside first.smk and second.smk. Otherwise you get the following error:
MissingInputException in rule first_result in file [MASKED]/first.smk, line 10:
Missing input files for rule first_result:
    output: data/first/result.txt
    affected files:
        data/first/input_value_1.txt
which suggests that somehow the rules that were imported from common.smk inside first.smk are not being considered.