snakemake / snakemake

This is the development home of the workflow management system Snakemake. For general information, see
https://snakemake.github.io

Using a parent directory of an output of one rule as the input to another rule #2563

Open itaysol opened 10 months ago

itaysol commented 10 months ago

Hi Snakemake team, I'd love your help with something. When I try to use the parent directory of one rule's output as the input to another rule, I get a MissingInputException. The directory does get created when the first rule runs, but it is not declared as that rule's output, because it is only the parent of the declared output directory. The two rules look like this:

rule assembly:
    conda:
        "env/conda-assembly.yaml"
    input:
        clean_fwd = os.path.join(output_dir, "fastq", "{id}", "{id}.clean_fwd.fastq.gz"),
        clean_rev = os.path.join(output_dir, "fastq", "{id}", "{id}.clean_rev.fastq.gz"),
    params:
        comparisonGroup = lambda wildcards: config["Samples"][wildcards.id[:7]]["comparisonGroup"]
    output:
        assembly_output = directory(output_dir+"/assembly/{comparisonGroup}/{id}_assembly")
    shell:
        """
        shovill --R1 {input.clean_fwd} --R2 {input.clean_rev} --outdir {output.assembly_output} --assembler skesa
        """

rule create_assembly_groups:
    input:
        assembly_comp_group = output_dir+"/assembly/comparisonGroup{comparisonGroup}"
    output:
        assembly_group_only_contigs = directory(output_dir+"/assemblyContigsOnly/comparisonGroup{comparisonGroup}")
    shell:
        """
        python scripts/contigsOnly.py {input.assembly_comp_group} {output.assembly_group_only_contigs}
        """

Is there an elegant way to fix this?
Thanks in advance, Itay.
Hocnonsense commented 10 months ago

In your example, comparisonGroup seems to be determined by id, so you can depend on the specific assembly outputs that belong to a given comparisonGroup (not on their parent dir).

However, I'm surprised that the id used for assembly is not the id in config["Samples"]. So I assume that you have a list of read ids named real_sample_id_list:

rule assembly:
    conda:
        "env/conda-assembly.yaml",
    input:
        clean_fwd = os.path.join(output_dir, "fastq", "{id}", "{id}.clean_fwd.fastq.gz"),
        clean_rev = os.path.join(output_dir, "fastq", "{id}", "{id}.clean_rev.fastq.gz"),
    output:
        assembly_output = directory(output_dir+"/assembly/comparisonGroup{comparisonGroup}/{id}_assembly"),
    shell:
        """
        shovill --R1 {input.clean_fwd} --R2 {input.clean_rev} --outdir {output.assembly_output} --assembler skesa
        """

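rule create_assembly_groups:
    input:
        # Request every assembly directory in this comparison group, so that
        # Snakemake runs the matching assembly jobs before this rule.
        assemblies = lambda wildcards: [
            output_dir + "/assembly/comparisonGroup" + wildcards.comparisonGroup + "/" + i + "_assembly"
            for i in real_sample_id_list
            if str(config["Samples"][i[:7]]["comparisonGroup"]) == wildcards.comparisonGroup
        ],
    output:
        assembly_group_only_contigs = directory(output_dir+"/assemblyContigsOnly/comparisonGroup{comparisonGroup}"),
    params:
        # The parent directory is passed as a plain parameter, not as an input.
        assembly_comp_group = output_dir+"/assembly/comparisonGroup{comparisonGroup}",
    shell:
        """
        python scripts/contigsOnly.py {params.assembly_comp_group} {output.assembly_group_only_contigs}
        """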
github-actions[bot] commented 4 months ago

This issue was marked as stale because it has been open for 6 months with no activity.