snakemake-workflows / rna-seq-star-deseq2

RNA-seq workflow using STAR and DESeq2
MIT License

trim.smk file name #42

Closed neavemj closed 1 year ago

neavemj commented 3 years ago

Good afternoon, thanks for the workflow!

I'm hitting an issue when running the pipeline with trimming enabled. I get the following error:

Missing input files for rule align:
results/trimmed/21_02764_01_S1_lane1_R2.fastq.gz
results/trimmed/21_02764_01_S1_lane1_R1.fastq.gz

After reading through the snake files, I think it is because the output from trim.smk has a slightly different name than expected by align.smk. For example:

trim.smk, line 29: fastq1="results/trimmed/{sample}-{unit}_R1.fastq.gz",

common.smk, line 98: "results/trimmed/{sample}_{unit}_{group}.fastq.gz",

Note that the first pattern separates {sample} and {unit} with a hyphen, while the second separates them with an underscore.

Could this be causing the 'missing input file' error?
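A minimal sketch of the mismatch, using plain Python string formatting rather than Snakemake itself. The two patterns are the ones quoted above; the wildcard values are an assumption, guessed from the paths in the error message (the actual sample/unit split is not stated in this thread).

```python
# Patterns as quoted in this issue.
TRIM_OUTPUT = "results/trimmed/{sample}-{unit}_R1.fastq.gz"        # trim.smk, line 29
ALIGN_INPUT = "results/trimmed/{sample}_{unit}_{group}.fastq.gz"  # common.smk, line 98

# Assumed wildcard values, inferred from the error message; not confirmed.
wildcards = {"sample": "21_02764_01_S1", "unit": "lane1", "group": "R1"}

# str.format ignores unused keyword arguments, so the extra "group" key is harmless.
produced = TRIM_OUTPUT.format(**wildcards)   # file the trimming rule would write
requested = ALIGN_INPUT.format(**wildcards)  # file the align rule asks for

print("produced: ", produced)
print("requested:", requested)
print("match:    ", produced == requested)
```

Under these assumed values the produced path contains `S1-lane1` while the requested path contains `S1_lane1`, so no rule output matches the file that align asks for, which is consistent with the "Missing input files" error.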

Thanks again!

Matt.

Rubbert commented 2 years ago

I get the same error when I enable trimming; everything runs fine when it is disabled. At the moment, line 98 of common.smk reads: "results/trimmed/{sample}_{unit}_{group}.fastq.gz",

which I changed to: "results/trimmed/{sample}-{unit}_{group}.fastq.gz",

The first underscore becomes a hyphen/dash, which matches what is in trim.smk. A very similar issue to what Matt/neavemj described. Thanks Matt!

dlaehnemann commented 2 years ago

Thanks for so clearly identifying this problem, @neavemj. Would you like to suggest the respective needed change in a pull request, so we can document this as your contribution to the code base? Or would you prefer that I quickly implement the fix to the code base with an acknowledgement to this issue right here?

My suggestion would be to switch to all underscores in the file names. If this is always the default, this should help avoid further similar mixups...
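One way to picture the "all underscores by default" idea is to build both the trimming outputs and the alignment inputs from a single pattern, so the two cannot drift apart. This is only a sketch in plain Python, not the workflow's actual code: `TRIMMED_PATTERN`, `trimmed_path`, `trim_outputs`, `align_inputs`, the `fastq2` key, and the sample/unit values are all hypothetical names introduced for illustration.

```python
# Hypothetical single source of truth for trimmed read paths (underscores only).
TRIMMED_PATTERN = "results/trimmed/{sample}_{unit}_{group}.fastq.gz"


def trimmed_path(sample: str, unit: str, group: str) -> str:
    """Fill the shared pattern for one sample, unit and read group."""
    return TRIMMED_PATTERN.format(sample=sample, unit=unit, group=group)


def trim_outputs(sample: str, unit: str) -> dict:
    """Paths the (hypothetical) trimming step would write."""
    return {
        "fastq1": trimmed_path(sample, unit, "R1"),
        "fastq2": trimmed_path(sample, unit, "R2"),
    }


def align_inputs(sample: str, unit: str) -> list:
    """Paths the (hypothetical) alignment step would request."""
    return [trimmed_path(sample, unit, group) for group in ("R1", "R2")]


# Any requested path that is not produced would show up in this set.
produced = set(trim_outputs("sampleA", "lane1").values())
requested = set(align_inputs("sampleA", "lane1"))
missing = requested - produced
print(missing)
```

Because both sides call the same helper, `missing` is empty here regardless of which separator the pattern uses.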

neavemj commented 2 years ago

No problem David!

Happy for you to quickly implement that change.

Best,

Matt.

dlaehnemann commented 1 year ago

Finally closed by #61.