zhxiaokang / RASflow

RNA-Seq analysis workflow
MIT License
105 stars, 58 forks

Wildcards for fastq files #13

Closed JFK24 closed 4 years ago

JFK24 commented 4 years ago

File paths built with wildcards do not resolve "sample1.fastq.gz" to a single file if "sample10.fastq.gz" is in the same directory; instead, the glob returns a list containing both files.
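To make the over-matching concrete, here is a minimal sketch using Python's fnmatch module, which follows the same basic shell wildcard rules (a * matches any run of characters). The file names are the ones from the report; the directory listing and the variable names are assumptions for illustration, not RASflow code.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, using the file names from the report.
files = ["sample1.fastq.gz", "sample10.fastq.gz"]

sample = "sample1"
# Same shape as the glob in the rule: {wildcards.sample}*.f*q.gz
pattern = f"{sample}*.f*q.gz"

# Keep every name that the wildcard pattern accepts.
matches = [name for name in files if fnmatchcase(name, pattern)]
print(matches)  # both names are listed: the first "*" also absorbs the "0" in "sample10"
```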

My way to fix it:

In align_count_genome.rules and quantify_trans.rules, change

    shell("scp -i {params.key} {params.input_path}/{wildcards.sample}*.f*q.gz {output.read}")

to

    shell("scp -i {params.key} {params.input_path}/{wildcards.sample}.f*q.gz {output.read}")

zhxiaokang commented 4 years ago

Well spotted! It's a good point that I didn't think of. Thank you!

But your solution misses file names of the form {wildcards.sample}.something_else.f*q.gz, which is the reason I put a * after {wildcards.sample}. Now I see the issue that the * causes, though.
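As a quick illustration of that gap, the sketch below checks the stricter pattern against a file name with an extra dotted part. It again uses Python's fnmatch as a stand-in for the shell glob, and the literal name "something_else" is just the placeholder from the comment above, not a real RASflow naming scheme.

```python
from fnmatch import fnmatchcase

# Hypothetical file names; "something_else" stands for any extra name part.
files = [
    "sample1.fastq.gz",
    "sample1.something_else.fastq.gz",
    "sample10.fastq.gz",
]

sample = "sample1"
# Same shape as the proposed fix: {wildcards.sample}.f*q.gz
strict = f"{sample}.f*q.gz"

# Keep every name that the stricter pattern accepts.
strict_matches = [name for name in files if fnmatchcase(name, strict)]
print(strict_matches)  # sample10 is excluded, but so is the "something_else" file
```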

I don't have a better solution so far, so I've posted the question on StackExchange.

Let's see whether we can get a better solution.

zhxiaokang commented 4 years ago

Got some kind suggestions from StackExchange. I found shopt -s extglob very simple and useful. Could you try it out to see if it works for you?

rule getReads:
    output:
        read = temp(intermediate_path + "/reads/{sample}.fastq.gz")
    params:
        key = key,
        input_path = input_path
    shell:
        """
        shopt -s extglob
        scp -i {params.key} {params.input_path}/{wildcards.sample}?(.*).f*q.gz {output.read}
        """
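
With extglob enabled, ?(pattern) matches zero or one occurrence of pattern, so ?(.*) allows an optional extra part that must start with a dot. As a rough check of that logic outside the shell, the sketch below translates the glob by hand into a Python regular expression. The translation, the helper name matches_sample, and the file names are assumptions for illustration; the workflow itself still relies on bash to expand the pattern.

```python
import re

def matches_sample(name: str, sample: str) -> bool:
    """Hand-written regex approximation of the glob {sample}?(.*).f*q.gz."""
    pattern = (
        re.escape(sample)   # the literal sample name
        + r"(?:\..*)?"      # ?(.*) : optional part beginning with a dot
        + r"\.f.*q\.gz"     # .f*q.gz : e.g. .fastq.gz or .fq.gz
    )
    return re.fullmatch(pattern, name) is not None

# Hypothetical file names covering the three cases discussed in the thread.
files = [
    "sample1.fastq.gz",
    "sample1.something_else.fastq.gz",
    "sample10.fastq.gz",
]

selected = [name for name in files if matches_sample(name, "sample1")]
print(selected)  # the two sample1 files are kept; sample10 is rejected
```

The character after the sample name must be a dot in every accepted name, which is what keeps sample10 from matching sample1.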
JFK24 commented 4 years ago

It works for my case, thanks!

On Mon, 6 Jul 2020 at 12:00, xkzhang wrote:

Got some kind suggestions from StackExchange https://unix.stackexchange.com/questions/596551/how-to-use-shell-glob-as-what-does-in-regular-expression. I found shopt -s extglob very simple and useful. Could you try it out to see if it works for you?


zhxiaokang commented 4 years ago

Fixed with commit