Closed by FionaMoon 1 year ago
Hello,

Thanks for your interest. For paired-end eCLIP data on the ENCODE Project website, the informative read is read 2, so please enter 2.

Also note that all such data has been processed with Skipper, and the called-site output is available on the corresponding FigShare page: Skipper RNA-protein interaction profiles (figshare.com).

Best,
Evan

On Jul 2, 2023, at 10:46 PM, LY wrote:
> Hello, skipper team! I am trying to use Skipper for GSE177848, which is a paired-end eCLIP dataset. The annotation of INFORMATIVE_READ in Skipper_config.py shows:
> Single-end: enter 1. Paired-end: enter read (1 or 2) corresponding to crosslink site
> I don't understand which one to choose (1 or 2) and why. Can you explain this for me? Thank you so much!
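Following the reply above, a paired-end ENCODE eCLIP run would set the informative read to 2. A minimal sketch of that one line, assuming Skipper_config.py uses plain Python assignments (the rest of the file is not shown in this thread, so the layout here is an assumption):

```python
# Skipper_config.py (fragment; layout assumed, only INFORMATIVE_READ is named in this thread)
# Single-end: enter 1. Paired-end: enter read (1 or 2) corresponding to crosslink site
INFORMATIVE_READ = 2  # per the reply above: read 2 for paired-end eCLIP data from ENCODE
```

The reply only covers ENCODE paired-end eCLIP data. For libraries prepared with a different protocol, the read that corresponds to the crosslink site may differ.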
Is this the figshare that should be linked?
Thank you for your answer!
FYI, I recently hit an issue with incomplete fastqs, and I think it stems from this rule, which runs two processes. Since one of them runs in the background, I'm not sure what happens when the second process finishes before the first, but my guess is that the exit code ends up being 0 when it shouldn't be. Evan, do you know what the behavior is?
rule copy_with_umi:
    input:
        fq_1 = lambda wildcards: replicate_label_to_fastq_1[wildcards.replicate_label],
        fq_2 = lambda wildcards: replicate_label_to_fastq_2[wildcards.replicate_label],
    output:
        fq_1 = temp("output/fastqs/copy/{replicate_label}-1.fastq.gz"), #SORT OUT!!
        fq_2 = temp("output/fastqs/copy/{replicate_label}-2.fastq.gz"), #SORT OUT!!
    threads: 2
    params:
        run_time = "6:00:00",
        error_file = "stderr/{replicate_label}.copy_with_umi.err",
        out_file = "stdout/{replicate_label}.copy_with_umi.out",
        job_name = "copy_with_umi"
    benchmark: "benchmarks/umi/unassigned_experiment.{replicate_label}.copy_with_umi.txt"
    shell:
        "zcat {input.fq_1} | awk 'NR % 4 != 1 {{print}} NR % 4 == 1 {{split($1,header,\":\"); print $1 \":\" substr(header[1],2,length(header[1]) - 1) }}' | gzip > {output.fq_1} &"
        "zcat {input.fq_2} | awk 'NR % 4 != 1 {{print}} NR % 4 == 1 {{split($1,header,\":\"); print $1 \":\" substr(header[1],2,length(header[1]) - 1) }}' | gzip > {output.fq_2};"
That appears to be code that I wrote specifically for processing ENCODE 3 data downloaded from the ENCODE portal. It's not part of Skipper or meant to be run for anything else but you can certainly adapt it if it's useful.
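For anyone adapting the rule, the exit-code question raised above can be checked outside Snakemake. The sketch below is not part of Skipper: it uses `false` and `true` as hypothetical stand-ins for the two `zcat | awk | gzip` pipelines and assumes `bash` is on the PATH. It prepends `set -euo pipefail` to mimic the strict mode Snakemake applies to shell commands by default.

```python
import subprocess

# Mimic the bash strict mode that Snakemake uses for shell commands by default.
STRICT = "set -euo pipefail; "


def exit_code(script: str) -> int:
    """Run a bash snippet and return the exit code of the shell itself."""
    return subprocess.run(["bash", "-c", STRICT + script]).returncode


# Same shape as the rule: first command in the background, second in the foreground.
# The shell never collects the background job's status, so its failure is invisible.
unchecked = exit_code("false & true;")

# Variant: remember the background PID and wait on it explicitly.
# `wait <pid>` returns that job's exit status, so the failure reaches the caller.
checked = exit_code("false & pid=$!; true; wait $pid")

print(unchecked, checked)
```

On a typical Linux system this should print `0 1`. With the shape used in the rule, the shell's exit code reflects only the foreground pipeline, and the shell can exit while the background pipeline is still writing its output, which would be consistent with the incomplete fastqs described above. Note that a bare `wait` with no PID returns 0 regardless of how the background jobs ended, so waiting on the specific PID is what surfaces the failure.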
That step reformats the UMI encoded in the fastq headers, so whether it runs successfully depends on whether the header has the right delimiter.
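To make the delimiter dependence concrete, here is a small Python re-implementation of what the awk expression in the rule does to a header line. The function name `append_umi` and the example headers are hypothetical, chosen only for illustration; they are not taken from Skipper or from any real fastq.

```python
def append_umi(header: str, delimiter: str = ":") -> str:
    """Mimic the awk header rewrite: append the leading token of the read name to its end."""
    name = header.split()[0]          # awk's $1: keep the read name, drop the comment field
    first = name.split(delimiter)[0]  # awk's header[1]: text before the first delimiter
    return name + ":" + first[1:]     # drop the leading '@' and append after a colon


# Header with the UMI before the first colon: the UMI is copied to the end.
print(append_umi("@ACGTACGTAC:K00180:1:1101:1234 1:N:0:1"))

# Header without a colon: the whole read name is appended instead of a UMI.
print(append_umi("@read7_ACGT 1:N:0:1"))
```

In the second case nothing fails outright: the rewrite still produces a well-formed header, just not one that ends in a UMI. So with a different header layout the awk step would likely finish without an error while writing read names that downstream UMI handling cannot use.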