Closed nsheff closed 5 years ago
this command:
> `(fastp --overrepresentation_analysis --thread 8 \
--in1 /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1.fastq --adapter_sequence TGGAATTCTCGGGTGCCAAGG \
--length_required 18 --html /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.html \
--json /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.json --report_title 'GSM3618128' \
-o /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1_noadap.fastq )\
2> /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.txt | seqtk trimfq -b 0 -L 30 - | seqtk seq -r - \
> /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1_processed.fastq` (13181,13183,13190)
<pre>
</pre>
Command completed. Elapsed time: 0:01:48. Running peak memory: 0.019GB.
PID: 13181; Command: fastp; Return code: 0; Memory used: 0.019GB
PID: 13190; Command: seqtk; Return code: 0; Memory used: 0.019GB
PID: 13183; Command: seqtk; Return code: 0; Memory used: 0.019GB
appears to be creating an empty file, GSM3618128_R1_processed.fastq.
This large fastp command directs its stdout through seqtk into the GSM3618128_R1_processed.fastq file, but I can confirm that fastp does not actually write anything to stdout...
Instead, it appears to be using the `-o` flag to write the "read1 output file" to the R1_noadap.fastq file:
(fastp --overrepresentation_analysis --thread 8 --in1 /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1.fastq --adapter_sequence TGGAATTCTCGGGTGCCAAGG --length_required 18 --html /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.html --json /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.json --report_title 'GSM3618128' -o /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1_noadap.fastq )
Read1 before filtering:
total reads: 47446064
total bases: 2080338564
Q20 bases: 2025600908(97.3688%)
Q30 bases: 2008391701(96.5416%)

Read1 after filtering:
total reads: 21982678
total bases: 930521629
Q20 bases: 907046935(97.4773%)
Q30 bases: 898275329(96.5346%)

Filtering result:
reads passed filter: 21982678
reads failed due to low quality: 243070
reads failed due to too many N: 178
reads failed due to too short: 25220138
reads with adapter trimmed: 29836086
bases trimmed due to adapters: 945959657

Duplication rate (may be overestimated since this is SE data): 11.8692%

JSON report: /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.json
HTML report: /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.html

fastp --overrepresentation_analysis --thread 8 --in1 /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1.fastq --adapter_sequence TGGAATTCTCGGGTGCCAAGG --length_required 18 --html /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.html --json /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastqc/GSM3618128_R1_rmAdapter.json --report_title GSM3618128 -o /ext/yeti/processed/ppqc_test/results_pipeline/GSM3618128/fastq/GSM3618128_R1_noadap.fastq
fastp v0.20.0, time used: 95 seconds
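To make the failure mode concrete, here is a minimal sketch that reproduces the same shape of command without fastp or seqtk. `fake_trimmer` is a hypothetical stand-in (not part of any real tool) that, like the fastp call above, writes its reads to the `-o` path, logs to stderr, and prints nothing on stdout; `cat` stands in for the seqtk steps.

```shell
# Hypothetical stand-in for fastp: writes reads to the -o file, logs to
# stderr, and prints nothing on stdout.
fake_trimmer() {
    # $1 is "-o", $2 is the output path
    printf '@read1\nACGT\n+\nIIII\n' > "$2"
    echo "report: 1 read written" >&2
}

workdir=$(mktemp -d)

# Same shape as the failing command: subshell, stderr redirected to a log,
# stdout piped into a downstream step whose output is redirected to a file.
( fake_trimmer -o "$workdir/noadap.fastq" ) 2> "$workdir/log.txt" \
    | cat > "$workdir/processed.fastq"

# Compare sizes: the -o file has content, the redirect target does not.
wc -c < "$workdir/noadap.fastq"
wc -c < "$workdir/processed.fastq"
```

Every step exits with return code 0, as in the log above, because an empty stdin is perfectly valid input for the downstream command.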
I notice that on some lines `processed_fastq` is the target of a redirect, while other times it's a target of `-o`.
Also, it only shows up as a variable for single-end: https://github.com/databio/peppro/blob/644ca073809813e1474793601a9b2df3acd4f74c/pipelines/peppro.py#L167-L194
Could the error be that it's supposed to be a target of `-o`, which is what actually writes the output in these commands?
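One possible rearrangement, sketched below under assumptions, is to let fastp finish writing its `-o` file and then have seqtk read that file directly instead of an empty stdin. The shortened paths and the `status` variable are hypothetical and only for illustration; the fastp and seqtk options are the ones already used in the command above (report options omitted for brevity). This is not necessarily how the pipeline should be fixed.

```shell
# Hypothetical paths for illustration; substitute the pipeline's real locations.
in_fastq="GSM3618128_R1.fastq"
noadap_fastq="GSM3618128_R1_noadap.fastq"
processed_fastq="GSM3618128_R1_processed.fastq"

if command -v fastp >/dev/null 2>&1 && command -v seqtk >/dev/null 2>&1 \
        && [ -f "$in_fastq" ]; then
    # Step 1: fastp writes the adapter-trimmed reads to the -o file.
    fastp --in1 "$in_fastq" --adapter_sequence TGGAATTCTCGGGTGCCAAGG \
        --length_required 18 -o "$noadap_fastq"
    # Step 2: seqtk reads that file, so the redirect target receives reads.
    seqtk trimfq -b 0 -L 30 "$noadap_fastq" | seqtk seq -r - > "$processed_fastq"
    status="ran"
else
    echo "fastp, seqtk, or the input FASTQ is missing; skipping" >&2
    status="skipped"
fi
```

fastp also documents a `--stdout` option for streaming passing reads to stdout, which would keep the single-pipe form; whether the installed version supports it is worth checking with `fastp --help` first.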