Closed demis001 closed 4 years ago
I'll take a look to see what is going on.
This is the version I am using
> ./trim_galore --version
Quality-/Adapter-/RRBS-/Speciality-Trimming
[powered by Cutadapt]
version 0.6.4_dev
Last update: 24 09 2019
Hmm, the same command
trim_galore -j 4 --paired --retain_unpaired --2colour 20 --clip_r1 6 --clip_r2 6 --three_prime_clip_r1 10 --three_prime_clip_r2 10 --output_dir trimmed_fastq smallRNA_100K_R1.fastq.gz smallRNA_100K_R2.fastq.gz
with some local test files results in the following output folder:
$ ls trimmed_fastq
smallRNA_100K_R1.fastq.gz_trimming_report.txt
smallRNA_100K_R1_unpaired_1.fq.gz
smallRNA_100K_R1_val_1.fq.gz
smallRNA_100K_R2.fastq.gz_trimming_report.txt
smallRNA_100K_R2_unpaired_2.fq.gz
smallRNA_100K_R2_val_2.fq.gz
So it all seems to work well over here. I am not really sure why this is happening at your end. My version is: 0.6.5
I ran over 100 samples, and this happens for 20 of them; I get the correct output for the other 80. I noticed low alignment efficiency, checked back, and found that this is what had happened. I will try the older versions.
Maybe it has to do with the parallel processing and file synchronization issues? I would probably first try the same command without the -j 4 (and upgrade to the latest version).
I suspected that too, and I am running without it for one of the failed samples right now.
I also merged the data from two lanes before doing that. Do you think that could create a problem?
cat XXX_6008191213A6/TG6_L00[12]_R1_001.fastq.gz XXXX191220B6/TG6_L00[12]_R1_001.fastq.gz > XXX_6008_merged/TG6_L001_R1.fq.gz
I am not sure if this could have to do with it. I remember from some time ago (specifically for FastQC processing of merged FastQ files) that under certain circumstances
cat *_L00[12]_R1_001.fastq.gz > merged.fastq.gz
was not equivalent to:
zcat *_L00[12]_R1_001.fastq.gz | gzip -c - > merged.fastq.gz
It had something to do with (invisible) headers in the files (this might be googlable).
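For reference, plainly concatenated gzip files form a valid multi-member gzip stream, but some readers may stop after the first member. A minimal sketch of the decompress-and-recompress merge as a reusable function follows. The function name `merge_fastq_gz` is hypothetical, and `gzip -dc` is used in place of `zcat` only for portability:

```shell
# Hypothetical helper: merge gzipped FASTQ files into one single-member gzip file.
# Usage: merge_fastq_gz OUTPUT.fastq.gz INPUT1.fastq.gz [INPUT2.fastq.gz ...]
merge_fastq_gz() {
    local out="$1"
    shift
    # Decompress every input in the order given, then recompress as one stream.
    gzip -dc "$@" | gzip -c > "$out"
}
```

For example, `merge_fastq_gz merged.fastq.gz *_L00[12]_R1_001.fastq.gz` would produce the same reads as the `zcat` pipeline above. Whatever lanes are merged for R1 need to be merged in the same order for R2, otherwise the pairs no longer line up.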
I see this in one of the log files. What does it mean, and how can I fix it?
Read 2 output is truncated at sequence count: 57529710, please check your paired-end input files! Terminating...
Ah, there we go: some files did not have the same number of sequences. Any chance it has to do with the merging somehow?
What happened was that the sequencing core sent me something like R1 R2 R2, and then repeated R1 R2 R1 R2 in the second lane. I didn't want to throw any of it away.
In lane 1, I have R1. In lane 2, I have R1 and R2.
I then merged R1 from lane 1 and lane 2, but used R2 from lane 2 only, so R1 is bigger than R2.
Thanks, found the problem! It was caused by an unequal number of R1 and R2 reads.
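A mismatch like this can be caught before trimming by comparing read counts of the two files. The following is a minimal sketch, assuming standard 4-line FASTQ records (no wrapped sequence lines); the function names `count_reads` and `check_pair` are hypothetical:

```shell
# Hypothetical helper: count reads in a gzipped FASTQ file (assumes 4 lines per record).
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l | tr -d ' ')
    echo $(( lines / 4 ))
}

# Hypothetical helper: succeed only if R1 and R2 hold the same number of reads.
check_pair() {
    local n1 n2
    n1=$(count_reads "$1")
    n2=$(count_reads "$2")
    if [ "$n1" -ne "$n2" ]; then
        echo "Read count mismatch: $1 has $n1 reads, $2 has $n2 reads" >&2
        return 1
    fi
    echo "OK: $n1 read pairs"
}
```

Something like `check_pair TG6_L001_R1.fq.gz TG6_L001_R2.fq.gz && trim_galore --paired ...` would then skip trimming for any sample whose merged files are out of sync. Equal counts do not prove the reads are in the same order, so this only catches the simplest kind of mismatch.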
@demis001
OK good, I am hopeful that you can get that sorted. Closing this issue if that's OK
@FelixKrueger
I keep getting intermediate files left over at the end. Any idea why?
trim_galore -j 4 --paired --retain_unpaired --2colour 20 --clip_r1 6 --clip_r2 6 --three_prime_clip_r1 10 --three_prime_clip_r2 10 --output_dir trimmed_fastq TG6_L001_R1.fq.gz TG6_L001_R2.fq.gz
Output:
TG6_L001_R1_trimmed.fq.gz
TG6_L001_R1_unpaired_1.fq.gz
TG6_L001_R1_val_1.fq.gz
TG6_L001_R2_trimmed.fq.gz
TG6_L001_R2_unpaired_2.fq.gz
TG6_L001_R2_val_2.fq.gz
This happened for many samples. I checked: there were no errors, and the runs completed without problems. I am using the current cutadapt and trim_galore.
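To see which samples are affected, the output directory can be scanned for leftover files. This is a minimal sketch that relies only on the `_trimmed.fq.gz` naming visible in the listing above; the function name `find_intermediates` is hypothetical, and the check does not explain why the files remain:

```shell
# Hypothetical helper: list leftover *_trimmed.fq.gz files in an output directory.
# Prints one path per line; returns 0 if none are found, 1 otherwise.
find_intermediates() {
    local dir="${1:-trimmed_fastq}"
    local found=0
    local f
    for f in "$dir"/*_trimmed.fq.gz; do
        # Skip the literal pattern when the glob matches nothing.
        [ -e "$f" ] || continue
        echo "$f"
        found=1
    done
    [ "$found" -eq 0 ]
}
```

Running `find_intermediates trimmed_fastq` after a batch gives a list of samples worth rerunning, for instance without `-j 4` as suggested earlier in the thread.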
@demis001