Closed: billytaj closed this issue 5 years ago
Hi,
That sounds odd. Can you try to copy/paste the exact command you used to run AdapterRemoval?
While I would not expect the output to be an empty file, running AdapterRemoval on a file without adapter sequences will still cause changes. The algorithm used by AdapterRemoval does not have perfect specificity, so some false positives are to be expected. That said, I am only seeing a few potential false positives when I run AdapterRemoval on those files.
Best regards, Mikkel
On Tue, Dec 18, 2018 at 6:22 PM Billy Taj notifications@github.com wrote:
Hi, I'm trying to use your tool on a dataset that has no adapters. However, the output of the program is a completely blank Fastq. Shouldn't it leave my file alone, if there are no adapters?
The data I used is from here: http://huttenhower.sph.harvard.edu/humann2 Their synthetic human gut rna sample. I am using factory default settings, and fastqc tells me this sample has no adapters.
>&2 echo Removing adapters | /pipeline_tools/adapterremoval/AdapterRemoval --file1 /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/0_sorted_raw_input/pair_1_sorted.fastq --file2 /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/0_sorted_raw_input/pair_2_sorted.fastq --qualitybase 33 --threads 80 --minlength 30 --basename /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/1_adapter_removal_AdapterRemoval --trimqualities --output1 /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/1_adapter_removal/pair_1_adptr_rem.fastq --output2 /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/1_adapter_removal/pair_2_adptr_rem.fastq --singleton /scratch/j/jparkin/billyc59/Humann2_benchmark_run/rna_synth/quality_filter/data/1_adapter_removal/singletons_adptr_rem.fastq
That looks fine, though I don't think you'll gain much from using 80 threads.
You said that the output "is a completely blank Fastq". Does that apply to all of the resulting FASTQ files? That is to say, are pair_1_adptr_rem.fastq, pair_2_adptr_rem.fastq, and singletons_adptr_rem.fastq all empty?
Threads: What are the parallelism limits of this program? I am running it on an 80-thread machine.
outputs: yes, pair_1_adptr_rem.fastq, pair_2_adptr_rem.fastq and singletons_adptr_rem.fastq are completely blank.
Are they blank for you too?
There are no hard-coded limits, but most of those threads will probably end up doing little more than waiting for the next chunk of FASTQ reads to be read and (subsequently) written.
Output looks fine for me, using the same options that you did, with the output file only being slightly smaller than the input (due to the aforementioned false positives).
I should have asked this earlier, but can you copy/paste or attach the STDERR output from AdapterRemoval? Also, what version are you using? See 'AdapterRemoval --version'.
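For anyone driving AdapterRemoval from a Python pipeline, here is a minimal sketch of capturing STDERR and the exit code so that an aborted run is not mistaken for a successful one with empty output. The helper name `run_and_capture` is hypothetical, and the sketch assumes that AdapterRemoval exits with a non-zero status when it aborts, which this thread does not confirm.

```python
import subprocess


def run_and_capture(command):
    """Run a command given as a list of arguments.

    Returns a tuple of (exit code, stdout text, stderr text).
    """
    # capture_output collects both streams; check=False leaves error
    # handling to the caller instead of raising on a non-zero exit code.
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout, result.stderr


# Example usage (assumes AdapterRemoval is on the PATH):
#
#     code, out, err = run_and_capture(["AdapterRemoval", "--version"])
#     print(code, out, err)
```

Printing or logging the captured STDERR text after every run would have surfaced the error message right away.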
Version: 2.1.7. Is there a newer one?
Oh, I see. It's a malformed header:
Trimming paired end reads ... Error reading FASTQ record at line 1; aborting: Malformed or empty FASTQ header
I don't really understand this error. Does this program need "/1" and "/2" at the end of each fastq ID for paired-end mode?
No, the /1 and /2 are not required. But if they are there, then they just have to make sense (i.e. a 1 and a 2). However, this particular error message is caused by the header line either being empty or not starting with '@'.
Try running 'head' on your input files and let me know what the result is.
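The rules described above (a header must not be empty, must start with '@', and any /1 or /2 suffix must match the mate) can be sketched as a quick pre-flight check in Python. This is not AdapterRemoval's own parser. The function names are hypothetical, and the sketch assumes unwrapped FASTQ files with exactly four lines per record.

```python
import re
from itertools import islice

# Optional trailing mate marker on the read name, such as "/1" or "/2".
_MATE_SUFFIX = re.compile(r"/(\d+)$")


def check_fastq_header(line, expected_mate=None):
    """Return None if the header looks valid, else a short problem description."""
    header = line.rstrip("\r\n")
    if not header:
        return "empty header"
    if not header.startswith("@"):
        return "header does not start with '@'"
    # The read name is the text between '@' and the first whitespace.
    fields = header[1:].split()
    name = fields[0] if fields else ""
    match = _MATE_SUFFIX.search(name)
    if match and expected_mate is not None and match.group(1) != str(expected_mate):
        return f"mate suffix /{match.group(1)} does not match expected /{expected_mate}"
    return None


def first_header_problems(path, expected_mate=None, max_records=1000):
    """Check the header lines of the first max_records records in a FASTQ file.

    Returns a list of (line number, problem description) tuples.
    """
    problems = []
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # With four lines per record, headers sit on lines 1, 5, 9, ...
        for index, line in enumerate(islice(handle, max_records * 4)):
            if index % 4 == 0:
                problem = check_fastq_header(line, expected_mate)
                if problem:
                    problems.append((index + 1, problem))
    return problems
```

Calling `first_header_problems` on each input file with `expected_mate=1` and `expected_mate=2` before starting the trimming step would flag a mangled first line. Note that a completely empty file yields no problems in this sketch, so file size is worth checking separately.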
The fault is with my own code, and not an issue with yours. Thank you again for your help. My particular issue is due to a Pandas import error making a mess of my input file.