sanger-pathogens / iva

de novo virus assembler of Illumina paired reads
http://sanger-pathogens.github.io/iva/

multithreading smalt error #71

Closed lmoncla closed 7 years ago

lmoncla commented 7 years ago

When running IVA on full genome influenza samples and enabling multiple threads, I get the following smalt error:

smalt.c:807 ERROR: The two FASTA/FASTQ input file have different numbers of reads
[W::sam_read1] parse error at line 812880
[main_samview] truncated file.

I do not get this error if I run the same sample with a single thread.

martinghunt commented 7 years ago

Could you share the input reads with me please so I can try to debug? Thanks.

lmoncla commented 7 years ago

Hi,

Here they are! They have already been trimmed with trimmomatic. Thanks so much.

fastqs.zip https://drive.google.com/file/d/0B0s7NWreQ5QqWUpNcnFxcVB4LVE/view?usp=drive_web

Best,
Louise


martinghunt commented 7 years ago

Thanks. What version of IVA did you use and what was the command to run it?

lmoncla commented 7 years ago

IVA version 1.0.8
Using kmc version 2.1.1
Using kmc_dump version 2.1.1
Using nucmer version 3.1
Using samtools version 1.3.1
Using smalt version 0.7.6

iva -f 6390_S23_L001_R1_001.trimmed.fastq -r 6390_S23_L001_R2_001.trimmed.fastq IVA_output


martinghunt commented 7 years ago

I don't yet know why you didn't get the error with a single thread, but the cause of the error is that the two files must contain the same number of reads, and the Nth read in each file must be the two mates of the same pair.

$ wc -l *
  1989088 6390_S23_L001_R1_001.trimmed.fastq
  1657760 6390_S23_L001_R2_001.trimmed.fastq
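
To make the mismatch explicit, the line counts above can be converted to read counts. The following is a minimal sketch, assuming standard FASTQ with exactly four lines per record (no wrapped sequence lines); the function names are hypothetical and not part of IVA or smalt.

```python
def reads_from_line_count(n_lines):
    """Convert a FASTQ line count to a read count (4 lines per record)."""
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} lines is not a multiple of 4")
    return n_lines // 4


def count_fastq_reads(path):
    """Count reads in a FASTQ file by counting its lines."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return reads_from_line_count(n_lines)


if __name__ == "__main__":
    # Line counts reported by `wc -l` in this thread.
    r1_reads = reads_from_line_count(1989088)
    r2_reads = reads_from_line_count(1657760)
    print(r1_reads)             # 497272
    print(r2_reads)             # 414440
    print(r1_reads - r2_reads)  # 82832 forward reads have no mate
```

Equal read counts are necessary but not sufficient: the reads must also be in the same order in both files.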

If you trim with trimmomatic, you need to keep only the reads whose mate also survived trimming. Discard everything else, i.e. any read whose mate was removed by trimmomatic.
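
Trimmomatic's paired-end mode writes the surviving pairs to separate "paired" output files, which is the simplest way to get matched inputs. If re-trimming is not an option, the two existing files can be re-paired by read name. Below is a rough sketch of that idea, not something IVA provides. It assumes four-line FASTQ records, that both mates share the read name up to the first whitespace (or differ only by a trailing `/1` or `/2`), and that each file preserves the original read order. All function names are hypothetical.

```python
from itertools import islice


def read_id(header):
    """Return the read name shared by both mates of a pair."""
    name = header[1:].split()[0]
    # Older Illumina headers mark the mates with a trailing /1 or /2.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def fastq_records(path):
    """Yield FASTQ records as lists of four lines."""
    with open(path) as handle:
        while True:
            record = list(islice(handle, 4))
            if not record:
                return
            if len(record) < 4:
                raise ValueError(f"truncated record in {path}")
            yield record


def keep_common_pairs(r1_in, r2_in, r1_out, r2_out):
    """Write only reads whose mate exists in the other file.

    Returns the number of read names found in both inputs.
    """
    ids1 = {read_id(rec[0]) for rec in fastq_records(r1_in)}
    ids2 = {read_id(rec[0]) for rec in fastq_records(r2_in)}
    common = ids1 & ids2
    # Filtering each file by the same set keeps the mates in step,
    # provided both files are in the original read order.
    for src, dst in ((r1_in, r1_out), (r2_in, r2_out)):
        with open(dst, "w") as out:
            for rec in fastq_records(src):
                if read_id(rec[0]) in common:
                    out.writelines(rec)
    return len(common)
```

This holds all read names in memory, which should be fine for viral data sets of this size. The two output files can then be passed to `iva -f` and `iva -r` in place of the originals.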

lmoncla commented 7 years ago

Thank you, I will try that. I assume that it is not possible to run IVA on unpaired reads?


martinghunt commented 7 years ago

No, sorry, but the read-pair information is essential to IVA's algorithm.

lmoncla commented 7 years ago

I see, I thought so. Thanks for the very quick reply, I really appreciate it!
