sanger-pathogens / iva

de novo virus assembler of Illumina paired reads
http://sanger-pathogens.github.io/iva/

IVA sensitivity threshold #72

Closed antoine4ucsd closed 7 years ago

antoine4ucsd commented 7 years ago

Hi, I am new to IVA. I am analyzing paired-end Illumina HIV data, with reads of ~250-350 bp covering regions of up to ~2 kb. With my local pipeline, I cleaned and filtered the reads and obtained the allele frequencies along these regions. I was hoping to use IVA to reconstruct representative variants for these regions. With the standard command, it generated only one contig. Could I adjust some of the options so that it also reports less predominant variants, not only the single contig? I tried various settings (e.g. the smalt_id threshold) but still got one contig, although I expect some diversity. Are there any parameters you would suggest adapting to recover more representative / less frequent haplotypes? And can we get the relative frequencies of the contigs along with the contigs themselves?

thanks!!!!

example: iva --smalt_id 0.01 --fr Data_Interleaved.fastq Output_dir

martinghunt commented 7 years ago

Hi,

The aim of IVA is to collapse everything down into one contig, which should represent the most common sequence in the reads. Sorry, but investigating the various alleles/variants in the sample is a separate problem that IVA is not trying to solve. I suggest mapping the reads back to the contig and going from there.
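
To illustrate what "going from there" could look like, here is a minimal Python sketch of tallying per-position base frequencies once reads have been mapped back to the contig. It is not part of IVA. It assumes the mapping step has already been done with a read mapper of your choice and that each read has been reduced to an ungapped `(start, sequence)` placement on the contig (0-based). The function names `allele_frequencies` and `minor_variants`, and the `min_freq` / `min_depth` thresholds, are hypothetical choices made for this example.

```python
from collections import Counter


def allele_frequencies(contig_length, alignments):
    """Count bases observed at each contig position.

    alignments: iterable of (start, sequence) pairs, where start is the
    0-based contig position of the first read base. Assumption: the
    alignments are ungapped (no insertions or deletions).
    """
    counts = [Counter() for _ in range(contig_length)]
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Ignore bases that fall off the contig and ambiguous calls (e.g. N).
            if 0 <= pos < contig_length and base in "ACGT":
                counts[pos][base] += 1
    return counts


def minor_variants(contig, counts, min_freq=0.05, min_depth=10):
    """List non-contig bases seen at or above min_freq.

    Returns (position, contig_base, variant_base, frequency) tuples for
    positions covered by at least min_depth reads.
    """
    variants = []
    for pos, counter in enumerate(counts):
        depth = sum(counter.values())
        if depth < min_depth:
            continue
        for base, n in sorted(counter.items()):
            if base != contig[pos] and n / depth >= min_freq:
                variants.append((pos, contig[pos], base, n / depth))
    return variants


# Toy data: 8 reads match the contig, 2 reads carry an A at position 3.
contig = "ACGTACGT"
reads = [(0, "ACGTACGT")] * 8 + [(0, "ACGAACGT")] * 2
counts = allele_frequencies(len(contig), reads)
print(minor_variants(contig, counts))
```

This only reports per-site frequencies. Linking variant sites into haplotypes with relative abundances is a separate reconstruction problem that this sketch does not attempt.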

antoine4ucsd commented 7 years ago

thanks. I will go from there as you suggest.