cbg-ethz / ConsensusFixer

Computes a consensus sequence with wobbles, ambiguous bases, and in-frame insertions, from a NGS read alignment.
GNU General Public License v3.0

optimizing the settings for HIV #9

Open antoine4ucsd opened 6 years ago

antoine4ucsd commented 6 years ago

Hi, I am using ConsensusFixer to get full-length (FL) consensus sequences from cleaned BAM reads covering the HIV FL genome (Illumina MiSeq). One limitation is that coverage can be very heterogeneous or incomplete for some samples (from <100 reads to 10k or more). I would like some advice on optimizing mcc, mic, and pluralityN given this heterogeneity in sequencing depth. I tried looping over my data with various thresholds, but any advice would be very welcome! I can also share a couple of BAM files if that helps.

To get ambiguous sites above 5%, I used the following:

java -XX:NewRatio=9 -XX:+UseParallelGC -Xms2G -Xmx10G -jar ConsensusFixer.jar -i my.bam -r HXB2.fa -mcc 100 -plurality 0.05 -pluralityN 0.8 -mic 100 -f -mi -pi -o myconsensus.fa

And to get ambiguous sites above 20%, I just changed -plurality to 0.2.
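
For reference, the two runs can be written as one loop over the -plurality value. This is only a dry-run sketch: it prints the commands rather than running them, every flag and value is copied from the command above, and the `build_cmd` helper and the per-threshold output file names (`myconsensus_p0.05.fa`, `myconsensus_p0.2.fa`) are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one ConsensusFixer command per -plurality threshold.
# All flags/values come from the command in this post; build_cmd and the
# output file naming scheme are hypothetical.
build_cmd() {
  local plurality="$1"
  echo "java -XX:NewRatio=9 -XX:+UseParallelGC -Xms2G -Xmx10G -jar ConsensusFixer.jar" \
       "-i my.bam -r HXB2.fa -mcc 100 -plurality ${plurality} -pluralityN 0.8 -mic 100" \
       "-f -mi -pi -o myconsensus_p${plurality}.fa"
}

# 0.05 and 0.2 are the 5% and 20% ambiguity thresholds mentioned above.
for p in 0.05 0.2; do
  build_cmd "$p"
done
```

Piping the printed lines to `bash` would actually launch the runs. The same loop could in principle be extended to sweep -mcc or -mic, but which values make sense for low-coverage samples is exactly the open question here.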

thank you!