COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

No unique counts with validateMappings #347

Closed red-plant closed 5 years ago

red-plant commented 5 years ago

Dear Salmon team,

I am trying to quantify allele-specific expression using salmon, so I would like to use the unique counts to estimate confidence intervals for the allelic imbalance. However, I get no unique counts in the ambig_info.tsv file; disabling the validateMappings option fixed it. I'm using Salmon v 12, and I do expect unique counts, since some reads align to variants, as determined by running featureCounts on the SAM output of this same salmon run.

Is this the expected behavior when validateMappings is enabled? Can I simply run without validating the mappings? I noticed that the results are very close when this option is disabled.

Salmon was run as follows, using a default k=31 quasi index (--validateMappings was included or omitted depending on the run):

salmon quant --writeMappings=Z --no-version-check -p10 (--validateMappings) --seqBias --posBias -i X -l IU -1 P.fq.gz -2 Qq.fq.gz -o Test
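For readers who want to check the same symptom on their own run, here is a minimal sketch that summarizes the unique and ambiguous counts in ambig_info.tsv. It assumes the file is tab-separated with a header row containing the columns UniqueCount and AmbigCount; those column names, the helper name summarize_ambig_info, and the example data are assumptions for illustration, not taken from this thread.

```python
import csv
import io


def summarize_ambig_info(handle):
    """Summarize an ambig_info.tsv-style table.

    Assumes a tab-separated file with header columns
    'UniqueCount' and 'AmbigCount' (an assumption; check your file).
    Returns (n_transcripts, n_with_unique, total_unique, total_ambig).
    """
    reader = csv.DictReader(handle, delimiter="\t")
    n = n_with_unique = total_unique = total_ambig = 0
    for row in reader:
        unique = int(row["UniqueCount"])
        ambig = int(row["AmbigCount"])
        n += 1
        if unique > 0:
            n_with_unique += 1
        total_unique += unique
        total_ambig += ambig
    return n, n_with_unique, total_unique, total_ambig


# Made-up data standing in for a real file.
example = "UniqueCount\tAmbigCount\n0\t12\n0\t7\n3\t5\n"
print(summarize_ambig_info(io.StringIO(example)))
```

With a real run, the same function could be called on an open file handle for the ambig_info.tsv in the output directory. A result where n_with_unique is zero across all transcripts would match the behavior described above.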

rob-p commented 5 years ago

Hi @red-plant,

This is a side effect of the range-factorized equivalence classes that are induced by the --validateMappings option. We noticed this side effect of range-factorization in our own testing, and the issue causing it was fixed in 0.13.0. However, it is worth noting that --validateMappings will generally map reads in a much more sensitive way than the default quasi-mapping, so it is likely that if a read maps to one allele, it will also map to the other but with a lower alignment score (which the algorithm accounts for during quantification). If you really only want to consider the best mappings for a read, and not weight read assignments by alignment score, then you can use the --hardFilter option, which is also introduced in 0.13.0.

Best, Rob
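To make the distinction between score-weighted assignment and keeping only the best mappings concrete, here is a toy sketch. It is not salmon's actual model: the exponential weighting, the scale parameter, and both function names are assumptions chosen purely for illustration, and the real quantification also accounts for transcript abundances.

```python
import math


def weighted_assignment(scores, scale=1.0):
    """Toy soft assignment: weight each mapping by how far its
    alignment score falls below the best score (hypothetical scheme)."""
    best = max(scores.values())
    weights = {t: math.exp(-scale * (best - s)) for t, s in scores.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def hard_filter(scores):
    """Toy hard assignment: keep only the best-scoring mappings and
    split the read evenly among ties."""
    best = max(scores.values())
    kept = [t for t, s in scores.items() if s == best]
    return {t: (1.0 / len(kept) if t in kept else 0.0) for t in scores}


# One read mapping to both alleles, with a slightly lower score on allele_B.
read_scores = {"allele_A": 100, "allele_B": 96}
print(weighted_assignment(read_scores))  # allele_B keeps a small weight
print(hard_filter(read_scores))          # allele_B is dropped entirely
```

In the soft scheme the read stays ambiguous between the two alleles, which is why a sensitive mapper can leave few or no uniquely mapped reads. In the hard scheme the read counts as unique to the better-scoring allele.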

red-plant commented 5 years ago

Thanks, Dr. Patro. Updating now. In my simulations, weighted assignments perform considerably better than 'best mappings' for ASE, so I will stick with that. Best.