Closed novitch closed 5 years ago
Thank you very much for bringing this to our attention, Alban!
We will do some benchmarks. It will very likely influence the results, but I think a major impact is very unlikely, because most of the effect should fall on the noisiest parts of the pangenome (if that makes any sense). We have been using pangenomics in various environments and have been carefully inspecting our gene clusters to make sure they are reasonable proxies for biological insights. But we will investigate this, do our best to address it (perhaps by adding an optional flag that tells anvi'o to be very stringent), and update our tutorials.
Best,
Yes, I agree. I have been using this option for a long time and have been getting biologically sensible interpretations, but reading this paper this morning worried me. I will also test my datasets.
Cheers,
Please keep us posted. We will do the same using this issue.
No worries.
It turns out, pangenomic analyses do not use this flag :)
COG searches do. I will look into that in a separate issue.
Ok good news!
Someone sent me this paper today, https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/bty833/5106166 , explaining that the max_target_seqs = 1 option does not select the best hit, but the first sequence that matched,
and I was wondering whether the blastp analysis inside the anvi-pangenome analysis is impacted.
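For anyone who wants to check their own data: one way to avoid depending on that option is to run the search without limiting the number of targets and pick the top hit per query afterwards. The snippet below is only a minimal sketch of that idea, not what anvi'o does internally. It assumes the BLAST results are in the default 12-column tabular layout (`-outfmt 6`), where column 11 is the e-value and column 12 the bit score. The function name `best_hit_per_query` and the example rows are made up for illustration.

```python
from typing import Dict, Iterable, Tuple


def best_hit_per_query(rows: Iterable[str]) -> Dict[str, Tuple[str, float, float]]:
    """Return {query: (subject, evalue, bitscore)} keeping the best hit per query.

    Assumes default BLAST tabular columns (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        current = best.get(query)
        # Prefer the higher bit score; break ties with the lower e-value.
        if current is None or (bitscore, -evalue) > (current[2], -current[1]):
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical rows: the weaker hit for geneA is listed first on purpose.
example = [
    "geneA\tsubj1\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t150.0",
    "geneA\tsubj2\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t200.0",
    "geneB\tsubj3\t95.0\t50\t2\t0\t1\t50\t1\t50\t1e-10\t90.0",
]
hits = best_hit_per_query(example)
print(hits["geneA"][0])
```

Comparing these per-query best hits with the output of a run that used max_target_seqs = 1 should show whether any top hits differ in a given dataset.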