oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

BLASTN replaced by BLAT #112

Closed by cherrie-g 2 years ago

cherrie-g commented 2 years ago

Dear Dr. Ou, I am evaluating a plant genome using LTR_retriever. My genome is about 3 Gb, which I do not think is that large. In the LAI step, however, I saw that Age_est.pl runs alignments with blastn. When I ran this step, it took a long time and needed a large memory quota. I am wondering whether I can replace the BLASTN alignment with BLAT, but I am not sure whether the replacement would cost some accuracy in the estimation. Could you provide some advice?

Best.

oushujun commented 2 years ago

Hi @cherrie-g,

You may use the -q parameter to accelerate the estimation. This parameter will perform a three-point blastn, then apply a linear function to estimate the overall LTR identity.

I have not tested blat on LAI myself, but the general idea should be OK. You need to make sure blat generates the same information as blastn does, then compare the resulting LAI with the blastn version. If they agree well, you can use blat to replace blastn. Please let me know how this goes.

Shujun
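The "agree well" check suggested above could be sketched as follows. This is only an illustration, not part of LTR_retriever: the function name `agreement`, the element IDs, and the idea of loading per-element LTR identities into dictionaries are all assumptions, and how those identities are extracted from each aligner's output is left out.

```python
from math import sqrt


def agreement(blastn_identity, blat_identity):
    """Compare per-element LTR identities from two aligners.

    Both arguments are dicts mapping a (hypothetical) element ID to an
    identity in percent. Only elements present in both dicts are compared.
    """
    shared = sorted(set(blastn_identity) & set(blat_identity))
    if len(shared) < 2:
        raise ValueError("need at least two shared elements")

    x = [blastn_identity[k] for k in shared]
    y = [blat_identity[k] for k in shared]
    n = len(shared)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Pearson correlation, written out to avoid extra dependencies.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("identities are constant; correlation is undefined")

    return {
        "n": n,
        "pearson_r": cov / sqrt(var_x * var_y),
        "mean_abs_diff": sum(abs(a - b) for a, b in zip(x, y)) / n,
    }


# Made-up identities, for illustration only.
blastn = {"LTR_1": 90.0, "LTR_2": 95.0, "LTR_3": 99.0, "LTR_4": 97.0}
blat = {"LTR_1": 91.0, "LTR_2": 96.0, "LTR_3": 100.0}
print(agreement(blastn, blat))
```

A high correlation with a mean absolute difference near zero would suggest the two aligners report comparable identities. A high correlation with a consistent offset would suggest a systematic shift instead.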

cherrie-g commented 2 years ago

Hi Dr. Ou, I ran a test. In quick mode, I tried both BLAST and BLAT and compared the results. The mean identity was 91.8132379005359 with BLAST and 93.0993950426133 with BLAT, and the LAI was 15.51 with BLAST and 12.41 with BLAT. In my test I also found that BLAT reports more results than BLAST, which may slow it down and affect the mean identity. So I think BLAST should not be replaced by BLAT unless we find more suitable BLAT parameters.

Best.
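To put the reported numbers side by side, here is a minimal sketch that computes the absolute and relative differences between the two runs. The values come from the comment above. The helper name `compare` is hypothetical, and BLAST is treated as the baseline by assumption.

```python
def compare(blast_value, blat_value):
    """Return (absolute difference, relative difference in percent),
    taking the BLAST value as the baseline."""
    diff = blat_value - blast_value
    return diff, 100.0 * diff / blast_value


# Values reported in this thread (quick mode).
identity_diff, identity_rel = compare(91.8132379005359, 93.0993950426133)
lai_diff, lai_rel = compare(15.51, 12.41)

print(f"mean identity: {identity_diff:+.4f} points ({identity_rel:+.2f}%)")
print(f"LAI:           {lai_diff:+.2f} ({lai_rel:+.1f}%)")
```

On these figures, mean identity rises by about 1.3 percentage points with BLAT while LAI drops by 3.10, roughly 20% relative to the BLAST value.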

oushujun commented 2 years ago

Hi @cherrie-g,

Thank you for testing BLAT as a potential substitute for BLAST. It seems that the output of BLAT is not identical to that of BLAST, so the correction factor for raw LAI would need to be re-estimated if BLAT is used.

Best, Shujun