oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons; The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Assess animal genome assembly quality using LAI #54

Closed qiuyixmm closed 4 years ago

qiuyixmm commented 4 years ago

In my view, LTR_retriever was developed primarily to assess plant genome assembly quality. LAI requires a minimum of 5% total LTR and 0.1% intact LTR sequences in the genome for accurate evaluation. These two thresholds work well for plant genomes, which contain a large proportion of LTRs. By contrast, animal genomes have much lower LTR content, especially most avian genomes (less than 5% according to https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-3043-1). As a result, I could not run LTR_retriever to obtain LAI values for assessing the quality of some bird genomes. The main reason is that the total LTR and intact LTR thresholds adopted by LTR_retriever are a little strict, at least for avian genomes. LTR_retriever provides a new and accurate method to assess genome quality. I hope a modified version of LTR_retriever can be developed so that it can be applied to a wider range of species.

oushujun commented 4 years ago

Dear @qiuyixmm ,

Thank you for using LTR_retriever. The program was developed to identify and annotate LTR retrotransposons in both plant and animal genomes. The LAI metric is a nice byproduct generated during this process. As you pointed out, there is a minimum threshold required for accurate evaluation because the measurement is based on the presence of LTR sequences. If there are not enough LTR sequences, the measurement will be inaccurate and misleading.

Imagine you need to measure the mean body length of a bird species. If you only have 3 samples, the measurement won't represent the whole range of the species. For LAI, a minimum of 5% total LTR and 0.1% intact LTR sequences is the lowest threshold for accurate measurement. It's the same for both plant and animal genomes.
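The two thresholds above can be written as a simple pre-check. This is only a minimal sketch and not code from LTR_retriever: the function name `lai_is_applicable`, its parameters, and the example base-pair counts are hypothetical, and it assumes both percentages are measured against the total genome length.

```python
# Thresholds quoted in this thread: at least 5% total LTR and
# 0.1% intact LTR sequence (assumed here to be relative to genome length).
MIN_TOTAL_LTR_PCT = 5.0
MIN_INTACT_LTR_PCT = 0.1


def lai_is_applicable(genome_bp: int, total_ltr_bp: int, intact_ltr_bp: int) -> bool:
    """Return True if both LTR content thresholds are met (hypothetical helper)."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    total_pct = 100.0 * total_ltr_bp / genome_bp
    intact_pct = 100.0 * intact_ltr_bp / genome_bp
    return total_pct >= MIN_TOTAL_LTR_PCT and intact_pct >= MIN_INTACT_LTR_PCT


# Made-up numbers for illustration only.
plant_like = lai_is_applicable(1_000_000_000, 600_000_000, 20_000_000)  # 60% / 2%
bird_like = lai_is_applicable(1_200_000_000, 36_000_000, 600_000)       # 3% / 0.05%
print(plant_like, bird_like)
```

With these illustrative inputs, the LTR-rich genome passes both thresholds, while the bird-like genome fails them.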

If your genome does not meet the measurement threshold, I am sorry. Alternatively, you may use other methods, such as Bionano optical maps, to evaluate the contiguity of your genome.

Best, Shujun