oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for the identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Questions about running LTR_retriever #27

Closed rapaJiahe closed 5 years ago

rapaJiahe commented 5 years ago

Hi Shujun,

I have some questions about running LTR_retriever, and I would really appreciate your help.

  1. I don't fully understand redundancy. Does redundancy mean that one LTR-RT has many copies, and does removing redundancy mean keeping only one of these copies?
  2. I used the results of LTR_finder and LTR_harvest as input. The candidates numbered 7565 (LTR_finder), 5507 (LTR_harvest), and 8246 (non-TGCA candidates from LTR_harvest), and in the end I got 2203 non-redundant LTR-RTs. I learned from your article that these two programs have some problems. I am very interested in your pipeline, but I don't know which of the final result files contains all the reliable intact LTR-RTs of the whole genome.
  3. In your paper, you describe 5 categories. Can I consider a non-TGCA LTR-RT to be a type of intact LTR-RT as well?
  4. The non-redundant LTR library is very important. However, I would still like to know whether LTR_retriever can report how many copies each LTR-RT in the non-redundant LTR library has.

Best,

Jiahe Liu

oushujun commented 5 years ago

Hi Jiahe,

  1. Basically yes: the non-redundant library keeps the representative piece of an element. Please read the supplementary methods section. An LTR-RT is split into three pieces: LTR-IN-LTR. These pieces are pooled with those from other elements, and the representative piece is then kept based on the 80-90-100 rule (or similar; I forgot the exact numbers, please find them in the paper).

  2. This information is indicated in the screen output and also the manual. All reliable intact LTR-RTs are listed in the genome.fa.pass.list file, including the non-TGCA LTR-RTs.

  3. I'm not sure what the five categories are, but yes, non-TGCA LTR-RTs listed in both *.pass.list files are qualified intact LTR-RTs and are also included in the library.

  4. This information is available in the *.out.fam.size.list file, which is also described in the manual.
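
To check the number of intact LTR-RTs against the screen output, one option is to count the entries in the genome.fa.pass.list file directly. The Python sketch below is only an illustration. It assumes the file is plain text with one element per line and that header or comment lines start with `#`; the column layout is not described in this thread, so the sketch does not parse individual fields. The function name `count_entries` is hypothetical and is not part of LTR_retriever.

```python
def count_entries(path):
    """Count non-empty lines that are not comments in a list file.

    Assumption: each remaining line describes one element, and
    header/comment lines begin with '#'.
    """
    count = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and assumed '#' header/comment lines.
            if stripped and not stripped.startswith("#"):
                count += 1
    return count
```

For example, `count_entries("genome.fa.pass.list")` would return the number of listed elements under these assumptions. Please check the manual for the actual file format before relying on the result.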

Thanks, Shujun

rapaJiahe commented 5 years ago

Thank you for your answer. This will be very helpful to me.

Best,