oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons. The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Change my GFF3 legacy #13

Closed intirules closed 6 years ago

intirules commented 6 years ago

Hi, I'm already running LTR_retriever and I'm happy with the results. Now I'm also trying to run LTR_Digest on my [result.mod.pass.list.gff3], but it doesn't accept the legacy GFF3 format from LTR_retriever:

$ gt ltrdigest: error: no description matched sequence ID 'seq0'

When using just LTRharvest, a solution (https://github.com/genometools/genometools/issues/882) was to force LTRharvest to output its candidates in the current format, using the -tabout no and -seqids options. Is there an option to do this in LTR_retriever? I want to use LTR_Digest so I can take advantage of the pHMM files that program uses. Thanks

oushujun commented 6 years ago

Hi,

Thanks for requesting this feature. I made some changes in the code so that the output GFF3 format is readable by LTR_digest.

Please update your version of LTR_retriever and do the following:

perl ${path_to_LTR_retriever}/bin/make_gff3.pl Muschr4.fsa Muschr4.fsa.pass.list

gt ltrdigest -hmms ${path_to_LTR_retriever}/database/TEfam.hmm -aaout -outfileprefix Muschr4.fsa.pass.list -seqfile Muschr4.fsa -matchdescstart < Muschr4.fsa.pass.list.gff3 > Muschr4.fsa.pass.list.digest.gff3

Let me know if you have further questions.

Best, Shujun
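
If ltrdigest still reports "no description matched sequence ID", one quick check is whether every sequence ID in the GFF3 also appears as the first word of a FASTA header. The sketch below assumes that this first-word match is what -matchdescstart relies on. The function name missing_seqids is hypothetical and is not part of LTR_retriever or GenomeTools.

```shell
# missing_seqids GENOME.fa ANNOT.gff3
# Hypothetical helper: prints each GFF3 sequence ID (column 1) that is not
# the first whitespace-delimited word of any FASTA header in GENOME.fa.
missing_seqids() {
  awk '
    # First file (FASTA): remember the first word of every header line.
    FILENAME == ARGV[1] {
      if (/^>/) { sub(/^>/, "", $1); seen[$1] = 1 }
      next
    }
    # Second file (GFF3): skip comments, directives and malformed lines.
    /^#/ || NF < 9 { next }
    # Report each unknown sequence ID once.
    !($1 in seen) && !reported[$1]++ { print $1 }
  ' "$1" "$2"
}
```

For example, missing_seqids Muschr4.fsa Muschr4.fsa.pass.list.gff3 prints nothing when every ID is found. An ID such as seq0 in the output would point to the legacy naming described above.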

OluchiAroh commented 4 years ago

Hi @intirules, did you get this to work? If yes, do you mind telling me how you went about it? I am currently trying to do this and I am getting the error below:

gt ltrdigest: error: inconsistent strands encountered in `LTR_retrotransposon' feature in file sorted_genome.final.fa.pass.list.gff3, line 23753: found +, expected -

The commands I used are:

gt suffixerator -dna -db genome.final.fa -tis -suf -lcp -des -ssp -sds -lossless

gt gff3 -sort genome.final.fa.pass.list.gff3 > sorted_genome.final.fa.pass.list.gff3

gt -j 10 ltrdigest -hmms Pfam-A.hmm -aaout yes -outfileprefix genome.final.fa.pass.list -seqfile genome.final.fa -matchdescstart sorted_genome.final.fa.pass.list.gff3 > genome.final.fa.pass.list.digest.gff3
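
One way to see which features trigger an "inconsistent strands" error like the one above is to list every feature whose strand differs from its parent's. This is a minimal sketch. It assumes each feature has at most one Parent and that the ID and Parent attributes follow the usual GFF3 key=value form. The function name check_strands is hypothetical and is not part of GenomeTools or LTR_retriever.

```shell
# check_strands FILE.gff3
# Hypothetical helper: prints every feature whose strand (column 7) differs
# from the strand of the feature named in its Parent attribute.
check_strands() {
  # The file is read twice: pass 1 records the strand of each feature ID,
  # pass 2 compares each child feature with its parent.
  awk -F'\t' '
    /^#/ || NF < 9 { next }
    NR == FNR {
      if (match($9, /(^|;)ID=[^;]+/)) {
        id = substr($9, RSTART, RLENGTH)
        sub(/^;?ID=/, "", id)
        strand[id] = $7
      }
      next
    }
    match($9, /(^|;)Parent=[^;]+/) {
      p = substr($9, RSTART, RLENGTH)
      sub(/^;?Parent=/, "", p)
      if ((p in strand) && strand[p] != $7)
        printf "line %d: %s strand %s differs from parent %s strand %s\n", FNR, $3, $7, p, strand[p]
    }
  ' "$1" "$1"
}
```

Running check_strands sorted_genome.final.fa.pass.list.gff3 lists the mismatches with their line numbers, which can then be compared with the line number in the ltrdigest error. It only reports them and does not decide which strand is correct.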