oushujun / LTR_retriever

LTR_retriever is a highly accurate and sensitive program for identification of LTR retrotransposons; The LTR Assembly Index (LAI) is also included in this package.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5813529/
GNU General Public License v3.0

Ltr insertion time #44

Closed mqding0564 closed 5 years ago

mqding0564 commented 5 years ago

Hi, this is very good software. I noticed that it reports an insertion time. Could you please explain what algorithm you use to calculate it? And what is the unit of the insertion time column? Is it years?

oushujun commented 5 years ago

Hello,

Thanks for using LTR_retriever. Insertion time of each intact LTR element is given in the .pass.list file. By default the mutation rate is 1.3e-8 per bp per year (rice), so the unit is calendar year. You may specify a different rate by providing -u/-miu. For the calculation algorithm please read our Plant Physiology paper.

Best, Shujun

U201412486 commented 4 years ago

Hi, I have the LTR gff3 file. What should I do if I just want to use LTR_retriever to compute the insertion time of the LTRs? Thanks, sun

oushujun commented 4 years ago

Hi @U201412486,

Please see the information above. If you can find divergence or identity in the gff3 file, then you can calculate the insertion time by hand or in Excel with T = K / (2µ), where K is the divergence of the LTR pair, K = 1 - identity.

Best, Shujun
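
The formula quoted above can be sketched in a few lines of Python. This is only an illustration of T = K / (2µ) with K = 1 - identity as stated in this thread, not LTR_retriever's own code; the function name `insertion_time` is made up for the example, and the default rate is the 1.3e-8 per bp per year mentioned earlier in the thread.

```python
def insertion_time(identity: float, mu: float = 1.3e-8) -> float:
    """Estimate LTR insertion time in years as T = K / (2 * mu).

    identity: fraction of identical sites between the two LTRs (0..1).
    mu: mutation rate per bp per year (default: the rice rate from this thread).
    """
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be a fraction between 0 and 1")
    if mu <= 0:
        raise ValueError("mu must be positive")
    divergence = 1.0 - identity  # K in the formula
    return divergence / (2.0 * mu)


if __name__ == "__main__":
    # An LTR pair with 99.71% identity, using the default rate.
    print(f"{insertion_time(0.9971):.0f} years")
```

Identity is passed as a fraction (for example 0.9971), not as a percentage, so an identity of 1 gives an age of 0.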

U201412486 commented 4 years ago

@oushujun Thank you for your answer. I first need to calculate the sequence identity between the 5' and 3' direct repeats of an LTR candidate. I do not know how to extract the 5' and 3' direct repeat sequences of an LTR. Can you give me some advice? Many thanks, sun

oushujun commented 4 years ago

The easiest way is to look it up in the LTR_retriever output file such as gff3 or .pass.list. You may recalculate it by yourself. To extract a sequence, one solution is to use the LTR_retriever/bin/call_seq_by_list.pl script.

Shujun

U201412486 commented 4 years ago

The LTR_retriever/bin/call_seq_by_list.pl script can be used to extract sequences according to their coordinates. So how can I get the coordinates of the 5' and 3' direct repeats of an LTR? The thing is that I do not know how to identify those coordinates.

Best, sun

oushujun commented 4 years ago

Please find this info in gff3 or .pass.list.

U201412486 commented 4 years ago

Thank you so much. Can you tell me the process for extracting the 5' and 3' direct repeats of an LTR, for example with a script?

oushujun commented 4 years ago

As I have mentioned above, you can use the info in the pass.list or gff3 file for the LTR coordinate and the call_seq_by_list.pl script to call these regions.

Shujun
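
One possible way to derive the two LTR regions from the coordinates is sketched below. It assumes (this is not confirmed anywhere in this thread) that the element location is written as `chr:start..end` and the internal region as `IN:start..end`, so that the left LTR runs from the element start to just before the internal region and the right LTR from just after it to the element end. The helper name `ltr_coords` and the example coordinates are made up for illustration; check them against your own .pass.list or gff3 file.

```python
import re
from typing import NamedTuple, Tuple


class LtrCoords(NamedTuple):
    seq_id: str
    left_ltr: Tuple[int, int]   # (start, end), 1-based inclusive
    right_ltr: Tuple[int, int]  # (start, end), 1-based inclusive


# Assumed formats: "Chr1:100..900" for the element, "IN:301..700" for the internal region.
_LOC = re.compile(r"^(?P<seq>.+):(?P<start>\d+)\.\.(?P<end>\d+)$")
_INTERNAL = re.compile(r"^IN:(?P<start>\d+)\.\.(?P<end>\d+)$")


def ltr_coords(element_loc: str, internal_loc: str) -> LtrCoords:
    """Split a whole-element location into left and right LTR coordinates."""
    loc = _LOC.match(element_loc)
    inner = _INTERNAL.match(internal_loc)
    if loc is None or inner is None:
        raise ValueError("unexpected coordinate format")
    start, end = int(loc["start"]), int(loc["end"])
    in_start, in_end = int(inner["start"]), int(inner["end"])
    if not start < in_start <= in_end < end:
        raise ValueError("internal region must lie strictly inside the element")
    # The LTRs are whatever flanks the internal region on each side.
    return LtrCoords(loc["seq"], (start, in_start - 1), (in_end + 1, end))


if __name__ == "__main__":
    print(ltr_coords("Chr1:106522..118080", "IN:106863..117738"))
```

The names left and right refer to genomic order. For an element on the minus strand, the 5' LTR is the right one. The resulting coordinates can then be passed to whatever extraction tool you use, such as the call_seq_by_list.pl script mentioned above.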

Prabhu89-code commented 3 years ago

Dear Dr. Ou Shu Jun, Can I express "Insertion_Time" in million years ago (MYA) by dividing the result by 1,000,000? For example, 595557 as 0.595 MYA?

Thanking you! Regards, Prabhu, S

oushujun commented 3 years ago

@Prabhu89-code Yes you can. You can even re-estimate the time with the mutation rate of your species (µ, per bp per year) using the identity with T = (1 - identity) / (2µ), where identity is expressed as a fraction.

Best, Shujun
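
Because T = (1 - identity) / (2µ), an already reported insertion time can also be rescaled to another mutation rate without going back to the identity: the new time is the old time multiplied by the old rate and divided by the new rate. The sketch below shows this together with the conversion to MYA. The function names are made up for the example, and the rate 2.6e-8 is a placeholder chosen only to make the arithmetic easy to follow, not a published value for any species.

```python
DEFAULT_MU = 1.3e-8  # default rate mentioned in this thread (rice), per bp per year


def rescale_insertion_time(reported_years: float, new_mu: float,
                           old_mu: float = DEFAULT_MU) -> float:
    """Convert a time computed with old_mu into the time implied by new_mu.

    Follows from T = K / (2 * mu): the divergence K is fixed, so T scales as 1 / mu.
    """
    if new_mu <= 0 or old_mu <= 0:
        raise ValueError("mutation rates must be positive")
    return reported_years * old_mu / new_mu


def to_mya(years: float) -> float:
    """Express a time in years as million years ago (MYA)."""
    return years / 1_000_000


if __name__ == "__main__":
    reported = 595557  # Insertion_Time value quoted above, in years
    print(to_mya(reported))  # same estimate in MYA
    # Placeholder rate twice the default: the estimated age halves.
    print(to_mya(rescale_insertion_time(reported, new_mu=2.6e-8)))
```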

Prabhu89-code commented 3 years ago

Thank you for your reply, Dr. Shu Jun. I have never done mutation rate or insertion time calculations. I am working on peach. I will try to collect information from the literature and re-estimate with the formula you mentioned.

Thanking you!

Regards, Prabhu, S

Prabhu89-code commented 3 years ago

Dear Dr. Shu Jun, What about an identity of 1? How do we estimate the insertion time of those LTRs?

Thanking you !

Regards, Prabhu, S

oushujun commented 3 years ago

That's not 1% but 100%. Age of 0. -Shujun

Prabhu89-code commented 3 years ago

So an age of '0' is correct? What does it mean in terms of insertion time?

-Prabhu

oushujun commented 3 years ago

That means no divergence between the left and right LTR. The insertion could have happened right before the tissue was sampled or a couple hundred years ago. Because the LTR pair shows no divergence, the age is estimated as 0. -Shujun
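
To get a feel for the resolution of the estimate, one can ask what age a single differing base would correspond to under T = K / (2µ). The sketch below assumes that identity is simply the fraction of matching positions over the LTR length, which is an assumption made for this example and not a statement about how LTR_retriever computes identity. The function name and the 1000 bp length are likewise only illustrative.

```python
def age_per_substitution(ltr_length_bp: int, mu: float = 1.3e-8) -> float:
    """Age in years implied by exactly one mismatch between the two LTRs.

    Assumes identity = matches / length, so one mismatch gives K = 1 / length.
    """
    if ltr_length_bp <= 0:
        raise ValueError("LTR length must be positive")
    if mu <= 0:
        raise ValueError("mu must be positive")
    divergence = 1.0 / ltr_length_bp  # K for a single differing base
    return divergence / (2.0 * mu)


if __name__ == "__main__":
    # Smallest non-zero age for a 1000 bp LTR at the default rate.
    print(f"{age_per_substitution(1000):.0f} years")
```

Under these assumptions, an element whose LTRs are still identical gets an age of 0 regardless of how recently it inserted, and the next possible value is the one printed here.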

Prabhu89-code commented 3 years ago

Dear Dr. Shu Jun, Thank you very much for your clarification. Now I understand it completely. Your answers are very helpful for understanding LTRs.

Thanking you!

Regards, Prabhu, S

oushujun commented 3 years ago

I am glad it helps. -Shujun

Wenwen012345 commented 2 years ago

Hello shujun, @oushujun

I noticed that some identities in the LTR_retriever result file are less than 0.01, and for those the Insertion_Time column shows "NA". How do you explain this? [image]

oushujun commented 2 years ago

Hello @wensulin93,

Do you see all of them or just a portion of them showing the identity less than 0.01? Has this file been modified after LTR_retriever generated it? Thank you.

Best, Shujun

Wenwen012345 commented 2 years ago

hello, @oushujun

Only some of them: a few dozen out of thousands, as shown in the picture. The identities jump directly from 0.9 to below 0.01. The LTR_retriever version I'm using is 2.9.0. I am calculating insertion times myself, and entries with identity < 0.01 produce extreme outliers of billions of years (second picture), which is clearly not possible. I then noticed that in gCA_019656295.1_rRI_R1.1_genom.fna.mod.pass the insertion time for these entries (identity < 0.01) is NA, so I excluded them from my subsequent analysis, after which the distribution of insertion times looked normal. TEsorter also seems able to rule out these outliers.

image

image

oushujun commented 2 years ago

Hi,

Can you send me the .defalse file to further check on these candidates? Thank you.

Shujun shujun.ou.1@gmail.com

Wenwen012345 commented 2 years ago

Has been sent.

Wenwen012345 commented 2 years ago

Hello, @oushujun

Upon inspection, the cause appears to be running LTR_retriever on the PC (Core i5-9400F, 6 cores, 16 GB RAM). Initially I ran it on the PC and got a lot of "<0.01 identity" entries, as shown below. [image]

But once I ran it on the server (120 GB RAM, 40 cores), there were almost no entries with identity below 0.01, as shown below.

[image]

Does LTR_retriever automatically skip hard-to-process entries when the computer's resources are limited? If so, please consider adding a note about this to the main page. Due to network and time constraints, I cannot rerun LTR_retriever on the PC for further testing right now. I will test further in my spare time and let you know the results!

By the way, the BLASTN version on the server is 2.5.0+; on the PC, it's 2.12.0+.

Thank you very much!

Wenwen012345 commented 1 year ago

This is a follow-up to my earlier question. I recently tried running LTR_retriever on my PC again, in an Ubuntu Linux virtual machine, and this time the results showed no abnormalities, unlike before. I deliberately placed the input files on the virtual machine's own disk (previously they were on a shared disk). Only a few LTRs (3) had identity < 0.01, which indicates that the software runs without problems. Perhaps the earlier issue arose because the files were in the shared folder? In short, LTR_retriever runs normally on the PC. That said, most of my routine analysis is done on the server.

tytrhr commented 5 months ago

Hello @oushujun, if the -u parameter is not specified, is the default mutation rate the rice value, 1.3e-8?