oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0
349 stars 73 forks

No SINE, EDTA v2.2.0 #509

Open Apl-cc opened 1 month ago

Apl-cc commented 1 month ago

Hi Dr. Shujun,

EDTA is very good software! I recently ran into a problem while annotating TE sequences in the IWGSCv1p1A.fa genome with EDTA v2.2.0: no SINE results were found. The error message is as follows:

Start to find SINE candidates.

cp: cannot stat 'IWGSCv1p1A.fa.mod.SINE.raw.fa': No such file or directory
Error: SINE results not found!

ERROR: Raw SINE results not found in IWGSCv1p1A.fa.mod.EDTA.raw/IWGSCv1p1A.fa.mod.SINE.raw.fa
If you believe the program is working properly, this may be caused by the lack of SINEs in your genome.

The IWGSC genome has three subgenomes (A, B, and D). Annotating the B subgenome worked properly, but the A and D subgenomes hit the problem above. I have also successfully annotated a genome with EDTA. These files were generated in the raw/SINE folder:

1359834c85e311efadb2b4055db3aef5
55a8c0b0857e11ef9caab4055db3aef5
b8f5b0b6857811efa898b4055db3a915
d0ad680c801911efb3ddb4055db3a915
HMM_out
IWGSCv1p1A.fa_0150c31885e311efadb2b4055db3aef5.mod
IWGSCv1p1A.fa_190dd9887fcf11efb3ddb4055db3a915-matches.fasta
IWGSCv1p1A.fa_190dd9887fcf11efb3ddb4055db3a915.mod
IWGSCv1p1A.fa_456360a2857e11ef9caab4055db3aef5.mod
IWGSCv1p1A.fa_a8b2e714857811efa898b4055db3a915.mod
IWGSCv1p1A.fa.mod -> ../../IWGSCv1p1A.fa.mod
Step1_extend_tsd_input_1.fa
Step1_extend_tsd_input_2.fa
Step1_extend_tsd_input.fa
Step2_extend_blast_input.fa
Step2_extend_blast_input_rename.fa
Step2_tsd_output.fa
Step2_tsd.txt
Step3_blast_output.paf

Any good suggestions to solve this problem?

pengliang

oushujun commented 1 month ago

Please check the SINE folder for the A and D subgenomes. They may be too large for AnnoSINE to handle.

Shujun
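
One way to act on the suggestion above is to list the files in the SINE working folder by size and see which intermediate outputs are largest. The following is a minimal sketch, not part of EDTA: the helper names `list_by_size` and `human` are hypothetical, and the default folder path `IWGSCv1p1A.fa.mod.EDTA.raw/SINE` is an assumption pieced together from the paths in the error message, so pass your own path if it differs.

```python
from pathlib import Path


def list_by_size(folder):
    """Return (size_in_bytes, relative_path) for each regular file under folder, largest first."""
    root = Path(folder)
    entries = []
    for path in root.rglob("*"):
        # Skip symlinks such as "IWGSCv1p1A.fa.mod -> ../../IWGSCv1p1A.fa.mod"
        if path.is_file() and not path.is_symlink():
            entries.append((path.stat().st_size, str(path.relative_to(root))))
    return sorted(entries, reverse=True)


def human(n_bytes):
    """Format a byte count using binary units."""
    size = float(n_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


if __name__ == "__main__":
    import sys

    # The default path is an assumption; pass the real folder as the first argument.
    folder = sys.argv[1] if len(sys.argv) > 1 else "IWGSCv1p1A.fa.mod.EDTA.raw/SINE"
    for size, name in list_by_size(folder):
        print(f"{human(size):>12}  {name}")
```

Comparing this listing between the B subgenome (which finished) and the A or D subgenome may show which step stopped producing output.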

Apl-cc commented 1 month ago
[Screenshot: Snipaste_2024-10-17_14-31-50]

Maybe the file is too large to be processed, as you said. How can I solve this problem?

pengliang

oushujun commented 1 month ago

You will need to increase your memory allocation. This does not look like a huge result, so a modern server can probably handle it.

Shujun

ifoo1213 commented 3 weeks ago

Hi Pengliang, have you resolved this issue? I have the same problem. I am not sure how to increase the memory here. Should it be set in the command?

oushujun commented 3 weeks ago

@ifoo1213 You need to use a server with more memory. You don't need to set anything in EDTA; it will use as much memory as it needs, but if your server does not have enough, it will exit with an OOM (out-of-memory) error.
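
Since the memory limit comes from the machine rather than from an EDTA option, it can help to check how much memory the server actually has before rerunning. The sketch below reads `/proc/meminfo`, so it assumes a Linux host; the function names `parse_meminfo` and `available_gib` are hypothetical helpers, not part of EDTA. The thread does not say how much memory the SINE step needs, so this only reports what is available.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (for example HugePages_Total)
    are plain counts and are stored as-is.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def available_gib(info):
    """Memory the kernel estimates is available to new processes, in GiB."""
    return info["MemAvailable"] / (1024 * 1024)


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:  # Linux only
        info = parse_meminfo(handle.read())
    print(f"MemTotal:     {info['MemTotal'] / 1024**2:.1f} GiB")
    print(f"MemAvailable: {available_gib(info):.1f} GiB")
```

On a cluster with a job scheduler, the limit that matters is usually the memory requested for the job, which may be lower than what this reports for the node.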

Apl-cc commented 3 weeks ago

@oushujun When this happens, adding --force 1 to the command lets the task complete. Does adding this parameter affect the results?

oushujun commented 3 weeks ago

Lacking SINE/LINE results won’t interrupt the run. Please update your code to the latest version.

Shujun

jwli-code commented 2 weeks ago

Why is the proportion of SINE annotations much lower in the new version of EDTA than in the previous version?

jwli-code commented 2 weeks ago

tRNA/NA not found in the TE_SO database, it will not be used to rename sequences in the final annotation.