oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

Further classification of DNA transposon super-families #41

Closed JunpengShi closed 4 years ago

JunpengShi commented 4 years ago

Hi Shujun, is there a way to further classify the DNA transposons that EDTA labels DNA/DTT, DNA/DTA, and so on into specific superfamily names such as Harbinger, Mu, Ac/Ds and others?

Best regards, Junpeng

oushujun commented 4 years ago

Dear Junpeng,

They are the same thing but in different naming systems. You may write a simple script to convert the three-letter names to conventional names.

Best, Shujun
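
A conversion script of the kind suggested above could look like the following minimal sketch. The code-to-name pairs follow the three-letter superfamily codes of Wicker et al. 2007; the function name `convert_classification` and the assumption that labels have the form `class/code` (for example `DNA/DTA`) are illustrative and not part of EDTA.

```python
# Three-letter superfamily codes (Wicker et al. 2007) and their conventional names.
# Only the codes relevant to this thread are listed; extend as needed.
WICKER_TO_CONVENTIONAL = {
    "DTA": "hAT",
    "DTC": "CACTA",
    "DTH": "PIF-Harbinger",
    "DTM": "Mutator",
    "DTT": "Tc1-Mariner",
    "DHH": "Helitron",
}


def convert_classification(label: str) -> str:
    """Replace any three-letter code in a label like 'DNA/DTA' with its conventional name.

    Parts of the label that are not in the table are left unchanged.
    """
    parts = label.split("/")
    return "/".join(WICKER_TO_CONVENTIONAL.get(part, part) for part in parts)


if __name__ == "__main__":
    for label in ["DNA/DTA", "DNA/DTT", "MITE/DTH", "LTR/Gypsy"]:
        print(label, "->", convert_classification(label))
```

Unknown labels pass through untouched, so the function can be applied to every classification string in an annotation file without first filtering for DNA transposons.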


JunpengShi commented 4 years ago

Dear Shujun, thank you for your quick reply. I used EDTA and RepeatMasker to annotate the same genome, cross-compared the DNA transposon annotations, and created the following table:

EDTA code    RepeatMasker name
DTC          CMC-EnSpm
DTM          rRNA (or MULE-MuDR?)
DTA          hAT
DTH          PIF-Harbinger
DTT          TcMar-Stowaway

Is this table accurate? If not, could you please provide a table mapping the three-letter names to the conventional names?

Thanks, Junpeng

oushujun commented 4 years ago

Dear Junpeng,

Please refer to Wicker et al 2007 for naming systems. The EDTA 1.7.1 version also produces a summary table (.TEanno.sum) which is more accurate in terms of naming.

Best, Shujun

JunpengShi commented 4 years ago

Hi Shujun,

Thank you for the reference on the naming systems. I had read this paper before but missed the part about the unified codes.

Another question: I used EDTA to annotate the maize Mo17 genome with the following command:

perl /home/lailab/shijunpeng/software/EDTA/EDTA.pl -genome ./Zm-Mo17-REFERENCE-CAU-1.0.fa -species Maize -step all -overwrite 1 -sensitive 0 -anno 1 -threads 30

I found some errors during the TIR step:

Species: Maize
cat: 'chr280.fa': No such file or directory
cat: 'chr980.fa': No such file or directory
cat: 'chr480.fa': No such file or directory
cat: 'chr880.fa': No such file or directory
cat: 'chr780.fa': No such file or directory
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprec
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprec
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
Finish finding TIR candidates.

I used the same command to annotate a less complex Setaria genome (~420 Mb) and it worked well:

Sun Jan 12 12:03:19 CST 2020 Identify TIR candidates from scratch.

Species: others
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprec
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprec
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is
np_resource = np.dtype([("resource", np.ubyte, 1)])
Finish finding TIR candidates.

I found a similar issue, #25, and tried reinstalling TRF (conda install -n EDTA -y trf), but the error persists.

Do you have any opinion on this error? Could it be caused by running with multiple threads?

Thanks, Junpeng

JunpengShi commented 4 years ago

I am using EDTA (Extensive de-novo TE Annotator) v1.7.1.

oushujun commented 4 years ago

Hi @JunpengShi ,

Looks like both runs have the same warning. Did you get results from both runs? I cc @weijiaweijia here for more insights.

Best, Shujun
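
The NumPy FutureWarning lines appear in both the failing and the working run, whereas the cat errors appear only in the failing one. A small filter makes that difference easier to see. This is a minimal sketch; the function name `split_log` and the two marker strings are assumptions chosen to match the log excerpts in this thread, not anything EDTA provides.

```python
# Substrings that identify the NumPy deprecation noise seen in the logs above:
# the warning line itself and the source line echoed after it.
NOISE_MARKERS = ("FutureWarning", "np.dtype(")


def split_log(lines):
    """Split log lines into (noise, other), dropping blank lines.

    'noise' holds the FutureWarning lines and their echoed source lines;
    'other' holds everything else, including real errors.
    """
    noise, other = [], []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if any(marker in text for marker in NOISE_MARKERS):
            noise.append(text)
        else:
            other.append(text)
    return noise, other


if __name__ == "__main__":
    sample = [
        "Species: Maize",
        "cat: 'chr280.fa': No such file or directory",
        "dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprec",
        '_np_qint8 = np.dtype([("qint8", np.int8, 1)])',
        "Finish finding TIR candidates.",
    ]
    _, remaining = split_log(sample)
    print("\n".join(remaining))
```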

JunpengShi commented 4 years ago

Hi @oushujun ,

Sorry for these issues. I checked the disk usage and found that the partition I used to run EDTA was almost full.

I have resubmitted the command on an empty partition. I will report back once it finishes successfully or if errors persist.

Thanks, Junpeng
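
Free space on the working partition can be checked before launching a long run. The following is a minimal sketch using only the Python standard library; the function name `free_fraction` and the 10% threshold in the example are arbitrary assumptions, not an EDTA requirement.

```python
import shutil


def free_fraction(path="."):
    """Return the fraction (0.0 to 1.0) of the partition holding 'path' that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    fraction = free_fraction(".")
    print(f"Free space on working partition: {fraction:.1%}")
    # The threshold below is illustrative only.
    if fraction < 0.10:
        print("Warning: less than 10% of the partition is free.")
```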

JunpengShi commented 4 years ago

Hi Shujun,

Some feedback: I used EDTA to annotate the maize B73 V4 genome and got the same error as with the Mo17 genome.

Sat Jan 18 02:20:09 CST 2020 Start to find TIR candidates.

Sat Jan 18 02:20:09 CST 2020 Identify TIR candidates from scratch.

Species: Maize
cat: 'B73V4_ctg13480.fa': No such file or directory
cat: 'B73V4_ctg17980.fa': No such file or directory
cat: 'B73V4_ctg12580.fa': No such file or directory
cat: 'B73V4_ctg24580.fa': No such file or directory
cat: 'B73V4_ctg13080.fa': No such file or directory
cat: 'B73V4_ctg20580.fa': No such file or directory
cat: 'B73V4_ctg5980.fa': No such file or directory
cat: 'B73V4_ctg16780.fa': No such file or directory
cat: 'B73V4_ctg22680.fa': No such file or directory
cat: 'B73V4_ctg25180.fa': No such file or directory
cat: 'B73V4_ctg16380.fa': No such file or directory
cat: 'B73V4_ctg21280.fa': No such file or directory
cat: 'B73V4_ctg17280.fa': No such file or directory
cat: 'B73V4_ctg16280.fa': No such file or directory
cat: 'B73V4_ctg13180.fa': No such file or directory
cat: 'B73V4_ctg16080.fa': No such file or directory
cat: 'B73V4_ctg20480.fa': No such file or directory
cat: 'B73V4_ctg22880.fa': No such file or directory
cat: 'B73V4_ctg11680.fa': No such file or directory
cat: 'B73V4_ctg12680.fa': No such file or directory
cat: 'B73V4_ctg7680.fa': No such file or directory
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as
np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/lailab/anaconda3/envs/EDTA/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be unders
np_resource = np.dtype([("resource", np.ubyte, 1)])

Error: Error while loading sequence
Warning: The TIR result file has 0 bp!

Sat Jan 18 17:12:48 CST 2020 Start to find Helitron candidates.

I submitted this job on a partition with more than 50 TB of free disk space, so the error is not caused by disk usage.

Have you successfully annotated the B73 V4 sequences with EDTA v1.7.1?

Thanks, Junpeng

oushujun commented 4 years ago

Hi Junpeng,

Looks like your environment is not correctly installed. Please run the installation again. I tested 1.7.1 with the Drosophila genome and it worked well. You may also want to update to 1.7.3.

Best, Shujun
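
One quick sanity check on an installation is to confirm that the external programs it relies on are visible on the PATH of the active environment. The sketch below is only that: the function name `missing_tools` is hypothetical, and the tool list in the example contains only trf because that is the one dependency named in this thread; the full list depends on the EDTA version.

```python
import shutil


def missing_tools(tools):
    """Return the names from 'tools' that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # 'trf' is the dependency mentioned in this thread; add others as needed.
    required = ["trf"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found.")
```

Run it inside the activated conda environment so that the PATH being inspected is the one EDTA will actually use.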


oushujun commented 4 years ago

Hi Junpeng,

I added a seq ID check to alleviate the complex seq ID issue. Please update to v1.7.4 and test again.

Shujun
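
For versions without such a check, sequence IDs can be simplified before running the annotation. The following is a minimal sketch and not the check added in v1.7.4: the function names, the rule of keeping only the first whitespace-delimited token, the replacement of characters other than letters, digits and underscores, and the 13-character default limit are all assumptions made for illustration.

```python
import re


def simplify_header(header, max_len=13):
    """Reduce a FASTA header line to a short, simple sequence ID (without '>')."""
    tokens = header.lstrip(">").split()
    if not tokens:
        raise ValueError("FASTA header has no sequence ID")
    # Keep the first token and replace anything other than letters, digits, '_'.
    seq_id = re.sub(r"[^A-Za-z0-9_]", "_", tokens[0])
    return seq_id[:max_len]


def rename_fasta(lines, max_len=13):
    """Return FASTA lines with simplified headers; sequence lines are unchanged.

    Raises ValueError if two headers collapse to the same simplified ID.
    """
    seen = set()
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            new_id = simplify_header(line, max_len)
            if new_id in seen:
                raise ValueError(f"duplicate ID after simplification: {new_id}")
            seen.add(new_id)
            out.append(">" + new_id)
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    example = [">B73V4_ctg134 some description", "ACGT", ">chr1|len=100", "TTGA"]
    print("\n".join(rename_fasta(example)))
```

Raising an error on duplicate IDs is deliberate: silently truncating two different contigs to the same name would be harder to diagnose than a failure up front.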

oushujun commented 4 years ago

Closing due to inactivity. Please reopen if the issue persists.