oushujun / EDTA

Extensive de-novo TE Annotator
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1905-y
GNU General Public License v3.0

error in Identify TIR candidates #516

Open qfh20m20m opened 6 days ago

qfh20m20m commented 6 days ago

Hi Prof. Ou,

My EDTA run failed at the TIR prediction step. The run log is as follows:

#########################################################
##### Extensive de-novo TE Annotator (EDTA) v2.2.2  #####
##### Shujun Ou (shujun.ou.1@gmail.com)             #####
#########################################################

Parameters: --genome ../2nd/Hedychium_hic.post.FINAL.fa --species others --step all -t 50 --sensitive 1

Wed Oct 30 17:29:20 HKT 2024    Dependency checking:
    All passed!

Wed Oct 30 17:29:54 HKT 2024    Obtain raw TE libraries using various structure-based programs: 
Wed Oct 30 17:29:54 HKT 2024    EDTA_raw: Check dependencies, prepare working directories.

Wed Oct 30 17:31:24 HKT 2024    Start to find LTR candidates.

Wed Oct 30 17:31:24 HKT 2024    Identify LTR retrotransposon candidates from scratch.

Wed Oct 30 19:00:00 HKT 2024    Finish finding LTR candidates.

Wed Oct 30 19:00:00 HKT 2024    Start to find SINE candidates.

Wed Oct 30 21:01:07 HKT 2024    Finish finding SINE candidates.

Wed Oct 30 21:01:07 HKT 2024    Start to find LINE candidates.

Wed Oct 30 21:01:07 HKT 2024    Identify LINE retrotransposon candidates from scratch.

Thu Oct 31 23:48:25 HKT 2024    Finish finding LINE candidates.

Thu Oct 31 23:48:25 HKT 2024    Start to find TIR candidates.

Thu Oct 31 23:48:25 HKT 2024    Identify TIR candidates from scratch.

Species: others

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: 
libgomp: Thread creation failed: Resource temporarily unavailable
Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: 
libgomp: Thread creation failed: Resource temporarily unavailable
Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes
/bin/sh: fork: retry: No child processes

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: No child processes

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: 
libgomp: Thread creation failed: Resource temporarily unavailable
Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
/bin/sh: fork: retry: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable

libgomp: Thread creation failed: Resource temporarily unavailable
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 36, in GRF_mp
    GRF(genome_file_name, genome_name, cpu_cores, TIR_length, GRF_path)
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 29, in GRF
    subprocess.Popen(grf + shell_filter, shell=True).wait()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/subprocess.py", line 1885, in _execute_child
    self.pid = _fork_exec(
               ^^^^^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/TIR-Learner.py", line 95, in <module>
    TIRLearner_instance = TIRLearner(genome_file, genome_name, species, TIR_length,
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 91, in __init__
    self.execute()
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 131, in execute
    self.execute_M4()
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 598, in execute_M4
    self["GRF"] = run_GRF.execute(self)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 182, in execute
    return run_GRF_py_para(genome_file, genome_name, TIR_length, cpu_cores, para_mode, flag_debug, GRF_path,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 82, in run_GRF_py_para
    pool.starmap(GRF_mp, mp_args_list)
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 375, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
      ^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 36, in GRF_mp
    GRF(genome_file_name, genome_name, cpu_cores, TIR_length, GRF_path)
      ^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/run_GRF.py", line 29, in GRF
    subprocess.Popen(grf + shell_filter, shell=True).wait()
      ^^^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
^^^^^^^^^^^^^^^
  File "/nfs_fs/nfs1/pengxiaochang/miniforge3/envs/EDTA2.2/lib/python3.12/subprocess.py", line 1885, in _execute_child
    self.pid = _fork_exec(
^^^^^^^
BlockingIOError: [Errno 11] Resource temporarily unavailable
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.fa: No such file or directory at /nfs_fs/nfs1/pengxiaochang/software/EDTA/bin/rename_tirlearner.pl line 19.
Warning: LOC list Hedychium_hic.post.FINAL.fa.mod.TIR.ext30.list is empty.

Error: Error while loading sequence
Filter sequence based on TEsorter classifications. Unclassified sequences will also be output to the clean file.
    Usage: perl cleanup_misclas.pl sequence.fa.rexdb.cls.tsv
    Author: Shujun Ou (shujun.ou.1@gmail.com) 10/11/2019

mv: cannot stat 'Hedychium_hic.post.FINAL.fa.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln': No such file or directory
cp: cannot stat 'Hedychium_hic.post.FINAL.fa.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln.list': No such file or directory
cp: cannot stat 'Hedychium_hic.post.FINAL.fa.mod.TIR.intact.raw.fa.anno.list': No such file or directory
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.gff3: No such file or directory.
ERROR: No such file or directory at /nfs_fs/nfs1/pengxiaochang/software/EDTA/bin/output_by_list.pl line 39.
Error: TIR results not found!

ERROR: Raw TIR results not found in Hedychium_hic.post.FINAL.fa.mod.EDTA.raw/Hedychium_hic.post.FINAL.fa.mod.TIR.intact.raw.fa
    If you believe the program is working properly, this may be caused by the lack of intact TIRs in your genome. Consider to use the --force 1 parameter to overwrite this check

Best wishes, Peng
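
Note on the log above: `[Errno 11]` is `EAGAIN`, which `fork` reports when the system refuses to create another process or thread. Common causes are a per-user process limit (`ulimit -u`), a cgroup or scheduler cap, or exhausted memory. The traceback shows `pool.starmap(GRF_mp, ...)` calling `GRF` with a `cpu_cores` argument, and the `libgomp` messages suggest that each launched job also starts its own OpenMP threads, so the demand may scale with the product of the two levels. The sketch below only illustrates that arithmetic. It assumes, without confirmation from the source, that both levels use the `-t 50` value; `nested_demand`, `fits_limit`, and `report` are hypothetical helper names, not part of EDTA or TIR-Learner.

```python
import resource  # Unix-only standard-library module


def nested_demand(pool_workers: int, threads_per_worker: int) -> int:
    """Worst-case count of simultaneous threads when every pool worker
    launches a tool that itself starts `threads_per_worker` threads."""
    return pool_workers * threads_per_worker


def fits_limit(demand: int, soft_limit: int) -> bool:
    """True if `demand` stays within the limit; RLIM_INFINITY means no cap."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return demand <= soft_limit


def report(threads: int) -> str:
    """Compare the estimated demand with this user's process/thread limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    # Assumption: the pool size and the per-job thread count both equal -t.
    demand = nested_demand(threads, threads)
    verdict = "within limit" if fits_limit(demand, soft) else "may exceed limit"
    return f"RLIMIT_NPROC soft={soft} hard={hard}, demand={demand}: {verdict}"


if __name__ == "__main__":
    print(report(50))  # 50 is the -t value from the log above
```

If the estimate exceeds the soft limit, rerunning with a smaller `-t` or asking the cluster administrator about the limit would be reasonable things to try. Neither is confirmed here as the fix.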

lourdesalo commented 3 days ago

Hi! I have a similar issue:

Parameters: --genome /mnt/home/lopezj38/ref_genome/GCF_000001735.4_TAIR10.1_genomic.fna --cds /mnt/home/lopezj38/ref_genome/GCF_000001735.4_TAIR10.1_TAIR_cds_from_genomic.fna --curatedlib /mnt/home/lopezj38/ref_genome/athrep.updated.nonredun.fasta --exclude /mnt/home/lopezj38/ref_genome/plasmid.bed --overwrite 1 --sensitive 1 --anno 1 --evaluate 1 --u 1.5e-8 --threads 28

Tue Oct 29 02:03:54 PM EDT 2024 Dependency checking:
    All passed!

    A custom library /mnt/home/lopezj38/ref_genome/athrep.updated.nonredun.fasta is provided via --curatedlib. Please make sure this is a manually curated library but not machine generated.

    A CDS file /mnt/home/lopezj38/ref_genome/GCF_000001735.4_TAIR10.1_TAIR_cds_from_genomic.fna is provided via --cds. Please make sure this is the DNA sequence of coding regions only.

    A BED file is provided via --exclude. Regions specified by this file will be excluded from TE annotation and masking.

Tue Oct 29 02:03:58 PM EDT 2024 Obtain raw TE libraries using various structure-based programs: 
Tue Oct 29 02:03:58 PM EDT 2024 EDTA_raw: Check dependencies, prepare working directories.

Tue Oct 29 02:04:42 PM EDT 2024 Start to find LTR candidates.

Tue Oct 29 02:04:42 PM EDT 2024 Identify LTR retrotransposon candidates from scratch.

Tue Oct 29 02:12:08 PM EDT 2024 Finish finding LTR candidates.

Tue Oct 29 02:12:08 PM EDT 2024 Start to find SINE candidates.

Tue Oct 29 02:29:54 PM EDT 2024 Finish finding SINE candidates.

Tue Oct 29 02:29:54 PM EDT 2024 Start to find LINE candidates.

Tue Oct 29 02:29:54 PM EDT 2024 Identify LINE retrotransposon candidates from scratch.

Tue Oct 29 06:58:16 PM EDT 2024 Finish finding LINE candidates.

Tue Oct 29 06:58:16 PM EDT 2024 Start to find TIR candidates.

Tue Oct 29 06:58:16 PM EDT 2024 Identify TIR candidates from scratch.

Species: others
Traceback (most recent call last):
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/swifter/swifter.py", line 419, in apply
    tmp_df = func(sample, *args, **kwds)
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/get_fasta_sequence.py", line 23, in <lambda>
    df["end"] = df.swifter.progress_bar(flag_verbose).apply(lambda x: min(x["end"], fasta_len_dict[x["seqid"]]), axis=1)
TypeError: unhashable type: 'Series'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/TIR-Learner.py", line 95, in <module>
    TIRLearner_instance = TIRLearner(genome_file, genome_name, species, TIR_length,
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 91, in __init__
    self.execute()
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 131, in execute
    self.execute_M4()
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/main.py", line 632, in execute_M4
    self["base"] = get_fasta_sequence.execute(self)
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/get_fasta_sequence.py", line 67, in execute
    df = get_start_end(TIRLearner_instance.genome_file_path, TIRLearner_instance["base"],
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/get_fasta_sequence.py", line 23, in get_start_end
    df["end"] = df.swifter.progress_bar(flag_verbose).apply(lambda x: min(x["end"], fasta_len_dict[x["seqid"]]), axis=1)
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/swifter/swifter.py", line 428, in apply
    timed = timeit.timeit(wrapped, number=N_REPEATS)
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/timeit.py", line 233, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/timeit.py", line 177, in timeit
    timing = self.inner(it, self.timer)
  File "<timeit-src>", line 6, in inner
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/swifter/swifter.py", line 337, in wrapped
    self._obj.iloc[self._SAMPLE_INDEX].apply(
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/pandas/core/frame.py", line 10374, in apply
    return op.apply().__finalize__(self, method="apply")
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/pandas/core/apply.py", line 916, in apply
    return self.apply_standard()
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/pandas/core/apply.py", line 1063, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/mnt/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/lib/python3.9/site-packages/pandas/core/apply.py", line 1081, in apply_series_generator
    results[i] = self.func(v, *self.args, **self.kwargs)
  File "/mnt/gs21/scratch/lopezj38/bin/miniforge3/envs/EDTA2.2/share/TIR-Learner3/bin/get_fasta_sequence.py", line 23, in <lambda>
    df["end"] = df.swifter.progress_bar(flag_verbose).apply(lambda x: min(x["end"], fasta_len_dict[x["seqid"]]), axis=1)
KeyError: 'NC_003070.9_split_1of7'
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.fa: No such file or directory at /mnt/gs21/scratch/lopezj38/TE_annotation/EDTA/bin/rename_tirlearner.pl line 19.
Warning: LOC list GCF_000001735.4_TAIR10.1_genomic.fna.mod.TIR.ext30.list is empty.

Error: Error while loading sequence
Filter sequence based on TEsorter classifications. Unclassified sequences will also be output to the clean file.
    Usage: perl cleanup_misclas.pl sequence.fa.rexdb.cls.tsv
    Author: Shujun Ou (shujun.ou.1@gmail.com) 10/11/2019

mv: cannot stat 'GCF_000001735.4_TAIR10.1_genomic.fna.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln': No such file or directory
cp: cannot stat 'GCF_000001735.4_TAIR10.1_genomic.fna.mod.TIR.ext30.fa.pass.fa.dusted.cln.cln.list': No such file or directory
cp: cannot stat 'GCF_000001735.4_TAIR10.1_genomic.fna.mod.TIR.intact.raw.fa.anno.list': No such file or directory
Can't open ./TIR-Learner-Result/TIR-Learner_FinalAnn.gff3: No such file or directory.
ERROR: No such file or directory at /mnt/gs21/scratch/lopezj38/TE_annotation/EDTA/bin/output_by_list.pl line 39.
Error: TIR results not found!

ERROR: Raw TIR results not found in GCF_000001735.4_TAIR10.1_genomic.fna.mod.EDTA.raw/GCF_000001735.4_TAIR10.1_genomic.fna.mod.TIR.intact.raw.fa
    If you believe the program is working properly, this may be caused by the lack of intact TIRs in your genome. Consider to use the --force 1 parameter to overwrite this check

Thank you for your help!
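
Note on the log above: the first `TypeError: unhashable type: 'Series'` is raised in the `tmp_df = func(sample, ...)` frame of swifter, which appears to be a trial vectorized call. The later frames show a row-wise pandas `apply` being run afterwards. The error that ends the run is the `KeyError`: `fasta_len_dict` has no entry for the sequence ID `NC_003070.9_split_1of7`. The sketch below reproduces that lookup with plain dictionaries and lists the IDs that would fail. The dictionary contents and the length value are made up for illustration, `missing_seqids` and `clamp_end` are hypothetical helper names, and the idea that the dictionary is keyed by unsplit names is only an assumption about how the mismatch could arise.

```python
def clamp_end(end, seqid, fasta_len_dict):
    """Same expression as the lambda in the traceback:
    cap `end` at the length recorded for `seqid`."""
    return min(end, fasta_len_dict[seqid])


def missing_seqids(seqids, fasta_len_dict):
    """Return the seqids, in first-seen order, that have no length entry."""
    seen = set()
    missing = []
    for seqid in seqids:
        if seqid not in fasta_len_dict and seqid not in seen:
            seen.add(seqid)
            missing.append(seqid)
    return missing


if __name__ == "__main__":
    # Hypothetical data: lengths keyed by the unsplit sequence name.
    fasta_len_dict = {"NC_003070.9": 1000}
    seqids = ["NC_003070.9_split_1of7", "NC_003070.9"]

    print(missing_seqids(seqids, fasta_len_dict))
    try:
        clamp_end(10, "NC_003070.9_split_1of7", fasta_len_dict)
    except KeyError as err:
        print("KeyError:", err)
```

Running a check like `missing_seqids` on the real table and dictionary before the `apply` call would show whether every ID is affected or only the `_split_` ones.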