CSU-KangHu / HiTE

High-precision TE Annotator
GNU General Public License v3.0

BLAST Database error #10

Closed jwli-code closed 3 days ago

jwli-code commented 1 month ago

An error occurred when running the SIF file with the options --annotate 1 --plant 1 --intact_anno 1. The full log is below.

2024-07-24 20:57:48,357 - main.py[line:316] - INFO:
-------------------------------------------------------------------------------------------
Copyright (C) 2022 Kang Hu ( kanghu@csu.edu.cn )
Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and
Engineering, Central South University, Changsha 410083, P.R. China.
-------------------------------------------------------------------------------------------
2024-07-24 20:57:48,357 - main.py[line:322] - INFO:
Parameters configuration
====================================System settings========================================
  [Setting] Reference sequences / assemblies path = [ /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta ]
  [Setting] Is remove nested TE = [ 1 ] Default( 1 )
  [Setting] Is getting domain = [ 0 ] Default( 0 )
  [Setting] The neutral mutation rate (per bp per ya) = [ 1.3e-08 ] Default( 1.3e-08 )
  [Setting] Threads = [ 15 ]  Default( 64 )
  [Setting] The chunk size of large genome = [ 400 ] MB Default( 400 ) MB
  [Setting] Is plant genome = [ 1 ]  Default( 1 )
  [Setting] recover = [ 0 ]  Default( 0 )
  [Setting] annotate = [ 1 ]  Default( 0 )
  [Setting] intact_anno = [ 1 ]  Default( 0 )
  [Setting] search_struct = [ 1 ] Default( 1 )
  [Setting] BM_RM2 = [ 0 ]  Default( 0 )
  [Setting] BM_EDTA = [ 0 ]  Default( 0 )
  [Setting] BM_HiTE = [ 0 ]  Default( 0 )
  [Setting] EDTA_home = []
  [Setting] coverage_threshold = [ 0.95 ]  Default( 0.95 )
  [Setting] skip_HiTE = [ 0 ]  Default( 0 )
  [Setting] is_prev_mask = [ 1 ]  Default( 1 )
  [Setting] is_denovo_nonltr = [ 1 ]  Default( 1 )
  [Setting] use_NeuralTE = [ 1 ]  Default( 1 )
  [Setting] is_wicker = [ 0 ]  Default( 0 )
  [Setting] debug = [ 0 ]  Default( 0 )
  [Setting] Output Directory = [/public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A]
  [Setting] Fixed extend bases threshold = [ 1000 ] Default( 1000 )
  [Setting] Flanking length of TE = [ 50 ]  Default( 50 )
  [Setting] Cutoff of the repeat regarded as tandem sequence = [ 0.5 ] Default( 0.5 )
  [Setting] The length of genome segments = [ 100000 ]  Default( 100000 )
2024-07-24 20:57:48,357 - main.py[line:377] - INFO: Start step0: Structural Based LTR Searching
2024-07-24 20:57:48,357 - main.py[line:389] - INFO: cd /HiTE/module && python3 /HiTE/module/judge_LTR_transposons.py  -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --ltrharvest_home /HiTE/bin/LTR_HARVEST_parallel --ltrfinder_home /HiTE/bin/LTR_FINDER_parallel-master -t 15 --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --recover 0 --miu 1.3e-08 --use_NeuralTE 1 --is_wicker 0 --NeuralTE_home /HiTE/bin/NeuralTE --TEClass_home /HiTE/classification
2024-07-24 20:57:50,631 - /HiTE/module/judge_LTR_transposons.py[line:153] - INFO: Start step0.1: Running LTR_harvest_parallel and LTR_finder_parallel
2024-07-24 20:57:50,633 - /HiTE/module/Util.py[line:561] - DEBUG: cd /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A && perl /HiTE/bin/LTR_HARVEST_parallel/LTR_HARVEST_parallel -seq /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa -threads 15 > /dev/null 2>&1
2024-07-24 22:27:05,781 - /HiTE/module/Util.py[line:569] - DEBUG: cd /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A && perl /HiTE/bin/LTR_FINDER_parallel-master/LTR_FINDER_parallel -harvest_out -seq /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa -threads 15 > /dev/null 2>&1
2024-07-24 23:05:01,580 - /HiTE/module/judge_LTR_transposons.py[line:158] - INFO: Running time of step0.1: 7630.949 s
2024-07-24 23:05:01,632 - /HiTE/module/judge_LTR_transposons.py[line:169] - INFO: Start step0.2: run LTR_retriever to get confident LTR
2024-07-24 23:05:01,632 - /HiTE/module/Util.py[line:682] - DEBUG: start LTR_retriever detection...
2024-07-24 23:05:01,633 - /HiTE/module/Util.py[line:685] - DEBUG: cd /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A && LTR_retriever -genome /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa -inharvest /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome_all.fa.rawLTR.scn -noanno -threads 15 -u 1.3e-08
2024-07-25 00:09:10,441 - /HiTE/module/Util.py[line:689] - DEBUG: LTR_retriever running time: 3848.808 s
2024-07-25 00:09:10,442 - /HiTE/module/judge_LTR_transposons.py[line:173] - INFO: Running time of step0.2: 3848.809 s
2024-07-25 00:09:12,294 - /HiTE/module/judge_LTR_transposons.py[line:224] - DEBUG: python /HiTE/bin/NeuralTE/src/Classifier.py --data /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/intact_LTR.fa --use_TSD 0 --model_path /HiTE/bin/NeuralTE/models/NeuralTE_model.h5 --outdir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/NeuralTE --thread 15 --is_wicker 0
2024-07-25 00:21:44,700 - main.py[line:393] - INFO: Running time of step0: 12236.34 s
2024-07-25 00:21:44,719 - main.py[line:401] - INFO: Start step1: homology-based other TE searching
2024-07-25 00:21:44,719 - main.py[line:407] - INFO: cd /HiTE/module && python3 /HiTE/module/judge_Other_transposons.py  -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta -t 15 --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --library_dir /HiTE/library --recover 0
2024-07-25 00:22:20,228 - main.py[line:411] - INFO: Running time of step1: 35.50931 s
2024-07-25 00:22:20,228 - main.py[line:417] - INFO: Start step2.0: Splitting genome assembly into chunks
2024-07-25 00:22:20,228 - main.py[line:421] - INFO: cd /HiTE/module && python3 /HiTE/module/split_genome_chunks.py -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --chrom_seg_length 100000 --chunk_size 400
2024-07-25 00:22:20,457 - /HiTE/module/split_genome_chunks.py[line:32] - INFO: Start Splitting Reference into chunks
2024-07-25 00:22:37,525 - main.py[line:425] - INFO: Running time of step2.0: 17.29680 s
2024-07-25 00:22:37,538 - main.py[line:446] - INFO: Current chunk: 1
2024-07-25 00:22:37,538 - main.py[line:453] - INFO: Start 2.1: Coarse-grained boundary mapping
2024-07-25 00:22:37,538 - main.py[line:465] - INFO: cd /HiTE/module && python3 /HiTE/module/coarse_boundary.py  -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.cut1.fa --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --prev_TE /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/prev_TE.fa --fixed_extend_base_threshold 1000 --max_repeat_len 30000 --thread 15 --flanking_len 50 --tandem_region_cutoff 0.5 --ref_index 1 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --recover 0 --is_prev_mask 1 --debug 0
2024-07-25 00:34:34,727 - main.py[line:469] - INFO: Running time of step2.1: 717.1889 s
2024-07-25 00:34:34,727 - main.py[line:477] - INFO: Start step2.2: determine fine-grained TIR
2024-07-25 00:34:34,728 - main.py[line:491] - DEBUG: cd /HiTE/module && python3 /HiTE/module/judge_TIR_transposons.py -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.cut1.fa --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_1.flanked.fa -t 15 --TRsearch_dir /HiTE/tools --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --tandem_region_cutoff 0.5 --ref_index 1 --subset_script_path /HiTE/tools/ready_for_MSA.sh --plant 1 --flanking_len 50 --recover 0 --debug 0 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --split_ref_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/ref_chr
2024-07-25 00:34:34,962 - /HiTE/module/judge_TIR_transposons.py[line:140] - INFO: ------get TIR+TSD in copies of candidate TIR
2024-07-25 00:35:47,035 - /HiTE/module/judge_TIR_transposons.py[line:147] - INFO: Running time of getting TSD in copies of candidate TIR: 72.07257 s
2024-07-25 00:35:47,035 - /HiTE/module/judge_TIR_transposons.py[line:152] - INFO: ------clustering candidate TIR
2024-07-25 00:36:03,365 - /HiTE/module/judge_TIR_transposons.py[line:159] - INFO: Running time of clustering candidate TIR: 16.33010 s
2024-07-25 00:36:03,366 - /HiTE/module/judge_TIR_transposons.py[line:15] - INFO: determine true TIR
2024-07-25 00:36:03,366 - /HiTE/module/judge_TIR_transposons.py[line:16] - INFO: ------flank TIR copy and see if the flanking regions are repeated
2024-07-25 00:36:03,366 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 00:51:20,614 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 917.2477 s
2024-07-25 00:51:20,686 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 00:54:39,798 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 199.1115 s
2024-07-25 00:54:39,848 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 00:57:05,406 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 145.5576 s
2024-07-25 00:57:05,485 - /HiTE/module/judge_TIR_transposons.py[line:68] - INFO: Running time of flanking TIR copy and see if the flanking regions are repeated: 1262.118 s
2024-07-25 00:57:05,541 - main.py[line:495] - INFO: Running time of step2.2: 1350.813 s
2024-07-25 00:57:05,541 - main.py[line:502] - INFO: Start step2.3: determine fine-grained Helitron
2024-07-25 00:57:05,541 - main.py[line:511] - INFO: cd /HiTE/module && python3 /HiTE/module/judge_Helitron_transposons.py --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_1.flanked.fa -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta -t 15 --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --HSDIR /HiTE/bin/HelitronScanner/TrainingSet --HSJAR /HiTE/bin/HelitronScanner/HelitronScanner.jar --sh_dir /HiTE/bin --EAHelitron /HiTE/bin/EAHelitron-master --subset_script_path /HiTE/tools/ready_for_MSA.sh --ref_index 1 --flanking_len 50 --recover 0 --debug 0 --split_ref_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/ref_chr
2024-07-25 01:12:13,078 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 01:45:42,064 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 2008.985 s
2024-07-25 01:45:42,204 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 01:49:07,397 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 205.1926 s
2024-07-25 01:49:07,428 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 01:52:24,333 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 196.9046 s
2024-07-25 01:52:25,961 - main.py[line:515] - INFO: Running time of step2.3: 3320.419 s
2024-07-25 01:52:25,964 - main.py[line:522] - INFO: Start step2.4: determine fine-grained Non-LTR
2024-07-25 01:52:25,964 - main.py[line:535] - DEBUG: cd /HiTE/module && python3 /HiTE/module/judge_Non_LTR_transposons.py --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_1.flanked.fa -t 15 --subset_script_path /HiTE/tools/ready_for_MSA.sh --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --library_dir /HiTE/library --recover 0 --plant 1 --debug 0 --flanking_len 50 --ref_index 1 --is_denovo_nonltr 1 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta
2024-07-25 01:52:27,777 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of non_ltr copies
2024-07-25 01:52:51,777 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  non_ltr copies: 23.99909 s
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
2024-07-25 01:52:52,405 - main.py[line:539] - INFO: Running time of step2.4: 26.44109 s
2024-07-25 01:52:52,406 - main.py[line:446] - INFO: Current chunk: 0
2024-07-25 01:52:52,406 - main.py[line:453] - INFO: Start 2.1: Coarse-grained boundary mapping
2024-07-25 01:52:52,406 - main.py[line:465] - INFO: cd /HiTE/module && python3 /HiTE/module/coarse_boundary.py  -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.cut0.fa --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --prev_TE /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/prev_TE.fa --fixed_extend_base_threshold 1000 --max_repeat_len 30000 --thread 15 --flanking_len 50 --tandem_region_cutoff 0.5 --ref_index 0 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --recover 0 --is_prev_mask 1 --debug 0
2024-07-25 02:44:06,561 - main.py[line:469] - INFO: Running time of step2.1: 3074.155 s
2024-07-25 02:44:06,604 - main.py[line:477] - INFO: Start step2.2: determine fine-grained TIR
2024-07-25 02:44:06,604 - main.py[line:491] - DEBUG: cd /HiTE/module && python3 /HiTE/module/judge_TIR_transposons.py -g /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.cut0.fa --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_0.flanked.fa -t 15 --TRsearch_dir /HiTE/tools --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --tandem_region_cutoff 0.5 --ref_index 0 --subset_script_path /HiTE/tools/ready_for_MSA.sh --plant 1 --flanking_len 50 --recover 0 --debug 0 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --split_ref_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/ref_chr
2024-07-25 02:44:06,843 - /HiTE/module/judge_TIR_transposons.py[line:140] - INFO: ------get TIR+TSD in copies of candidate TIR
2024-07-25 02:51:01,014 - /HiTE/module/judge_TIR_transposons.py[line:147] - INFO: Running time of getting TSD in copies of candidate TIR: 414.1512 s
2024-07-25 02:51:01,041 - /HiTE/module/judge_TIR_transposons.py[line:152] - INFO: ------clustering candidate TIR
2024-07-25 02:52:34,054 - /HiTE/module/judge_TIR_transposons.py[line:159] - INFO: Running time of clustering candidate TIR: 93.01223 s
2024-07-25 02:52:34,054 - /HiTE/module/judge_TIR_transposons.py[line:15] - INFO: determine true TIR
2024-07-25 02:52:34,054 - /HiTE/module/judge_TIR_transposons.py[line:16] - INFO: ------flank TIR copy and see if the flanking regions are repeated
2024-07-25 02:52:34,054 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 03:06:08,715 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 814.6606 s
2024-07-25 03:06:08,849 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 03:09:43,779 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 214.9297 s
2024-07-25 03:09:43,881 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of tir copies
2024-07-25 03:12:17,889 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  tir copies: 154.0084 s
2024-07-25 03:12:17,995 - /HiTE/module/judge_TIR_transposons.py[line:68] - INFO: Running time of flanking TIR copy and see if the flanking regions are repeated: 1183.940 s
2024-07-25 03:12:18,082 - main.py[line:495] - INFO: Running time of step2.2: 1691.477 s
2024-07-25 03:12:18,098 - main.py[line:502] - INFO: Start step2.3: determine fine-grained Helitron
2024-07-25 03:12:18,099 - main.py[line:511] - INFO: cd /HiTE/module && python3 /HiTE/module/judge_Helitron_transposons.py --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_0.flanked.fa -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta -t 15 --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --HSDIR /HiTE/bin/HelitronScanner/TrainingSet --HSJAR /HiTE/bin/HelitronScanner/HelitronScanner.jar --sh_dir /HiTE/bin --EAHelitron /HiTE/bin/EAHelitron-master --subset_script_path /HiTE/tools/ready_for_MSA.sh --ref_index 0 --flanking_len 50 --recover 0 --debug 0 --split_ref_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/ref_chr
2024-07-25 04:25:58,086 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 04:44:17,752 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 1099.648 s
2024-07-25 04:44:17,945 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 04:46:28,939 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 130.9937 s
2024-07-25 04:46:28,995 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of helitron copies
2024-07-25 04:48:27,420 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  helitron copies: 118.4242 s
2024-07-25 04:48:35,542 - main.py[line:515] - INFO: Running time of step2.3: 5777.443 s
2024-07-25 04:48:35,543 - main.py[line:522] - INFO: Start step2.4: determine fine-grained Non-LTR
2024-07-25 04:48:35,543 - main.py[line:535] - DEBUG: cd /HiTE/module && python3 /HiTE/module/judge_Non_LTR_transposons.py --seqs /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/longest_repeats_0.flanked.fa -t 15 --subset_script_path /HiTE/tools/ready_for_MSA.sh --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --library_dir /HiTE/library --recover 0 --plant 1 --debug 0 --flanking_len 50 --ref_index 0 --is_denovo_nonltr 1 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta
2024-07-25 04:48:47,409 - /HiTE/module/Util.py[line:6912] - INFO: ------Determination of homology in regions outside the boundaries of non_ltr copies
2024-07-25 04:51:08,043 - /HiTE/module/Util.py[line:7030] - INFO: Running time of determination of homology in regions outside the boundaries of  non_ltr copies: 140.6237 s
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
BLAST Database error: No alias or index file found for protein database [/HiTE/library/LINEPeps.lib] in search path [/HiTE/module::]
2024-07-25 04:51:08,824 - main.py[line:539] - INFO: Running time of step2.4: 153.2808 s
2024-07-25 04:51:08,949 - main.py[line:559] - INFO: Start step3: generate non-redundant library
2024-07-25 04:51:08,950 - main.py[line:572] - INFO: cd /HiTE/module && python3 /HiTE/module/get_nonRedundant_lib.py --confident_ltr_cut /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_ltr_cut.fa --confident_tir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_tir_merge.fa --confident_helitron /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_helitron_merge.fa --confident_non_ltr /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_non_ltr_merge.fa --confident_other /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_other.fa -t 15 --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --test_home /HiTE/module --use_NeuralTE 1 --is_wicker 0 --NeuralTE_home /HiTE/bin/NeuralTE --TEClass_home /HiTE/classification --domain 0 --protein_path /HiTE/library/RepeatPeps.lib
2024-07-25 04:51:43,968 - /HiTE/module/get_nonRedundant_lib.py[line:123] - DEBUG: python /HiTE/bin/NeuralTE/src/Classifier.py --data /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/TE_merge_tmp.fa --genome /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa --use_TSD 1 --model_path /HiTE/bin/NeuralTE/models/NeuralTE-TSDs_model.h5 --outdir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/NeuralTE --thread 15 --is_wicker 0
2024-07-25 04:58:23,176 - main.py[line:576] - INFO: Running time of step3: 434.2266 s
2024-07-25 04:58:23,176 - main.py[line:580] - INFO: Start step4: get full-length TE annotation
2024-07-25 04:58:23,176 - main.py[line:599] - INFO: cd /HiTE/module && python3 /HiTE/module/get_full_length_annotation.py -t 15 --ltr_list /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa.pass.list --tir_lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_tir.fa --helitron_lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_helitron.fa --nonltr_lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_non_ltr.fa --other_lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_other.fa --chr_name_map /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/chr_name.map -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --module_home /HiTE/module --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --TRsearch_dir /HiTE/tools --search_struct 1
2024-07-25 04:58:23,454 - /HiTE/module/get_full_length_annotation.py[line:68] - DEBUG: perl /HiTE/module/generate_gff_for_ltr.pl /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/genome.rename.fa.pass.list /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/chr_name.map
2024-07-25 04:58:23,664 - /HiTE/module/get_full_length_annotation.py[line:77] - DEBUG: cat /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_tir.fa /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_helitron.fa /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_non_ltr.fa /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_other.fa > /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/TE_tmp.fa
2024-07-25 05:59:29,829 - main.py[line:604] - INFO: Running time of step4: 3666.653 s
2024-07-25 05:59:29,830 - main.py[line:611] - INFO: Start step5: annotate genome
2024-07-25 05:59:29,830 - main.py[line:618] - INFO: cd /HiTE/module && python3 /HiTE/module/annotate_genome.py -t 15 --classified_TE_consensus /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_TE.cons.fa --annotate 1 -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A
2024-07-25 05:59:30,082 - /HiTE/module/annotate_genome.py[line:43] - DEBUG: cd /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A && RepeatMasker -e ncbi -pa 15 -gff -lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_TE.cons.fa -cutoff 225 /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta
2024-07-25 08:46:06,209 - /HiTE/module/annotate_genome.py[line:49] - DEBUG: mv /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta.out /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/HiTE.out && mv /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta.tbl /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/HiTE.tbl && mv /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta.out.gff /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/HiTE.gff
2024-07-25 08:46:06,251 - main.py[line:623] - INFO: Running time of step5: 9996.421 s
2024-07-25 08:46:06,252 - main.py[line:626] - INFO: Start step6: Start conduct benchmarking of RepeatModeler2, EDTA, and HiTE
2024-07-25 08:46:06,252 - main.py[line:639] - INFO: cd /HiTE/module && python3 /HiTE/module/benchmarking.py --tmp_output_dir /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A --BM_RM2 0 --BM_EDTA 0 --BM_HiTE 0 --coverage_threshold 0.95 -t 15 --lib_module /HiTE/library --TE_lib /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/confident_TE.cons.fa --rm2_script /HiTE/bin/get_family_summary_paper.sh --rm2_strict_script /HiTE/bin/get_family_summary_paper_0.99.sh -r /public/home/jwli1/data1/Brasscia_2024/02.TE/HiTE/BP106A/BP106A.chrplus.genome.fasta --species test
2024-07-25 08:46:09,123 - /HiTE/module/benchmarking.py[line:131] - DEBUG: Skip benchmarking of RepeatModeler2
2024-07-25 08:46:09,124 - /HiTE/module/benchmarking.py[line:168] - DEBUG: Skip benchmarking of EDTA
2024-07-25 08:46:09,124 - /HiTE/module/benchmarking.py[line:184] - DEBUG: Skip benchmarking of HiTE
2024-07-25 08:46:09,155 - main.py[line:643] - INFO: Running time of step6: 2.902936 s
2024-07-25 08:46:56,121 - main.py[line:653] - INFO: Running time of the whole pipeline: 42547.76 s

jwli-code commented 1 month ago

Does the software generate a BLAST index and then delete it, so that the error occurs in a subsequent step?
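
One way to check this is to look for the index files next to the library named in the error message. The snippet below is only a rough sketch, not part of HiTE: the helper names `has_protein_blast_index` and `makeblastdb_command` are made up for illustration, and it assumes the usual BLAST+ convention that a protein database has a `.pin` index file (or a `.pal` alias file) beside the FASTA file. Whether HiTE itself builds or removes this index at runtime is not something the log shows.

```python
from pathlib import Path

# Suffixes BLAST+ looks for when opening a protein database:
# ".pin" is the index file, ".pal" is an alias file.
PROTEIN_INDEX_SUFFIXES = (".pin", ".pal")


def has_protein_blast_index(db_path):
    """Return True if an index or alias file exists next to db_path."""
    db = Path(db_path)
    candidates = [db.parent / (db.name + suffix) for suffix in PROTEIN_INDEX_SUFFIXES]
    # Multi-volume databases use names such as LINEPeps.lib.00.pin
    candidates += list(db.parent.glob(db.name + ".[0-9][0-9].pin"))
    return any(p.is_file() for p in candidates)


def makeblastdb_command(db_path):
    """Build (but do not run) the command that would create a protein index."""
    return ["makeblastdb", "-in", str(db_path), "-dbtype", "prot"]


if __name__ == "__main__":
    db = "/HiTE/library/LINEPeps.lib"  # path copied from the error message
    if has_protein_blast_index(db):
        print("index found for", db)
    else:
        print("no index found; it could be built with:")
        print(" ".join(makeblastdb_command(db)))
```

This would have to be run inside the container (for example through `singularity exec`) so that `/HiTE/library` is visible. Since a SIF image is normally mounted read-only, rebuilding the index would probably need a writable copy of the library directory bound into the container.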

CSU-KangHu commented 1 month ago

Hi @jwli-code,

Could you please provide your complete run command? I just tested the SIF file and encountered no errors.

Best regards, Kang

jwli-code commented 1 month ago

singularity run -B /public/home/00.data:/public/home/00.data -B /public/home/HiTE:/public/home/HiTE --pwd /HiTE /public/home/HiTE/HiTE.sif python main.py --genome /public/home/00.data/${line}.chrplus.genome.fasta --thread 15 --outdir /public/HiTE/${line} --annotate 1 --plant 1 --intact_anno 1

CSU-KangHu commented 1 month ago

My HPC network is currently unstable, so I am unable to re-download the image for testing at this moment. I will re-download and test it later. Thank you for your patience.

jwli-code commented 1 month ago

Is there a more convenient way to download the SIF file, so that errors caused by network problems can be avoided?

CSU-KangHu commented 1 month ago

I tried downloading HiTE and other images from Docker Hub, but all attempts failed, possibly due to network issues with Docker Hub. You might want to try running HiTE using Conda instead, as the installation commands are quite straightforward.

jwli-code commented 1 month ago

Thank you very much