EMBL-PKU / BASALT

MIT License

IndexError: list index out of range #8

Open eperezv opened 5 months ago

eperezv commented 5 months ago

Hello,

I'm trying BASALT with a subset of my data. It seemed to work fine until the step "Comparing bins before and after refining process", which failed with the error in the title.


Comparing bins with retrieved bins  
Parsing BestBinset_outlier_refined_filtrated_retrieved_checkm output  
Comparing bins before and after refining process  
Traceback (most recent call last):  
 File "/home/eduardo/miniconda3/envs/BASALT/bin/BASALT", line 141, in <module>  
   BASALT_main_c(assembly_list, datasets, num_threads, lr_list, hifi_list, hic_list, eb_list, ram, continue_mode, functional_module, autobining_parameters, refinement_paramter, max_ctn, min_cpn, pwd, QC_software)
 File "/home/eduardo/miniconda3/envs/BASALT/bin/BASALT_main_c.py", line 460, in BASALT_main_c  
   Contig_recruiter_main(best_binset_from_multi_assemblies, outlier_remover_folder, num_threads, continue_mode, min_cpn, max_ctn, assembly_mo_list, connections_list, coverage_matrix_list, refinement_paramter, pwd)
 File "/home/eduardo/miniconda3/envs/BASALT/bin/S6_retrieve_contigs_from_PE_contigs_checkm.py", line 1551, in Contig_recruiter_main  
   parse_bin_in_bestbinset(assemblies_list, binset+'_filtrated', outlier_remover_folder, PE_connections_list, num_threads, last_step, coverage_matrix_list, refinement_mode)  
 File "/home/eduardo/miniconda3/envs/BASALT/bin/S6_retrieve_contigs_from_PE_contigs_checkm.py", line 1417, in parse_bin_in_bestbinset  
   bin_comparison(str(binset), bins_checkm, str(binset)+'_retrieved', refinement_mode, num_threads)  
 File "/home/eduardo/miniconda3/envs/BASALT/bin/S6_retrieve_contigs_from_PE_contigs_checkm.py", line 378, in bin_comparison  
   taxon=str(line).strip().split('lineage')[1].split('\'')[2].strip()  
IndexError: list index out of range

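Note for anyone hitting the same traceback: the failing line chains `split('lineage')[1]` and `split('\'')[2]`, so it raises `IndexError` whenever a line has no `lineage` text or has no quoted value after it. Below is a minimal sketch of a more tolerant parser. The helper name `extract_taxon` and the sample line format are assumptions for illustration only; they are not part of BASALT, and the real CheckM output lines may look different.

```python
def extract_taxon(line, default=None):
    """Return the quoted value that follows 'lineage', or `default`.

    Hypothetical helper: it mirrors the indexing used in the failing
    line but checks each split result before indexing into it.
    """
    parts = str(line).strip().split('lineage')
    if len(parts) < 2:
        return default  # no 'lineage' text on this line
    quoted = parts[1].split("'")
    if len(quoted) < 3:
        return default  # nothing quoted after 'lineage'
    return quoted[2].strip()


# Assumed example of a dict-like line; the real file may differ.
sample = "bin.1\t{'marker lineage': 'k__Bacteria', '# genomes': 5449}"
print(extract_taxon(sample))
print(extract_taxon("", default="unknown"))
```

Returning a default instead of raising lets the caller decide whether to skip the bin or stop with a clearer message about which line could not be parsed.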
eperezv commented 4 months ago

Hi, the issue I reported before is now resolved (I ran it again and it didn't fail). However, I'm now running into another issue:

Judging 2_C1-c2-c3-MEGAHIT.assembly.fa_0.3_maxbin2_genomes.496.fa
Judging 2_C1-c2-c3-MEGAHIT.assembly.fa_0.3_maxbin2_genomes.123.fa
Judging 2_C1-c2-c3-MEGAHIT.assembly.fa_0.3_maxbin2_genomes.488.fa
Judging 2_C1-c2-c3-MEGAHIT.assembly.fa_0.3_maxbin2_genomes.36.fa
Judging 2_C1-c2-c3-MEGAHIT.assembly.fa_0.3_maxbin2_genomes.216.fa
Judging 1_C1-2-3-MHM2.contigs.fa_200_concoct_genomes.8.fa
Re-mapping
Settings:
  Output files: "Remapping.fasta.*.bt2"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 4 (one in 16)
  FTable chars: 10
  Strings: unpacked
  Max bucket size: default
  Max bucket size, sqrt multiplier: default
  Max bucket size, len divisor: 4
  Difference-cover sample period: 1024
  Endianness: little
  Actual local endianness: little
  Sanity checking: disabled
  Assertions: disabled
  Random seed: 0
  Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
  Remapping.fasta
Warning: Empty fasta file: 'Remapping.fasta'
Warning: All fasta inputs were empty
Total time for call to driver() for forward index: 00:00:00
Error: Encountered internal Bowtie 2 exception (#1)
Command: /home/eduardo/miniconda3/envs/BASALT/bin/bowtie2-build-s --wrapper basic-0 Remapping.fasta Remapping.fasta 
Mapping ['C1_1.clean2.fq', 'C1_2.clean2.fq']
(ERR): "Remapping.fasta" does not exist or is not a Bowtie 2 index
Exiting now ...
[E::hts_open_format] Failed to open file 1.sam
samtools view: failed to open "1.sam" for reading: No such file or directory
[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files
Usage: samtools sort [options...] [in.bam]
Options:
  -l INT     Set compression level, from 0 (uncompressed) to 9 (best)
  -m INT     Set maximum memory per thread; suffix K/M/G recognized [768M]
  -n         Sort by read name
  -t TAG     Sort by value of TAG. Uses position as secondary index (or read name if -n is set)
  -o FILE    Write final output to FILE rather than standard output
  -T PREFIX  Write temporary files to PREFIX.nnnn.bam
      --input-fmt-option OPT[=VAL]
               Specify a single input file format option in the form
               of OPTION or OPTION=VALUE
  -O, --output-fmt FORMAT[,OPT[=VAL]]...
               Specify output format (SAM, BAM, CRAM)
      --output-fmt-option OPT[=VAL]
               Specify a single output file format option in the form
               of OPTION or OPTION=VALUE
      --reference FILE
               Reference sequence FASTA FILE [null]
  -@, --threads INT
               Number of additional threads to use [0]
Samtools sorting 1.bam failed. Redoing
[E::hts_open_format] Failed to open file 1.bam
samtools sort: can't open "1.bam": No such file or directory
rm: cannot remove '1.sam': No such file or directory
Output depth matrix to Re-mapped_depth.txt
Output matrix to Re-mapped_depth.txt
Opening 1 bams
[E::hts_open] fail to open file '1_sorted.bam'
Consolidating headers
Segmentation fault (core dumped)
Output depth matrix to Re-mapped_depth.txt
Output matrix to Re-mapped_depth.txt
Opening 1 bams
[E::hts_open] fail to open file '1_sorted.bam'
Consolidating headers
Segmentation fault (core dumped)
rm: cannot remove '1_sorted.bam': No such file or directory
rm: cannot remove '*.bt2': No such file or directory
rm: cannot remove '1.bam': No such file or directory
Traceback (most recent call last):
  File "/home/eduardo/miniconda3/envs/BASALT/bin/BASALT", line 137, in <module>
    BASALT_main_d(assembly_list, datasets, num_threads, lr_list, hifi_list, hic_list, eb_list, ram, continue_mode, functional_module, autobining_parameters, refinement_paramter, max_ctn, min_cpn, pwd, QC_software)
  File "/home/eduardo/miniconda3/envs/BASALT/bin/BASALT_main_d.py", line 453, in BASALT_main_d
    outlier_remover_main('BestBinset', coverage_matrix_list, datasets, lr_list, hifi_list, assembly_mo_list, pwd, num_threads)
  File "/home/eduardo/miniconda3/envs/BASALT/bin/S5_Outlier_remover_DL_11012023.py", line 549, in outlier_remover_main
    A=outlier_predictor(depth_TNF_matrix, contigs_depth, bin_contigs, datasets, lr, hifi_list, num_threads, nx)
  File "/home/eduardo/miniconda3/envs/BASALT/bin/S5_Outlier_remover_DL_11012023.py", line 394, in outlier_predictor
    for line in open('Re-mapped_depth.txt','r'):
FileNotFoundError: [Errno 2] No such file or directory: 'Re-mapped_depth.txt'

I would appreciate any help solving it. Thank you.
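Reading the log top to bottom, the first failure is `Warning: Empty fasta file: 'Remapping.fasta'`; the bowtie2, samtools, and `Re-mapped_depth.txt` errors all appear to follow from that. The log does not show why the file is empty. As a rough sketch, a check like the one below could confirm whether a FASTA file contains any sequence before an index is built from it. The function name `fasta_has_sequence` is hypothetical and not part of BASALT.

```python
import os


def fasta_has_sequence(path):
    """Return True if `path` exists and holds at least one record
    (a '>' header followed by a non-empty sequence line)."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    seen_header = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith('>'):
                seen_header = True
            elif line and seen_header:
                return True  # found sequence under a header
    return False


if __name__ == "__main__":
    # File name taken from the log above.
    if not fasta_has_sequence("Remapping.fasta"):
        print("Remapping.fasta is missing or empty; the re-mapping step cannot succeed")
```

Running this in the working directory right after the "Re-mapping" message would show whether the problem starts with the file being written, rather than with bowtie2 or samtools.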

EMBL-PKU commented 3 months ago

Have you tried the latest version of BASALT? I may have fixed the problem there, but I am not quite sure. Please let me know if you still run into the same problem, and we will try to fix it.