bcgsc / NanoSim

Nanopore sequence read simulator

head_vs_ht_ratio = head_vs_ht_ratio_list[each_read] IndexError: list index out of range #228

Open huangxin0221 opened 1 month ago

huangxin0221 commented 1 month ago

My command is:

simulator.py genome --ref_g $file \
    --model_prefix ${Model_Output} \
    --output ${output_dir1} \
    --number 300 \
    --max_len ${max_len} \
    --fastq --num_threads 1

But the following error occurs (latest version of NanoSim):

Traceback (most recent call last):
  File ".conda/envs/NanoSim/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File ".conda/envs/NanoSim/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File ".conda/envs/NanoSim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range

lcoombe commented 1 month ago

Hi @huangxin0221,

Could you please provide us with more information, including:

Thank you for your interest in NanoSim!
Lauren

RunpengLuo commented 1 month ago

Hi @lcoombe,

I have the same issue. I'm using NanoSim version 3.2.1 (installed via Conda on Linux 64), and this is the command I used:

simulator.py genome -n 40000 -min 10000 --seed 0 --fastq -t 8 \
    -rg ${CL0_CHR1A} -c ${PRETRAIN_MODEL} -o ${OUTDIR}

My genome is about 373 MB; it is a simulated genome derived from GRCh38 reference chr1. I used the pretrained model human_giab_hg002_sub1M_kitv14_dorado_v3.2.1. I've attached the output below. I tried both 8 and 16 threads, and both give the same error.

Thanks for your help! John

running the code with following parameters:

ref_g ~/sim_grch38_chr1_simple/fasta/clone0.paternal.fa
model_prefix ~/NanoSim/pre-trained_models/human_giab_hg002_sub1M_kitv14_dorado_v3.2.1/training
out ~/sim_grch38_chr1_simple/fastq/round1/sim_error_clone0A_10x/error_clone0A_10x
number [40000]
perfect False
homopolymer False
dna_type linear
strandness None
sd_len None
median_len None
max_len inf
min_len 10000
fastq True
chimeric False
num_threads 8
2024-10-07 05:47:31: ~/miniconda3/envs/nanosim/bin/simulator.py genome -n 40000 -min 10000 --seed 0 --fastq -t 8 -rg ~/sim_grch38_chr1_simple/fasta/clone0.paternal.fa -c ~/NanoSim/pre-trained_models/human_giab_hg002_sub1M_kitv14_dorado_v3.2.1/training -o ~/sim_grch38_chr1_simple/fastq/round1/sim_error_clone0A_10x/error_clone0A_10x
2024-10-07 05:47:31: Read in reference 
2024-10-07 05:47:32: Read error profile
2024-10-07 05:47:32: Read KDF of unaligned reads
~/miniconda3/envs/nanosim/lib/python3.7/site-packages/sklearn/base.py:318: UserWarning: Trying to unpickle estimator KernelDensity from version 0.23.2 when using version 0.22.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
2024-10-07 05:47:32: Read KDF of aligned reads
2024-10-07 05:47:32: Read chimeric simulation information
2024-10-07 05:47:32: Start simulation of aligned reads
Process Process-7:0: Number of reads simulated >> 30001
Traceback (most recent call last):
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "~/miniconda3/envs/nanosim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range
Process Process-8:
Traceback (most recent call last):
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "~/miniconda3/envs/nanosim/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "~/miniconda3/envs/nanosim/bin/simulator.py", line 1349, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range

2024-10-07 05:49:53: Start simulation of random reads

2024-10-07 05:49:55: Finished!
lcoombe commented 1 month ago

Hi @RunpengLuo, thanks for the detailed information and log. It was very helpful for tracing the issue!

I have a tentative fix in #233, which will hopefully be merged into the master branch later today. I will also post an update when the fix is included in a new release, but feel free to test that code in the meantime to see if it fixes your error!

lcoombe commented 1 month ago

The fix has been included in the newly released v3.2.2.
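
For readers hitting a similar traceback elsewhere: the error above means a list was indexed at a position equal to or beyond its length. The sketch below reproduces that general pattern in isolation and shows one way to fail earlier with a clearer message. It is not NanoSim's code and makes no claim about the actual cause or the fix in #233; the function names `simulate_reads` and `simulate_reads_checked` and the sample values are hypothetical, chosen only for illustration.

```python
def simulate_reads(ratio_list, n_reads):
    """Toy stand-in for a per-read loop (hypothetical, not NanoSim's code)."""
    out = []
    for each_read in range(n_reads):
        # Raises IndexError as soon as each_read reaches len(ratio_list).
        out.append(ratio_list[each_read])
    return out


def simulate_reads_checked(ratio_list, n_reads):
    """Same loop, but validate the sizes up front and report both of them."""
    if n_reads > len(ratio_list):
        raise ValueError(
            f"need {n_reads} per-read values, "
            f"only {len(ratio_list)} were precomputed"
        )
    return [ratio_list[i] for i in range(n_reads)]


if __name__ == "__main__":
    ratios = [0.5, 0.4, 0.6]  # hypothetical precomputed per-read values

    print(simulate_reads(ratios, 3))  # list is long enough: works

    try:
        simulate_reads(ratios, 4)  # one read more than the list holds
    except IndexError as exc:
        print("IndexError:", exc)

    try:
        simulate_reads_checked(ratios, 4)
    except ValueError as exc:
        print("ValueError:", exc)
```

The checked variant trades a late, anonymous IndexError inside a worker process for an early error that names the two mismatched sizes, which tends to make logs like the one above easier to interpret.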