bcgsc / NanoSim

Nanopore sequence read simulator

ValueError: operands could not be broadcast together with shapes #126

Closed 865699871 closed 2 years ago

865699871 commented 3 years ago

Process Process-1:
Traceback (most recent call last):
  File "/home/shgao/.conda/envs/NanoSim/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/shgao/.conda/envs/NanoSim/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "simulator.py", line 1176, in simulation_aligned_genome
    ref_lengths = total_lengths - remainder_lengths
ValueError: operands could not be broadcast together with shapes (39610,) (51448,)

865699871 commented 3 years ago

59 Jun 10 20:40 simulated_aligned_error_profile
0 Jun 10 20:40 simulated_aligned_reads.fastq
27106870 Jun 10 20:43 simulated_unaligned_reads.fastq

cheny19 commented 3 years ago

Could you paste the command that leads to this error?

865699871 commented 3 years ago

Thank you for your prompt reply, @cheny19. I installed NanoSim from conda cloud.

Here are the commands for read_analysis and simulator:

/home/shgao/.conda/envs/NanoSim/bin/python /home/shgao/.conda/envs/NanoSim/bin/read_analysis.py genome -i $workdir'/ont.fastq' -rg $workdir'/ref1.fna' -t 16 -o $workdir'/train'

/home/shgao/.conda/envs/NanoSim/bin/python simulator.py genome -rg $REF -c $workdir'/train' -med 10000 -sd 0.5 -n 20000 -o $workdir'/simulated'

Here is some log info:

2021-06-10 21:17:03: Read in reference
2021-06-10 21:17:03: Read error profile
2021-06-10 21:17:04: Read KDF of unaligned reads
2021-06-10 21:17:19: Read KDF of aligned reads
2021-06-10 21:17:19: Read chimeric simulation information
2021-06-10 21:17:19: Simulating read length with log-normal distribution
2021-06-10 21:17:19: Start simulation of aligned reads

2021-06-10 21:17:20: Start simulation of random reads
2021-06-10 21:19:24: Number of reads simulated >> 300
2021-06-10 21:20:00: Finished!

Here is the error report. If I use multiple cores, every process fails with this error.

Process Process-1:
Traceback (most recent call last):
  File "/home/shgao/.conda/envs/NanoSim/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/shgao/.conda/envs/NanoSim/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "simulator.py", line 1176, in simulation_aligned_genome
    ref_lengths = total_lengths - remainder_lengths
ValueError: operands could not be broadcast together with shapes (39610,) (51447,)

cheny19 commented 3 years ago

Could you try cloning the GitHub repo and give it another shot? The release version on Conda has a few small bugs. None of them produces this error message, but I'd suggest giving the latest commit a try to see if it solves the problem.

865699871 commented 3 years ago

Thank you, Chen. I read the code for version 3.0. In simulator.py, in simulation_aligned_genome():

remainder_lengths = get_length_kde(kde_ht, int(remaining_reads * 1.3), True)
remainder_lengths = [x for x in remainder_lengths if x >= 0]

After this filter, the length of remainder_lengths may no longer match remaining_reads.

total_lengths = np.random.lognormal(np.log(median_l + sd_l ** 2 / 2), sd_l, remaining_reads)

The shape of total_lengths is the same as remaining_reads, but not the same as remainder_lengths.

So the error occurs in:

ref_lengths = total_lengths - remainder_lengths

I checked this code in the latest commit and found it is the same.

However, version 2.6 is different from 3.0:

remainder_l = get_length_kde(kde_ht, i, True)

There is no filter after this.

total_l = np.random.lognormal(np.log(median_l + sd_l ** 2 / 2), sd_l, i)
ref_l = total_l - remainder_l

Is this the reason? I will try version 2.6.
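The mismatch described in this comment can be reproduced in isolation. The sketch below is only an illustration: `sample_remainders` is a hypothetical stand-in for `get_length_kde(kde_ht, n, True)` (the real function samples from a trained KDE), and the read count is made up. It oversamples by 1.3, filters out negative values, and then attempts the same subtraction.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_remainders(n):
    # Hypothetical stand-in for get_length_kde(kde_ht, n, True);
    # like a KDE sample, it can return some negative values.
    return rng.normal(loc=50.0, scale=40.0, size=n)

remaining_reads = 1000          # illustrative value, not from the issue
median_l, sd_l = 10000, 0.5     # -med and -sd from the simulator command

# Oversample, then drop negatives: the surviving count is data-dependent.
remainder_lengths = sample_remainders(int(remaining_reads * 1.3))
remainder_lengths = [x for x in remainder_lengths if x >= 0]

# Exactly remaining_reads values are drawn here.
total_lengths = rng.lognormal(np.log(median_l + sd_l ** 2 / 2), sd_l, remaining_reads)

try:
    ref_lengths = total_lengths - remainder_lengths
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(len(total_lengths), len(remainder_lengths))
print(mismatch_error)
```

Because the two operands have different lengths and neither has length 1, NumPy cannot broadcast them, which matches the shape of the error in the traceback.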

cheny19 commented 3 years ago

Hi @865699871,

Thanks for looking into this problem. I have just fixed the bug and pushed the fix to GitHub. Please try the latest commit and see how it goes.

Thanks, Chen
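The thread does not show what the fix changed. As a rough sketch of one way this kind of length mismatch can be avoided (not necessarily what NanoSim does), the filtered remainders can be topped up and truncated to exactly `remaining_reads` before subtracting. The helper `paired_lengths` and the sampler passed to it are hypothetical names used only for illustration.

```python
import numpy as np

def paired_lengths(sample_remainders, remaining_reads, median_l, sd_l, rng):
    """Return (total, remainder, ref) arrays that all have remaining_reads entries.

    sample_remainders(n) is a hypothetical callable returning n candidate
    remainder lengths, some of which may be negative.
    """
    kept = np.empty(0)
    # Keep drawing until enough non-negative values survive the filter.
    # Assumes the sampler yields non-negative values with non-zero probability.
    while kept.size < remaining_reads:
        draws = np.asarray(sample_remainders(int(remaining_reads * 1.3) + 1))
        kept = np.concatenate([kept, draws[draws >= 0]])
    remainder = kept[:remaining_reads]  # truncate to the exact count
    total = rng.lognormal(np.log(median_l + sd_l ** 2 / 2), sd_l, remaining_reads)
    return total, remainder, total - remainder

rng = np.random.default_rng(1)
sampler = lambda n: rng.normal(50.0, 40.0, n)  # illustrative sampler only
total, remainder, ref = paired_lengths(sampler, 1000, 10000, 0.5, rng)
print(total.shape, remainder.shape, ref.shape)
```

This only guarantees matching shapes. It does not check that each resulting `ref` value is positive, which would need separate handling.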

SaberHQ commented 2 years ago

Hey @865699871, I hope @cheny19's solution was helpful and you were able to use NanoSim for your analysis. Please clone the latest version of NanoSim.

I am closing this issue for now. If you still need help, please feel free to reopen it, and we will be more than happy to help you with anything. Thanks.