zstephens / neat-genreads

NEAT read simulation tools

error with read length #80

Open alxsm opened 4 years ago

alxsm commented 4 years ago

Hi, I am using this software on coronavirus DNA, which is approximately 30,000 bp long, and I want some reads that cover the whole DNA. However, the maximum read length I am able to use is about 13,000; if I try to set a longer length, this error appears:

`100%
/home/alxsm/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/alxsm/.local/lib/python2.7/site-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

Error: weight or value vector given to DiscreteDistribution() are 0-length.

Traceback (most recent call last):
  File "genReads.py", line 749, in <module>
    main()
  File "genReads.py", line 580, in main
    coverage_avg = sequences.init_coverage(tuple(coverage_dat))
  File "/mnt/c/Users/alex/Desktop/TFGtodo/simulationsd/py/SequenceContainer.py", line 113, in init_coverage
    self.coverage_distribution.append(DiscreteDistribution(coverage_vals,range(len(coverage_vals))))
  File "/mnt/c/Users/alex/Desktop/TFGtodo/simulationsd/py/probability.py", line 23, in __init__
    asdf = intentional_crash[0]
NameError: global name 'intentional_crash' is not defined`

Do you know how I can fix it?

Thank you.

joshfactorial commented 4 years ago

Can you share the command you were trying to run that gave you the error?

zstephens commented 4 years ago

Indeed, we would likely need the specific command to debug further, but I'll mention that I have used the read simulator to generate single-end reads that encompass nearly the entire reference. It still requires a little bit of wiggle room, so I set the read length to 100 less than the reference length.

python genReads.py -r <ref> -R <readlen> -c <cov> -e models/errorModel_pacbio_toy.p -E <error_rate> -p 1 --force-coverage -o <out>
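As a rough sketch of the "100 less than the reference length" rule of thumb, the snippet below measures each sequence in a FASTA file and derives a candidate value for `-R`. The helper names `fasta_lengths` and `suggested_read_length` are hypothetical and not part of NEAT, and the margin of 100 is simply the figure mentioned in this comment, not a documented limit.

```python
import io


def fasta_lengths(handle):
    """Return {sequence_name: length} for an open FASTA file handle."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def suggested_read_length(ref_length, margin=100):
    """Hypothetical helper: reference length minus some wiggle room.

    The default margin of 100 is an assumption taken from the thread.
    """
    return max(ref_length - margin, 0)


if __name__ == "__main__":
    # Toy in-memory reference; replace with open("<ref>") for a real file.
    demo = io.StringIO(u">toy_ref demo\nACGTACGTAC\nACGTA\n")
    for seq_name, seq_len in fasta_lengths(demo).items():
        print(seq_name, seq_len, suggested_read_length(seq_len, margin=5))
```

For a 30,000 bp reference this would suggest trying `-R 29900`, though whether that works may also depend on the other options used.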

Does your reference contain any stretches of Ns? That would throw a wrench in the works.
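One quick way to check for stretches of Ns is a small script like the one below. It is only a sketch: `find_n_runs` is a hypothetical helper, not something provided by NEAT, and it assumes the sequence has already been read into a single string with the header and newlines removed.

```python
import re


def find_n_runs(sequence, min_len=1):
    """Return 0-based, half-open (start, end) intervals for runs of N/n.

    Only runs of at least `min_len` bases are reported (min_len >= 1).
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    toy = "ACGTNNNNACGNT"
    print(find_n_runs(toy))             # all runs of N
    print(find_n_runs(toy, min_len=2))  # only runs of two or more
```

An empty list means the sequence contains no N bases at all.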

alxsm commented 4 years ago

Hi again, sorry I couldn't connect earlier.

My reference doesn't contain any Ns, and the command I used is:

python2 genReads.py -r ./hapCovid/1.fa -R 20000 -o ./resultados/C1 --bam -e models/errorModel_pacbio_toy.p -c 5 -M 0.02

The reference contains 30,000 bp, and it only works with -R 13000 or less.