bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Problem calculating distances with mash #67

Closed nickjcroucher closed 4 years ago

nickjcroucher commented 4 years ago

While trying to calculate mash distances for comparison with the sketchlib output (which had worked fine on the same input seconds earlier, in easy-run mode), I encountered the error below. Could there be an issue with the new regression function when it is given mash output?

Calculating core and accessory distances
Fitting k-mer curve failed: Residuals are not finite in the initial point.
With mash input [0.061,0.021,0.007,0.002,0.002,0.001,0.   ]
Check for low quality input genomes
Fitting k-mer curve failed: Residuals are not finite in the initial point.
With mash input [0.051,0.013,0.016,0.007,0.001,0.001,0.   ]
Check for low quality input genomes
/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py:249: LostExceptionType: Type information of Unpicklable exception 0 is lost
  warnings.warn("Type information of Unpicklable exception %s is lost" % reason, LostExceptionType)
Traceback (most recent call last):
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 415, in get
    return Q.get(timeout=1)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/multiprocessing/queues.py", line 105, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 423, in get
    return Q.get(timeout=0)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/multiprocessing/queues.py", line 105, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 757, in map
    capsule = pg.get(R)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 425, in get
    raise StopProcessGroup
sharedmem.sharedmem.StopProcessGroup: StopProcessGroup

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "../PopPUNK/poppunk-runner.py", line 10, in <module>
    main()
  File "/Users/nicholascroucher/Documents/PopPUNK/viruses/virus_poppunk/PopPUNK/PopPUNK/__main__.py", line 290, in main
    threads=args.threads)
  File "/Users/nicholascroucher/Documents/PopPUNK/viruses/virus_poppunk/PopPUNK/PopPUNK/mash.py", line 610, in queryDatabase
    pool.map(partial(fitKmerBlock, distMat=distMat, raw = raw, klist=klist, jacobian=jacobian), mat_chunks)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 761, in map
    raise pg.get_exception()
sharedmem.sharedmem.SlaveException: 0
Traceback (most recent call last):
  File "/Users/nicholascroucher/Documents/PopPUNK/viruses/virus_poppunk/PopPUNK/PopPUNK/mash.py", line 661, in fitKmerCurve
    bounds=([-np.inf, -np.inf], [0, 0]))
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/scipy/optimize/_lsq/least_squares.py", line 814, in least_squares
    raise ValueError("Residuals are not finite in the initial point.")
ValueError: Residuals are not finite in the initial point.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 294, in _slaveMain
    self.main(self, *self.args)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 628, in _main
    r = realfunc(work)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/sharedmem/sharedmem.py", line 703, in realfunc
    else: return func(i)
  File "/Users/nicholascroucher/Documents/PopPUNK/viruses/virus_poppunk/PopPUNK/PopPUNK/mash.py", line 633, in fitKmerBlock
    distMat[start:end, :] = np.apply_along_axis(fitKmerCurve, 1, raw[start:end, :], klist, jacobian)
  File "<__array_function__ internals>", line 6, in apply_along_axis
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/site-packages/numpy/lib/shape_base.py", line 402, in apply_along_axis
    buff[ind] = asanyarray(func1d(inarr_view[ind], *args, **kwargs))
  File "/Users/nicholascroucher/Documents/PopPUNK/viruses/virus_poppunk/PopPUNK/PopPUNK/mash.py", line 668, in fitKmerCurve
    exit(0)
  File "/Users/nicholascroucher/miniconda3/envs/poppunk/lib/python3.7/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
SystemExit: 0

johnlees commented 4 years ago

Hmm, this is probably because the proportion of matches at k = 29 is 0, and log(0) is undefined, so the fit raises an error. Try lowering the max k-mer size?
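To illustrate the diagnosis, here is a minimal sketch (not PopPUNK's actual code) that reproduces the same `ValueError` from `scipy.optimize.least_squares` when a zero match proportion is log-transformed. The match proportions are copied from the first failing pair in the log; the k-mer sizes (`klist`), the starting point `x0`, and the helper names `residuals` and `fit_log_linear` are assumptions made for illustration, since the issue only states that the largest k is 29. Only the `bounds` argument is taken from the traceback.

```python
import numpy as np
from scipy.optimize import least_squares

# Match proportions from the first failing pair in the log above.
pr = np.array([0.061, 0.021, 0.007, 0.002, 0.002, 0.001, 0.0])
# ASSUMPTION: seven k-mer sizes ending at 29; the real list is not given.
klist = np.arange(11, 30, 3)  # 11, 14, ..., 29


def residuals(params, k, log_pr):
    """Residuals of a straight line in log space (hypothetical helper)."""
    intercept, slope = params
    return intercept + slope * k - log_pr


def fit_log_linear(pr, klist):
    """Fit log(pr) against k with both parameters bounded above by 0."""
    with np.errstate(divide="ignore"):
        log_pr = np.log(pr)  # log(0) becomes -inf
    return least_squares(
        residuals,
        x0=[-0.01, -0.01],  # ASSUMPTION: arbitrary feasible starting point
        args=(klist, log_pr),
        bounds=([-np.inf, -np.inf], [0, 0]),
    )


# With the zero at k = 29 included, the initial residuals contain inf.
try:
    fit_log_linear(pr, klist)
except ValueError as err:
    print("Fit failed:", err)

# Dropping the k-mer sizes with no matches lets the fit run.
keep = pr > 0
fit = fit_log_linear(pr[keep], klist[keep])
print("Fitted intercept and slope:", fit.x)
```

Masking out the zero entries has the same effect as lowering the maximum k-mer size for this pair: the fit only sees k values at which some k-mers still match.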