bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

AttributeError when refining a model #171

Closed mgalardini closed 3 years ago

mgalardini commented 3 years ago

Versions

poppunk 2.3.0
poppunk_sketch 1.7.3

Command used and output returned

$ poppunk --create-db --output database --r-files out/poppunk_input.txt --threads 24 --output poppunk --qc-filter continue --overwrite
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.7.3
         sketchlib: /fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 24 threads
Mode: Building new database from input sequences
Sketching 392 genomes using 24 thread(s)
Progress (CPU): 392 / 392
Writing sketches to file
Calculating random match chances using Monte Carlo
Calculating distances using 24 thread(s)
Progress (CPU): 100.0%

Done
$ poppunk --fit-model bgmm --ref-db poppunk --threads 24 --K 4
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.7.3
         sketchlib: /fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 24 threads
Mode: Fitting bgmm model to reference database

Fit summary:
        Avg. entropy of assignment      0.0130
        Number of components used       4

Scaled component means:
        [0.22146682 0.5829107 ]
        [0.4475713  0.63576613]
        [0.52116282 0.66185159]
        [0.00026122 0.2556857 ]

Network summary:
        Components      115
        Density 0.0364
        Transitivity    0.9952
        Score   0.9589
Removing 261 sequences

Done
$ poppunk --fit-model refine --ref-db poppunk --threads 24
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.7.3
         sketchlib: /fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 24 threads
Mode: Fitting refine model to reference database

Loading BGMM 2D Gaussian model
Loaded previous model of type: bgmm
Initial model-based network construction based on Gaussian fit
Initial boundary based network construction
Decision boundary starts at (0.03,0.30)
Trying to optimise score globally
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/PopPUNK/refine.py", line 179, in newNetwork
    boundary_assignments = pp_sketchlib.assignThreshold(distMat, slope, x_max, y_max, cpus)
AttributeError: module 'pp_sketchlib' has no attribute 'assignThreshold'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/fast-storage/miniconda3/envs/poppunk/bin/poppunk", line 10, in <module>
    sys.exit(main())
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/PopPUNK/__main__.py", line 363, in main
    assignments = new_model.fit(distMat, refList, model,
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/PopPUNK/models.py", line 609, in fit
    self.start_point, self.optimal_x, self.optimal_y, self.min_move, self.max_move = refineFit(X/self.scale,
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/site-packages/PopPUNK/refine.py", line 100, in refineFit
    global_s = pool.map(partial(newNetwork,
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/fast-storage/miniconda3/envs/poppunk/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
AttributeError: module 'pp_sketchlib' has no attribute 'assignThreshold'

Describe the bug

The example above should be self-explanatory. The error does not block me, since my model does not seem to need refining, but other users may need this step.
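For anyone hitting the same traceback, a quick check of whether the installed backend exposes the function that `refine.py` calls might look like the sketch below. The helper name `missing_attributes` is hypothetical (it is not part of PopPUNK or pp_sketchlib); the only names taken from the log above are the module `pp_sketchlib` and the attribute `assignThreshold`.

```python
import importlib
import importlib.util


def missing_attributes(module_name, required):
    """Return the names in `required` that the named module does not provide.

    Hypothetical helper for illustration only.
    Returns None when the module is not installed at all.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return [name for name in required if not hasattr(module, name)]


if __name__ == "__main__":
    # With the environment from this report, this would be expected to list
    # 'assignThreshold', matching the AttributeError in the traceback.
    print(missing_attributes("pp_sketchlib", ["assignThreshold"]))
```

An empty list means every requested name is present; a non-empty list points to a mismatch between the PopPUNK and pp_sketchlib versions installed in the environment.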

johnlees commented 3 years ago

Sorry, this is due to our less-than-ideal versioning of pp-sketchlib and PopPUNK. Refining would work with PopPUNK >= v2.4.0, or with pp-sketchlib <= 1.6.2. I will try to fix the version pinning in the next conda release.
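The compatibility rule stated in this comment can be written down as a small check. This is only a sketch of that one sentence, not an official compatibility table: the function names are hypothetical, and the parser assumes plain `X.Y.Z` version strings (optionally prefixed with `v`) with no pre-release suffixes.

```python
def parse_version(text):
    """Turn a string such as 'v2.4.0' or '1.7.3' into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def refine_expected_to_work(poppunk_version, sketchlib_version):
    """Apply the rule from the comment above (hypothetical helper).

    The combination is expected to work if PopPUNK is at least 2.4.0,
    or if pp-sketchlib is at most 1.6.2.
    """
    return (
        parse_version(poppunk_version) >= (2, 4, 0)
        or parse_version(sketchlib_version) <= (1, 6, 2)
    )


if __name__ == "__main__":
    # The versions reported in this issue: poppunk 2.3.0 with poppunk_sketch 1.7.3.
    print(refine_expected_to_work("2.3.0", "1.7.3"))
```

Under this rule the reported pair (2.3.0 with 1.7.3) falls outside both conditions, which is consistent with the error shown in the report.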

mgalardini commented 3 years ago

Ah, I see. Thanks for your reply; I'll try updating poppunk then.

mgalardini commented 3 years ago

I can confirm that updating to poppunk 2.4.0 solved the issue.