bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Unable to run poppunk_assign or --fit-model on E. coli database #188

Closed matnguyen closed 2 years ago

matnguyen commented 2 years ago

Versions

poppunk 2.4.0
poppunk_sketch 1.7.4

Command used and output returned

Downloaded the E. coli database from https://figshare.com/articles/dataset/PopPUNK_databases/6683624?file=12208811

Ran:

poppunk_assign --db ecoli_poppunk --query rfiles.txt --distances ecoli.h5 --output ecoli --threads 32

Describe the bug

Traceback (most recent call last):
  File "/home/mnguyen/.conda/envs/poppunk-2.4.0/bin/poppunk_assign", line 11, in <module>
    sys.exit(main())
  File "/home/mnguyen/.conda/envs/poppunk-2.4.0/lib/python3.9/site-packages/PopPUNK/assign.py", line 519, in main
    assign_query(dbFuncs,
  File "/home/mnguyen/.conda/envs/poppunk-2.4.0/lib/python3.9/site-packages/PopPUNK/assign.py", line 106, in assign_query
    model = loadClusterFit(model_file + '.pkl',
  File "/home/mnguyen/.conda/envs/poppunk-2.4.0/lib/python3.9/site-packages/PopPUNK/models.py", line 92, in loadClusterFit
    fit_object, fit_type = pickle.load(pickle_obj)
ModuleNotFoundError: No module named 'sklearn.mixture.bayesian_mixture'

I have also tried to refit the model from the downloaded database, but a required file seems to be missing:

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'ecoli_poppunk/ecoli_poppunk.h5', errno = 2, error message = 'No such file or directory', flags = 1, o_flags = 2)
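The path in this error suggests that an .h5 file named after the database directory is expected inside that directory. A small pre-flight check could look like the sketch below. The naming pattern is an assumption inferred from the error message only, and the helper names are hypothetical, not part of PopPUNK.

```python
from pathlib import Path


def expected_sketch_file(db_dir):
    """Return the .h5 path implied by the error message: <db_dir>/<db_dir name>.h5.

    Assumption: this pattern is inferred from the FileNotFoundError above,
    not taken from PopPUNK's documentation.
    """
    db = Path(db_dir)
    return db / f"{db.name}.h5"


def missing_files(db_dir):
    """List the expected files that are absent from the database directory."""
    path = expected_sketch_file(db_dir)
    return [] if path.is_file() else [str(path)]
```

For example, missing_files("ecoli_poppunk") returns an empty list only if ecoli_poppunk/ecoli_poppunk.h5 exists.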

johnlees commented 2 years ago

Sorry, that's an old version of the database. Please try:

https://imperialcollegelondon.box.com/s/6koqlrlvtw7a19och2dbel59x9j32r79 (references only, smaller)

https://imperialcollegelondon.box.com/s/8irebd9hrguzmtgcvm8zp9f647k5iqzo (all samples, larger, recommended only if you want to update the database)
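For context on why an old database can fail this way: a pickle records the module path of each class it contains, and loading fails with ModuleNotFoundError if that path cannot be imported in the current environment. That is presumably what happened with 'sklearn.mixture.bayesian_mixture' in the traceback above. A minimal sketch of the mechanism, using a hypothetical module name (old_pkg_layout) rather than scikit-learn itself:

```python
import pickle
import sys
import types


def pickle_with_vanished_module():
    """Pickle an object whose class lives in a module that is then removed."""
    # Hypothetical module standing in for a path that existed when the
    # pickle was written but is gone in the current installation.
    mod = types.ModuleType("old_pkg_layout")

    class Model:
        pass

    # Make the class look as if it were defined at the top level of that module.
    Model.__module__ = "old_pkg_layout"
    Model.__qualname__ = "Model"
    mod.Model = Model

    sys.modules["old_pkg_layout"] = mod
    payload = pickle.dumps(Model())
    # Simulate the module path disappearing after a library upgrade.
    del sys.modules["old_pkg_layout"]
    return payload


def try_load(payload):
    """Return the error message if unpickling fails, otherwise None."""
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as err:
        return str(err)
    return None
```

Here try_load(pickle_with_vanished_module()) returns a "No module named ..." message of the same form as the one in the traceback. A database whose model was pickled against the currently installed library versions avoids the problem, which is why a newer database is the suggested fix.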

johnlees commented 2 years ago

Closing the issue for now; let us know if further help is required.