bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Append distance based QC fails to qcreport #233

Closed by jdaeth274 1 year ago

jdaeth274 commented 1 year ago

Hi guys

More of a feature request. Would it be possible to append the IDs of isolates that fail QC based on --max-pi-dist and --max-a-dist to the output qcreport.txt file, alongside the isolates that fail the length and Ns QC? Currently, the IDs of these distance-based outliers are only printed to the screen after the distance calculation in --create-db mode.

That's for me running:

johnlees commented 1 year ago

Would it be possible to give it a go with v2.5.0 of PopPUNK? We re-did the QC command: https://poppunk.readthedocs.io/en/latest/qc.html

If that's not doing what you want, let us know and we'll add this!
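For readers following the original request, appending distance-based failures to an existing report could be sketched roughly as below. This is only an illustration, not PopPUNK's code: the function name `append_qc_failures` and the tab-separated "isolate, reason" line layout are assumptions, and the real qcreport.txt format may differ.

```python
import os
import tempfile


def append_qc_failures(report_path, failed_ids, reason):
    """Append one 'isolate<TAB>reason' line per failed isolate.

    Hypothetical helper: the line layout is an assumption, not
    PopPUNK's documented qcreport.txt format.
    Returns the number of lines appended.
    """
    written = 0
    # Mode "a" keeps any existing entries (e.g. length or Ns failures).
    with open(report_path, "a", encoding="utf-8") as report:
        for isolate in failed_ids:
            report.write(f"{isolate}\t{reason}\n")
            written += 1
    return written


if __name__ == "__main__":
    # Demonstrate on a throwaway file with made-up isolate names.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "qcreport.txt")
        append_qc_failures(path, ["isolate_A"], "sequence too short")
        append_qc_failures(path, ["isolate_B"], "core distance above --max-pi-dist")
        with open(path, encoding="utf-8") as report:
            print(report.read(), end="")
```

Opening the file in append mode means the distance-based entries land after whatever earlier QC steps have already written, which is the behaviour asked for above.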

jdaeth274 commented 1 year ago

Thanks John

Sorry, I'm now running into an error after upgrading to v2.5.0. It seems to come from HDBSCAN; I'm not sure whether this is related to issue #213.

Error message

$ poppunk --create-db --r-files cdip_genbank_refseq_ukhsa_836_pop.tsv --output cdip_836_poppunk_2.5 --max-pi-dist 0.03 
Traceback (most recent call last):
  File "/home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env/bin/poppunk", line 11, in <module>
    sys.exit(main())
  File "/home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env/lib/python3.10/site-packages/PopPUNK/__main__.py", line 213, in main
    from .models import loadClusterFit, BGMMFit, DBSCANFit, RefineFit, LineageFit
  File "/home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env/lib/python3.10/site-packages/PopPUNK/models.py", line 19, in <module>
    import hdbscan
  File "/home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env/lib/python3.10/site-packages/hdbscan/__init__.py", line 1, in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
  File "/home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env/lib/python3.10/site-packages/hdbscan/hdbscan_.py", line 509, in <module>
    memory=Memory(cachedir=None, verbose=0),
TypeError: Memory.__init__() got an unexpected keyword argument 'cachedir'

Versions

Conda list

# packages in environment at /home/phe.gov.uk/joshua.daeth/anaconda3/envs/poppunk_env:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
aom                       3.5.0                h27087fc_0    conda-forge
at-spi2-atk               2.38.0               h0630a04_3    conda-forge
at-spi2-core              2.40.3               h0630a04_0    conda-forge
atk-1.0                   2.36.0               h3371d22_4    conda-forge
biopython                 1.79            py310h5764c6d_2    conda-forge
boost                     1.74.0          py310h7c3ba0c_5    conda-forge
boost-cpp                 1.74.0               h75c5d50_8    conda-forge
brotli                    1.0.9                h166bdaf_7    conda-forge
brotli-bin                1.0.9                h166bdaf_7    conda-forge
brotlipy                  0.7.0           py310h5764c6d_1004    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.9.24            ha878542_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cairo                     1.16.0            ha61ee94_1014    conda-forge
cairomm                   1.14.4               ha770c72_0    conda-forge
cairomm-1.0               1.14.4               h09cb3b9_0    conda-forge
certifi                   2022.9.24          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310h255011f_0    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
colorama                  0.4.5              pyhd8ed1ab_0    conda-forge
contourpy                 1.0.5           py310hbf28c38_0    conda-forge
cryptography              37.0.1          py310h9ce1e76_0  
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
dbus                      1.13.6               h5008d03_3    conda-forge
dendropy                  4.5.2              pyh3252c3a_0    bioconda
docopt                    0.6.2                      py_1    conda-forge
epoxy                     1.5.10               h166bdaf_1    conda-forge
expat                     2.4.9                h27087fc_0    conda-forge
ffmpeg                    5.1.2           gpl_he10e716_101    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 hab24e00_0    conda-forge
fontconfig                2.14.0               hc2a2eb6_1    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.37.4          py310h5764c6d_0    conda-forge
freetype                  2.12.1               hca18f0e_0    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
gdk-pixbuf                2.42.8               hff1cb4f_1    conda-forge
gettext                   0.21.1               h27087fc_0    conda-forge
glib-tools                2.74.0               h6239696_0    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.7.8                hf3e180e_0    conda-forge
graph-tool                2.45            py310haee70ea_2    conda-forge
graph-tool-base           2.45            py310hd8094d8_2    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gtk3                      3.24.34              h4d20fae_1    conda-forge
h5py                      3.7.0           nompi_py310h416281c_101    conda-forge
harfbuzz                  5.3.0                h418a68e_0    conda-forge
hdbscan                   0.8.28          py310h96516ba_1    conda-forge
hdf5                      1.12.2          nompi_h4df4325_100    conda-forge
hicolor-icon-theme        0.17                 ha770c72_2    conda-forge
icu                       70.1                 h27087fc_0    conda-forge
idna                      3.4                pyhd8ed1ab_0    conda-forge
joblib                    1.2.0              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   h166bdaf_2    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4           py310hbf28c38_0    conda-forge
krb5                      1.19.3               h08a2579_0    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.39                 hc81fddc_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_7    conda-forge
libbrotlidec              1.0.9                h166bdaf_7    conda-forge
libbrotlienc              1.0.9                h166bdaf_7    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcups                   2.3.3                h3e49a29_2    conda-forge
libcurl                   7.85.0               h2283fc2_0    conda-forge
libdeflate                1.14                 h166bdaf_0    conda-forge
libdrm                    2.4.113              h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_18    conda-forge
libgfortran-ng            12.2.0              h69a702a_18    conda-forge
libgfortran5              12.2.0              h337968e_18    conda-forge
libgirepository           1.74.0               ha2a38d2_0    conda-forge
libglib                   2.74.0               h7a41b64_0    conda-forge
libgomp                   12.2.0              h65d4601_18    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
libidn2                   2.3.3                h166bdaf_0    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.47.0               hff17c54_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpciaccess              0.16                 h516909a_0    conda-forge
libpng                    1.6.38               h753d276_0    conda-forge
librsvg                   2.54.4               h7abd40a_0    conda-forge
libsqlite                 3.39.4               h753d276_0    conda-forge
libssh2                   1.10.0               hf14f497_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_18    conda-forge
libtasn1                  4.19.0               h166bdaf_0    conda-forge
libtiff                   4.4.0                h55922b4_4    conda-forge
libunistring              0.9.10               h7f98852_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libva                     2.16.0               h166bdaf_0    conda-forge
libvpx                    1.11.0               h9c3ff4c_3    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libxml2                   2.9.14               h22db469_4    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
mandrake                  1.2.2           py310h7dbff7e_1    conda-forge
matplotlib-base           3.6.1           py310h8d5ebf3_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
nettle                    3.8.1                hc379101_1    conda-forge
networkx                  2.8.7              pyhd8ed1ab_0    conda-forge
numpy                     1.23.3          py310h53a5b5f_0    conda-forge
openblas                  0.3.21          pthreads_h320a7e8_3    conda-forge
openh264                  2.3.1                h27087fc_1    conda-forge
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   3.0.5                h166bdaf_2    conda-forge
p11-kit                   0.24.1               hc5aa10d_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
pandas                    1.5.0           py310h769672d_0    conda-forge
pango                     1.50.11              h382ae3d_0    conda-forge
pcre2                     10.37                hc3806b6_1    conda-forge
pillow                    9.2.0           py310hbd86126_2    conda-forge
pip                       22.3               pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
plotly                    5.10.0             pyhd8ed1ab_0    conda-forge
poppunk                   2.5.0           py310h2579afa_0    bioconda
pp-sketchlib              2.0.0           py310h5a37817_2    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pycairo                   1.21.0          py310h96fc21a_1    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygobject                 3.42.2          py310h964465f_0    conda-forge
pyopenssl                 22.0.0             pyhd8ed1ab_1    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.6          ha86cf86_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    2_cp310    conda-forge
pytz                      2022.4             pyhd8ed1ab_0    conda-forge
rapidnj                   2.3.2                h9f5acd7_2    bioconda
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1             pyhd8ed1ab_1    conda-forge
scikit-learn              1.1.2           py310h0c3af53_0    conda-forge
scipy                     1.9.1           py310hdfbd76f_0    conda-forge
setuptools                65.5.0             pyhd8ed1ab_0    conda-forge
sigcpp-2.0                2.10.8               h27087fc_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sparsehash                2.0.4                h9c3ff4c_0    conda-forge
svt-av1                   1.2.1                h27087fc_0    conda-forge
tenacity                  8.1.0              pyhd8ed1ab_0    conda-forge
threadpoolctl             3.1.0              pyh8a188c0_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
tqdm                      4.64.1             pyhd8ed1ab_0    conda-forge
treeswift                 1.1.28             pyh5e36f6f_0    bioconda
tzdata                    2022e                h191b570_0    conda-forge
unicodedata2              14.0.0          py310h5764c6d_1    conda-forge
urllib3                   1.26.11            pyhd8ed1ab_0    conda-forge
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
x264                      1!164.3095           h166bdaf_2    conda-forge
x265                      3.5                  h924138e_3    conda-forge
xorg-compositeproto       0.4.2             h7f98852_1001    conda-forge
xorg-damageproto          1.2.1             h7f98852_1002    conda-forge
xorg-fixesproto           5.0               h7f98852_1002    conda-forge
xorg-inputproto           2.3.2             h7f98852_1002    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.6.12               h36c2ea0_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxaw               1.0.14               h7f98852_0    conda-forge
xorg-libxcomposite        0.4.5                h7f98852_0    conda-forge
xorg-libxcursor           1.2.0                h516909a_0    conda-forge
xorg-libxdamage           1.1.5                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
xorg-libxi                1.7.10               h516909a_0    conda-forge
xorg-libxinerama          1.1.4             h9c3ff4c_1001    conda-forge
xorg-libxmu               1.1.3                h516909a_0    conda-forge
xorg-libxpm               3.5.13               h516909a_0    conda-forge
xorg-libxrandr            1.5.2                h516909a_1    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-libxt                1.1.5             h516909a_1003    conda-forge
xorg-libxtst              1.2.3             h516909a_1002    conda-forge
xorg-randrproto           1.5.0             h7f98852_1001    conda-forge
xorg-recordproto          1.14.2            h7f98852_1002    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-util-macros          1.19.3               h7f98852_0    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstandard                 0.18.0          py310h5764c6d_0    conda-forge
zstd                      1.5.2                h6239696_4    conda-forge

nickjcroucher commented 1 year ago

conda install "joblib<=1.1" should fix that
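As background to this fix: the traceback shows hdbscan 0.8.28 calling `Memory(cachedir=None, ...)`, and the environment has joblib 1.2.0, whose `Memory` no longer appears to accept that keyword. A quick way to check whether a given callable still takes a keyword is sketched below. The helper `accepts_keyword` and the two stand-in classes are hypothetical, written only for illustration; the final block tries the real `joblib.Memory` if joblib happens to be installed.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter would swallow any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleMemory:
    """Stand-in mimicking a constructor that still takes `cachedir`."""

    def __init__(self, location=None, cachedir=None, verbose=1):
        self.location = location if location is not None else cachedir


class NewStyleMemory:
    """Stand-in mimicking a constructor without `cachedir`."""

    def __init__(self, location=None, verbose=1):
        self.location = location


if __name__ == "__main__":
    print("old-style stand-in:", accepts_keyword(OldStyleMemory, "cachedir"))
    print("new-style stand-in:", accepts_keyword(NewStyleMemory, "cachedir"))
    try:
        from joblib import Memory  # only checked if joblib is available
    except ImportError:
        print("joblib is not installed in this environment")
    else:
        print("installed joblib.Memory:", accepts_keyword(Memory, "cachedir"))
```

If the last line reports False, the installed joblib is one that this hdbscan build cannot import against, which matches the TypeError above.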

jdaeth274 commented 1 year ago

Thanks Nick

That now works, the updated QC output is good too. Cheers