bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

Illegal instruction (core dumped) when running PopPUNK #127

Closed dmgie closed 3 years ago

dmgie commented 3 years ago

Hi,

I've been trying to get PopPUNK running on a server to no avail - with the main error being:

PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.5.3
         sketchlib: /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Creating clusters from assemblies (create_db & fit_model)
Illegal instruction (core dumped)

All of this is in a newly created conda environment.

I've seen #106 had a similar problem and tried following the steps that were mentioned there, but none of them have solved the issue. Looking at the apport.log file from the dumped core it only states:

ERROR: apport (pid 25164) Thu Nov 19 21:51:23 2020: writing core dump to core (limit: -1)
ERROR: apport (pid 26346) Thu Nov 19 21:54:28 2020: called for pid 26323, signal 4, core limit 18446744073709551615, dump mode 1
ERROR: apport (pid 26346) Thu Nov 19 21:54:28 2020: ignoring implausibly big core limit, treating as unlimited
ERROR: apport (pid 26346) Thu Nov 19 21:54:28 2020: script: /home/ubuntu/anaconda3/envs/ppunk/bin/poppunk, interpreted by /home/ubuntu/anaconda3/envs/ppunk/bin/python3.8 (command line "/home/ubuntu/anaconda3/envs/ppunk/bin/python3.8 /home/ubuntu/anaconda3/envs/ppunk/bin/poppunk --easy-run --r-files reference_list.txt --output lm_example/ --threads 8 --plot-fit 5 --min-k 13 --full-db")
ERROR: apport (pid 26346) Thu Nov 19 21:54:28 2020: executable does not belong to a package, ignoring
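
For reference, the `signal 4` in the second apport line is SIGILL on Linux, which matches the `Illegal instruction` message. A minimal Python sketch for decoding it from such a line; the helper name `signal_from_apport_line` is made up for illustration and is not part of apport or PopPUNK:

```python
import re
import signal


def signal_from_apport_line(line):
    """Return the name of the signal mentioned in an apport log line, or None."""
    match = re.search(r"signal (\d+)", line)
    if match is None:
        return None
    # signal.Signals maps a signal number to its symbolic name on this platform
    return signal.Signals(int(match.group(1))).name


line = (
    "ERROR: apport (pid 26346) Thu Nov 19 21:54:28 2020: called for pid 26323, "
    "signal 4, core limit 18446744073709551615, dump mode 1"
)
print(signal_from_apport_line(line))
```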

I've tried compiling pp-sketchlib from scratch as I've read that might cause some problems (especially regarding the CPU) but it presents its own problem when trying to build it. Running python3 setup.py install in the cloned pp-sketchlib repository yields:

running install
running bdist_egg
running egg_info
writing pp_sketchlib.egg-info/PKG-INFO
writing dependency_links to pp_sketchlib.egg-info/dependency_links.txt
writing entry points to pp_sketchlib.egg-info/entry_points.txt
writing requirements to pp_sketchlib.egg-info/requires.txt
writing top-level names to pp_sketchlib.egg-info/top_level.txt
reading manifest file 'pp_sketchlib.egg-info/SOURCES.txt'
writing manifest file 'pp_sketchlib.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/pp_sketch
copying pp_sketch/matrix.py -> build/lib.linux-x86_64-3.8/pp_sketch
copying pp_sketch/__main__.py -> build/lib.linux-x86_64-3.8/pp_sketch
copying pp_sketch/__init__.py -> build/lib.linux-x86_64-3.8/pp_sketch
running build_ext
-- The C compiler identification is GNU 5.5.0
-- The CXX compiler identification is GNU 5.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /home/linuxbrew/.linuxbrew/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /home/linuxbrew/.linuxbrew/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found PythonInterp: /home/ubuntu/anaconda3/envs/ppunk/bin/python3 (found version "3.8.6")
-- Found PythonLibs: /home/ubuntu/anaconda3/envs/ppunk/lib/libpython3.8.so
-- Performing Test HAS_FLTO
-- Performing Test HAS_FLTO - Success
-- Found pybind11: /home/ubuntu/anaconda3/envs/ppunk/include (found version "2.6.1" )
-- Found Armadillo: /home/ubuntu/anaconda3/envs/ppunk/include (found version "9.900.4")
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp (found version "4.0")
-- Found OpenMP_CXX: -fopenmp (found version "4.0")
-- Found OpenMP: TRUE (found version "4.0")
-- Looking for a CUDA compiler
-- Looking for a CUDA compiler - NOTFOUND
-- CUDA not found, compiling CPU code only
-- Configuring done
CMake Warning at CMakeLists.txt:51 (add_library):
  Cannot generate a safe runtime search path for target pp_sketchlib because
  files in some directories may conflict with libraries in implicit
  directories:

    runtime library [libgomp.so.1] in /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_4/lib may be hidden by files in:
      /home/ubuntu/anaconda3/envs/ppunk/lib

  Some of these libraries may not be found correctly.

-- Generating done
-- Build files have been written to: /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8
/home/ubuntu/anaconda3/envs/ppunk/bin/cmake -S/home/ubuntu/pp-sketchlib -B/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8 --check-build-system CMakeFiles/Makefile.cmake 0
/home/ubuntu/anaconda3/envs/ppunk/bin/cmake -E cmake_progress_start /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8//CMakeFiles/progress.marks
/usr/bin/make  -f CMakeFiles/Makefile2 all
make[1]: Entering directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
/usr/bin/make  -f CMakeFiles/pp_sketchlib.dir/build.make CMakeFiles/pp_sketchlib.dir/depend
make[2]: Entering directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
cd /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8 && /home/ubuntu/anaconda3/envs/ppunk/bin/cmake -E cmake_depends "Unix Makefiles" /home/ubuntu/pp-sketchlib /home/ubuntu/pp-sketchlib /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8 /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8 /home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles/pp_sketchlib.dir/DependInfo.cmake --color=
Dependee "/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles/pp_sketchlib.dir/DependInfo.cmake" is newer than depender "/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles/pp_sketchlib.dir/depend.internal".
Dependee "/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles/CMakeDirectoryInformation.cmake" is newer than depender "/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8/CMakeFiles/pp_sketchlib.dir/depend.internal".
Scanning dependencies of target pp_sketchlib
make[2]: Leaving directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
/usr/bin/make  -f CMakeFiles/pp_sketchlib.dir/build.make CMakeFiles/pp_sketchlib.dir/build
make[2]: Entering directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
[  7%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/dist/dist.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/dist/dist.cpp.o -c /home/ubuntu/pp-sketchlib/src/dist/dist.cpp
[ 15%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/sketchlib_bindings.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/sketchlib_bindings.cpp.o -c /home/ubuntu/pp-sketchlib/src/sketchlib_bindings.cpp
[ 23%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/dist/matrix_ops.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/dist/matrix_ops.cpp.o -c /home/ubuntu/pp-sketchlib/src/dist/matrix_ops.cpp
[ 30%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/reference.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/reference.cpp.o -c /home/ubuntu/pp-sketchlib/src/reference.cpp
[ 38%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/sketch/seqio.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/sketch/seqio.cpp.o -c /home/ubuntu/pp-sketchlib/src/sketch/seqio.cpp
[ 46%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/sketch/countmin.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/sketch/countmin.cpp.o -c /home/ubuntu/pp-sketchlib/src/sketch/countmin.cpp
[ 53%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/sketch/sketch.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/sketch/sketch.cpp.o -c /home/ubuntu/pp-sketchlib/src/sketch/sketch.cpp
[ 61%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/database/database.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/database/database.cpp.o -c /home/ubuntu/pp-sketchlib/src/database/database.cpp
[ 69%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/api.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/api.cpp.o -c /home/ubuntu/pp-sketchlib/src/api.cpp
[ 76%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/dist/linear_regression.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/dist/linear_regression.cpp.o -c /home/ubuntu/pp-sketchlib/src/dist/linear_regression.cpp
[ 84%] Building CXX object CMakeFiles/pp_sketchlib.dir/src/random/rng.cpp.o
/home/linuxbrew/.linuxbrew/bin/c++ -DPYTHON_EXT -Dpp_sketchlib_EXPORTS -I/home/ubuntu/pp-sketchlib/src -isystem /home/ubuntu/anaconda3/envs/ppunk/include -isystem /home/ubuntu/anaconda3/envs/ppunk/include/python3.8 -isystem /home/ubuntu/anaconda3/envs/ppunk/include/eigen3 -DVERSION_INFO=\"1.5.4\" -march=native -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -O3 -DNDEBUG -flto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fopenmp -std=gnu++14 -o CMakeFiles/pp_sketchlib.dir/src/random/rng.cpp.o -c /home/ubuntu/pp-sketchlib/src/random/rng.cpp
In file included from /home/ubuntu/pp-sketchlib/src/random/rng.cpp:1:0:
/home/ubuntu/pp-sketchlib/src/random/rng.hpp:17:17: error: 'size_t' does not name a type
         typedef size_t result_type;
                 ^
/home/ubuntu/pp-sketchlib/src/random/rng.hpp:18:26: error: 'size_t' does not name a type
         static constexpr size_t min() { return std::numeric_limits<uint64_t>::min(); }
                          ^
/home/ubuntu/pp-sketchlib/src/random/rng.hpp:19:26: error: 'size_t' does not name a type
         static constexpr size_t max() { return std::numeric_limits<uint64_t>::max(); }
                          ^
CMakeFiles/pp_sketchlib.dir/build.make:214: recipe for target 'CMakeFiles/pp_sketchlib.dir/src/random/rng.cpp.o' failed
make[2]: *** [CMakeFiles/pp_sketchlib.dir/src/random/rng.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[2]: Leaving directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
CMakeFiles/Makefile2:97: recipe for target 'CMakeFiles/pp_sketchlib.dir/all' failed
make[1]: *** [CMakeFiles/pp_sketchlib.dir/all] Error 2
make[1]: Leaving directory '/home/ubuntu/pp-sketchlib/build/temp.linux-x86_64-3.8'
Makefile:105: recipe for target 'all' failed
make: *** [all] Error 2
Traceback (most recent call last):
  File "setup.py", line 100, in <module>
    setup(
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/__init__.py", line 163, in setup
    return distutils.core.setup(**attrs)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/command/install.py", line 67, in run
    self.do_egg_install()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/command/install.py", line 109, in do_egg_install
    self.run_command('bdist_egg')
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 175, in run
    cmd = self.call_command('install_lib', warn_dir=0)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/command/bdist_egg.py", line 161, in call_command
    self.run_command(cmdname)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/setuptools/command/install_lib.py", line 11, in run
    self.build()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/command/install_lib.py", line 107, in build
    self.run_command('build_ext')
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "setup.py", line 54, in run
    self.build_extension(ext)
  File "setup.py", line 92, in build_extension
    subprocess.check_call(['cmake', '--build', '.'] + build_args, cwd=self.build_temp)
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j2']' returned non-zero exit status 2.

This is all with the command: poppunk --easy-run --r-files reference_list.txt --output lm_example --threads 8 --plot-fit 5 --min-k 13 --full-db

Running this locally on my own computer, following the same steps, does not give this error, so I'm stumped as to what the problem could be.

johnlees commented 3 years ago

Hi, thank you for the very detailed error report. As you note, this has been seen before on some machines, though for me it went away and I wasn't sure why. I think there are two possible culprits:

  1. pp-sketchlib
  2. graph-tool

My feeling is it's the latter, but if we can get pp-sketchlib to compile for you locally we can rule that out. I had missed a header in the failing rng.hpp file, which affects some systems and gave the error above. I have now added this in on the master branch of pp-sketchlib. If you are able to, could you clone that and try installing again as you have done above? I hope that should work now.

If you still get the error I think it's likely graph-tool's fault, and I can look into that instead.

dmgie commented 3 years ago

No problem, thank you for making this tool.

I've cloned the master branch of pp-sketchlib and the build completed with no error messages. I've tried running the same poppunk command as above, but it still gives exactly the same error (Illegal instruction (core dumped)).

I've tried running the command poppunk_sketch --sketch --rfile reference_list.txt --ref-db listeria --sketch-size 156 --cpus 4 --min-k 15 --k-step 2 which results in the same Illegal instruction error. I hope that helps.

johnlees commented 3 years ago

Ok, does seem like a sketchlib issue then. Can you run poppunk_sketch --version, just to make sure that's picking up the copy you installed? It should show v1.5.4.

dmgie commented 3 years ago

Running poppunk_sketch --version oddly enough gives me poppunk_sketch 1.5.3, even though the output from python3 setup.py install when building pp-sketchlib does state the correct version (1.5.4). The file ~/anaconda3/envs/ppunk/bin/poppunk_sketch also states version 1.5.4.

Edit: I uninstalled the old version using python3 -m pip uninstall pp-sketchlib and rebuilt it. It shows the correct version now (1.5.4). I guess the old one was still being selected somehow over the one installed by running python3 setup.py install.
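
A quick way to see which copy Python actually picks up is to ask the import system directly. This is only a sketch: `describe_install` is a hypothetical helper, and it assumes the module is importable as `pp_sketchlib` (as the `.so` filename in the first log suggests) and registered under the distribution name `pp-sketchlib` (as used with pip above).

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name):
    """Report where a module would be imported from and its installed version."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"module": module_name, "origin": origin, "version": version}


# Prints None for origin and version if nothing is installed in this environment
print(describe_install("pp_sketchlib", "pp-sketchlib"))
```

If the reported origin points at a different site-packages directory than the one the build installed into, an older copy is shadowing the new one.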

johnlees commented 3 years ago

Great, thanks. And you are still getting the Illegal Instruction error?

Sorry if this is a pain, but would it be possible to try the following:

  1. Clean your sketchlib install.
  2. Reinstall with python3 setup.py install --debug (you should see -O0 and -g options in the compile lines generated by cmake if this worked)
  3. Run gdb -ex r --args python3 ~/anaconda3/envs/ppunk/bin/poppunk_sketch.py --sketch --rfile reference_list.txt --ref-db listeria --sketch-size 156 --cpus 4 --min-k 15 --k-step 2
  4. In the gdb debugger this should automatically run the program until the error (if it doesn't, type the command run)
  5. This should get to the error line. Type bt to get a backtrace. Post the results of this backtrace here.

I would greatly appreciate it if you could help with this, as I've been unable to replicate the problem myself!

dmgie commented 3 years ago

I just ran it again (after removing the old pp-sketchlib version), and the command poppunk_sketch --sketch --rfile reference_list.txt --ref-db listeria --sketch-size 156 --cpus 4 --min-k 15 --k-step 2 seems to be running fine and outputting a listeria.h5 file.

Running poppunk --easy-run --r-files reference_list.txt --output lm_example --threads 8 --plot-fit 5 --min-k 13 --full-db, on the other hand, results in a different error:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/ppunk/bin/poppunk", line 33, in <module>
    sys.exit(load_entry_point('poppunk==2.3.0', 'console_scripts', 'poppunk')())
  File "/home/ubuntu/anaconda3/envs/ppunk/bin/poppunk", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/PopPUNK/__main__.py", line 19, in <module>
    from .models import *
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/PopPUNK/models.py", line 22, in <module>
    from .plot import plot_scatter
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/PopPUNK/plot.py", line 28, in <module>
    from .utils import isolateNameToLabel
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/PopPUNK/utils.py", line 18, in <module>
    import h5py
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/__init__.py", line 33, in <module>
    from . import version
  File "/home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/version.py", line 15, in <module>
    from . import h5 as _h5
  File "h5py/h5.pyx", line 1, in init h5py.h5
ImportError: /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/defs.cpython-38-x86_64-linux-gnu.so: undefined symbol: H5Dget_num_chunks

johnlees commented 3 years ago

Ok, so perhaps there was an instruction in the conda version that was unsupported by your CPU. Good that the manual install resolved that – it may have to be the official work-around for this issue.
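
One way to compare machines is to look at the CPU feature flags the kernel reports. The local build above uses -march=native, so it only emits instructions the build machine supports, whereas a prebuilt binary may assume more. The sketch below parses /proc/cpuinfo-style text; the helper names are hypothetical and the flag names checked are illustrative guesses, since nothing in this thread identifies which instruction was actually missing.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Collect the feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_flags(cpuinfo_text, wanted):
    """Return the wanted flags that the CPU does not advertise, sorted."""
    return sorted(set(wanted) - cpu_flags(cpuinfo_text))


cpuinfo = Path("/proc/cpuinfo")  # Linux only
if cpuinfo.exists():
    # Illustrative flag names, not taken from the thread
    print(missing_flags(cpuinfo.read_text(), ["sse4_2", "avx", "avx2", "bmi2"]))
```

Running this on both the server and a machine where the conda package works would show whether the two CPUs differ in any of the listed features.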

The new error looks like a problem with your h5py installation: it comes from import h5py itself, not from the PopPUNK code. Do you have libhdf5-dev installed on Ubuntu? Or are you using the conda installation? Can you post the result of conda list?
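
To see whether a particular HDF5 library exports the symbol named in the ImportError, something like the following could help. It is a sketch under assumptions: `library_exports` is a hypothetical helper, and `ctypes.util.find_library` only searches the default system locations, so for a conda environment you would probably need to pass the full path to that environment's libhdf5 explicitly.

```python
import ctypes
import ctypes.util


def library_exports(library, symbol):
    """Return True if the shared library can be loaded and exports the symbol."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return False
    # ctypes resolves symbols lazily; a missing one raises AttributeError
    return hasattr(handle, symbol)


hdf5 = ctypes.util.find_library("hdf5")  # None if no libhdf5 is on the default search path
if hdf5 is not None:
    print(hdf5, library_exports(hdf5, "H5Dget_num_chunks"))
```

If one libhdf5 on the machine exports the symbol and another does not, h5py is probably being linked against the wrong one at import time.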

dmgie commented 3 years ago

These are the results from conda list. Doing import h5py in a normal python shell seems to work okay. I tried creating a new environment just to see if it might fix something, but it is the same error.

# packages in environment at /home/ubuntu/anaconda3/envs/punk:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
armadillo                 9.900.4              h219c20c_0    conda-forge
arpack                    3.7.0                h236a147_2    conda-forge
at-spi2-atk               2.38.0               hdfca744_2    conda-forge
at-spi2-core              2.38.0               hdfca744_2    conda-forge
atk-1.0                   2.36.0               h0d5b62e_4    conda-forge
boost                     1.72.0           py38h1e42940_1    conda-forge
boost-cpp                 1.72.0               h9359b55_3    conda-forge
bzip2                     1.0.8                h516909a_3    conda-forge
c-ares                    1.11.0               h470a237_1    bioconda
ca-certificates           2020.11.8            ha878542_0    conda-forge
cached-property           1.5.1                      py_0    conda-forge
cairo                     1.16.0            h9f066cc_1006    conda-forge
cairomm                   1.12.2                        2    conda-forge
cairomm-1.0               1.12.2               h0069156_2    conda-forge
certifi                   2020.11.8        py38h578d9bd_0    conda-forge
cffi                      1.14.3           py38h1bdcb99_1    conda-forge
cmake                     3.18.4               h1f3970d_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
dbus                      1.13.6               hfdff14a_1    conda-forge
dendropy                  4.5.1              pyh3252c3a_0    bioconda
eigen                     3.3.8                h0efe328_0    conda-forge
epoxy                     1.5.4                h36c2ea0_4    conda-forge
expat                     2.2.9                he1b5a44_2    conda-forge
fontconfig                2.13.1            h7e3eb15_1002    conda-forge
freetype                  2.10.4               h7ca028e_0    conda-forge
fribidi                   1.0.10               h36c2ea0_0    conda-forge
gdk-pixbuf                2.42.0               h0536704_0    conda-forge
gettext                   0.19.8.1          hf34092f_1004    conda-forge
glib                      2.66.2               h58526e2_0    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gobject-introspection     1.66.1           py38h4eacb9c_3    conda-forge
graph-tool                2.35             py38hba68971_1    conda-forge
graphite2                 1.3.13            h58526e2_1001    conda-forge
gtk3                      3.24.23              h55fbbb0_2    conda-forge
h5py                      3.1.0           nompi_py38hafa665b_100    conda-forge
harfbuzz                  2.7.2                ha5b49bf_1    conda-forge
hdbscan                   0.8.26           py38h0b5ebd8_3    conda-forge
hdf5                      1.10.6          nompi_h1022a3e_1110    conda-forge
hicolor-icon-theme        0.17                 ha770c72_2    conda-forge
highfive                  2.2.2                h44f99b7_0    conda-forge
icu                       67.1                 he1b5a44_0    conda-forge
joblib                    0.17.0                     py_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kiwisolver                1.3.1            py38h82cb98a_0    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.11                 hcbb858e_1    conda-forge
ld_impl_linux-64          2.35.1               hed1e6ac_0    conda-forge
libblas                   3.9.0                3_openblas    conda-forge
libcblas                  3.9.0                3_openblas    conda-forge
libcups                   2.2.12               hf10b501_1    conda-forge
libcurl                   7.71.1               hcdd3856_8    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.3.0               h5dbcf3e_17    conda-forge
libgfortran-ng            9.3.0               he4bcb1c_17    conda-forge
libgfortran5              9.3.0               he4bcb1c_17    conda-forge
libglib                   2.66.2               hbe7bbb4_0    conda-forge
libgomp                   9.3.0               h5dbcf3e_17    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                3_openblas    conda-forge
libnghttp2                1.41.0               hab1572f_1    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
librsvg                   2.50.1               h33a7fed_0    conda-forge
libssh2                   1.9.0                hab1572f_5    conda-forge
libstdcxx-ng              9.3.0               h2ae2ef3_17    conda-forge
libtiff                   4.1.0                h4f3a223_6    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libuv                     1.40.0               hd18ef5c_0    conda-forge
libwebp-base              1.1.0                h36c2ea0_3    conda-forge
libxcb                    1.13              h14c3975_1002    conda-forge
libxml2                   2.9.10               h68273f3_2    conda-forge
lz4-c                     1.9.2                he1b5a44_3    conda-forge
matplotlib-base           3.3.3            py38h5c7f4ab_0    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
numpy                     1.19.4           py38hf0fd68c_1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openblas                  0.3.12          pthreads_h04b7a96_1    conda-forge
openssl                   1.1.1h               h516909a_0    conda-forge
pandas                    1.1.4            py38h0ef3d22_0    conda-forge
pango                     1.42.4               h69149e4_5    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pillow                    8.0.1            py38h70fbd49_0    conda-forge
pip                       20.2.4                     py_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
poppunk                   2.2.0                      py_0    bioconda
pp-sketchlib              1.5.4                    pypi_0    pypi
pthread-stubs             0.4               h14c3975_1001    conda-forge
pybind11                  2.6.1            py38h82cb98a_0    conda-forge
pybind11-global           2.6.1                    pypi_0    pypi
pycairo                   1.20.0           py38h323dad1_1    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pygobject                 3.38.0           py38hcdc0f24_2    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
python                    3.8.6           h852b56e_0_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.8                      1_cp38    conda-forge
pytz                      2020.4             pyhd8ed1ab_0    conda-forge
rapidnj                   2.3.2                hc9558a2_0    bioconda
readline                  8.0                  he28a2e2_2    conda-forge
rhash                     1.3.6             h516909a_1001    conda-forge
scikit-learn              0.23.2           py38h5d63f67_2    conda-forge
scipy                     1.5.3            py38hb2138dd_0    conda-forge
setuptools                49.6.0           py38h924ce5b_2    conda-forge
sigcpp-2.0                2.10.3               h58526e2_0    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sparsehash                2.0.2                         0    bioconda
sqlite                    3.33.0               h4cf870e_1    conda-forge
superlu                   5.2.2                he1ec49c_0    conda-forge
threadpoolctl             2.1.0              pyh5ca1d4c_0    conda-forge
tk                        8.6.10               hed695b0_1    conda-forge
tornado                   6.1              py38h25fe258_0    conda-forge
wheel                     0.35.1             pyh9f0ad1d_0    conda-forge
xorg-compositeproto       0.4.2                         0    conda-forge
xorg-damageproto          1.2.1             h516909a_1002    conda-forge
xorg-fixesproto           5.0               h14c3975_1002    conda-forge
xorg-inputproto           2.3.2             h14c3975_1002    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.12               h516909a_0    conda-forge
xorg-libxau               1.0.9                h14c3975_0    conda-forge
xorg-libxaw               1.0.13            h516909a_1002    conda-forge
xorg-libxcomposite        0.4.5                h516909a_0    conda-forge
xorg-libxcursor           1.2.0                h516909a_0    conda-forge
xorg-libxdamage           1.1.5                h516909a_0    conda-forge
xorg-libxdmcp             1.1.3                h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxfixes            5.0.3             h516909a_1004    conda-forge
xorg-libxi                1.7.10               h516909a_0    conda-forge
xorg-libxinerama          1.1.4             hf484d3e_1000    conda-forge
xorg-libxmu               1.1.3                h516909a_0    conda-forge
xorg-libxpm               3.5.13               h516909a_0    conda-forge
xorg-libxrandr            1.5.2                h516909a_1    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-libxt                1.1.5             h516909a_1003    conda-forge
xorg-randrproto           1.5.0             h516909a_1001    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-util-macros          1.19.2            h14c3975_1001    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstandard                 0.14.0           py38h950e882_3    conda-forge
zstd                      1.4.5                h6597ccf_2    conda-forge

Steps to reproduce (in case one of these steps might cause it to error out):

1. conda create -n newenv python=3.8.6 cmake pybind11 hdf5 highfive eigen=3.3 armadillo poppunk (I had an error message about eigen when I initially tried to compile pp-sketchlib, and eigen=3.3 seemed to work)
2. conda activate newenv
3. python --version to make sure it's the correct python
4. python3 -m pip uninstall pp-sketchlib to get rid of the currently installed version
5. python3 setup.py install to install the new version


I just noticed that for python3 -m pip uninstall pp-sketchlib it does this:

Found existing installation: pp-sketchlib 1.5.3
Uninstalling pp-sketchlib-1.5.3:
  Would remove:
    /home/ubuntu/anaconda3/envs/punk/bin/poppunk_sketch
    /home/ubuntu/anaconda3/envs/punk/lib/python3.8/site-packages/pp_sketch/*
    /home/ubuntu/anaconda3/envs/punk/lib/python3.8/site-packages/pp_sketchlib-1.5.3.dist-info/*
    /home/ubuntu/anaconda3/envs/punk/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so
Proceed (y/n)? y
  Successfully uninstalled pp-sketchlib-1.5.3

And here it removes the pp_sketchlib.cpython-38-x86_64-linux-gnu.so, which is maybe the reason there is an undefined symbol error?
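To see which copy of pp_sketchlib (if any) the environment's Python would pick up after an uninstall or rebuild, something like the following can help. It is a minimal sketch using only the standard library. The helper name `module_origin` is made up for illustration, and it has to be run with the python from the conda environment in question.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Report where each module of interest would be imported from.
for name in ("pp_sketchlib", "h5py"):
    print(name, "->", module_origin(name))
```

If pp_sketchlib prints `None`, the uninstall removed the only copy and the rebuild did not install a replacement into this environment.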

johnlees commented 3 years ago

So if you run import h5py using the python from the same conda environment it's ok, but when the command is run from poppunk it gives the error?

For reference, I have:

h5py                      2.10.0          nompi_py38hfb01d0b_104    conda-forge
hdf5                      1.10.6          nompi_h3c11f04_101    conda-forge

It might first be instructive to run:

Another suggestion would be that perhaps you could try downgrading to this version of h5py with conda install h5py==2.10? But let's see the results of the above first

dmgie commented 3 years ago

Yep, the python in the conda environment has it, but when poppunk tries to run it, it gives an error. I noticed that even though I have h5py through python, it did not show up in the conda list.

Output from ldd /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/defs.cpython-38-x86_64-linux-gnu.so

        linux-vdso.so.1 =>  (0x00007ffd21bf5000)
        libhdf5.so.103 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../libhdf5.so.103 (0x00007f3a8eef4000)
        libhdf5_hl.so.100 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../libhdf5_hl.so.100 (0x00007f3a8f42e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3a8ecd7000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3a8e90d000)
        libcrypto.so.1.1 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../.././libcrypto.so.1.1 (0x00007f3a8e641000)
        libcurl.so.4 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../.././libcurl.so.4 (0x00007f3a8f382000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f3a8e439000)
        libz.so.1 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../.././libz.so.1 (0x00007f3a8f368000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3a8e235000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3a8df2c000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f3a8f27d000)
        libnghttp2.so.14 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libnghttp2.so.14 (0x00007f3a8f33e000)
        libssh2.so.1 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libssh2.so.1 (0x00007f3a8f2fa000)
        libssl.so.1.1 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libssl.so.1.1 (0x00007f3a8de9c000)
        libgssapi_krb5.so.2 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libgssapi_krb5.so.2 (0x00007f3a8f2aa000)
        libkrb5.so.3 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libkrb5.so.3 (0x00007f3a8ddc3000)
        libk5crypto.so.3 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libk5crypto.so.3 (0x00007f3a8dda4000)
        libcom_err.so.3 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../../././libcom_err.so.3 (0x00007f3a8f2a3000)
        libkrb5support.so.0 => /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/../../.././././libkrb5support.so.0 (0x00007f3a8dd95000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f3a8db79000)

And the output from nm -gDC /home/ubuntu/anaconda3/envs/ppunk/lib/python3.8/site-packages/h5py/defs.cpython-38-x86_64-linux-gnu.so

                 w __cxa_finalize
0000000000030bd4 T _fini
                 w __gmon_start__
                 U H5Aclose
                 U H5Acreate2
                 U H5Acreate_by_name
                 U H5Adelete
                 U H5Adelete_by_idx
                 U H5Adelete_by_name
                 U H5Aexists
                 U H5Aexists_by_name
                 U H5Aget_info
                 U H5Aget_info_by_idx
                 U H5Aget_info_by_name
                 U H5Aget_name
                 U H5Aget_num_attrs
                 U H5Aget_space
                 U H5Aget_storage_size
                 U H5Aget_type
                 U H5Aiterate2
                 U H5Aopen
                 U H5Aopen_by_idx
                 U H5Aopen_by_name
                 U H5Aopen_idx
                 U H5Aopen_name
                 U H5Aread
                 U H5Arename
                 U H5Arename_by_name
                 U H5Awrite
                 U H5check_version
                 U H5close
                 U H5Dclose
                 U H5Dcreate2
                 U H5Dcreate_anon
                 U H5Dextend
                 U H5Dfill
                 U H5Dflush
                 U H5Dget_access_plist
                 U H5Dget_chunk_info
                 U H5Dget_chunk_info_by_coord
                 U H5Dget_chunk_storage_size
                 U H5Dget_create_plist
                 U H5Dget_num_chunks
                 U H5Dget_offset
                 U H5Dget_space
                 U H5Dget_space_status
                 U H5Dget_storage_size
                 U H5Dget_type
                 U H5Diterate
                 U H5Dopen2
                 U H5DOread_chunk
                 U H5DOwrite_chunk
                 U H5Dread
                 U H5Dread_chunk
                 U H5Drefresh
                 U H5DSattach_scale
                 U H5DSdetach_scale
                 U H5Dset_extent
                 U H5DSget_label
                 U H5DSget_num_scales
                 U H5DSget_scale_name
                 U H5DSis_attached
                 U H5DSis_scale
                 U H5DSiterate_scales
                 U H5DSset_label
                 U H5DSset_scale
                 U H5Dvlen_get_buf_size
                 U H5Dvlen_reclaim
                 U H5Dwrite
                 U H5Fclose
                 U H5Fcreate
                 U H5FDregister
                 U H5FDunregister
                 U H5Fflush
                 U H5Fget_access_plist
                 U H5Fget_create_plist
                 U H5Fget_file_image
                 U H5Fget_filesize
                 U H5Fget_freespace
                 U H5Fget_intent
                 U H5Fget_mdc_config
                 U H5Fget_mdc_hit_rate
                 U H5Fget_mdc_size
                 U H5Fget_name
                 U H5Fget_obj_count
                 U H5Fget_obj_ids
                 U H5Fget_vfd_handle
                 U H5Fis_hdf5
                 U H5Fmount
                 U H5Fopen
                 U H5free_memory
                 U H5Freopen
                 U H5Freset_mdc_hit_rate_stats
                 U H5Fset_mdc_config
                 U H5Fstart_swmr_write
                 U H5Funmount
                 U H5Gclose
                 U H5Gcreate2
                 U H5Gcreate_anon
                 U H5get_libversion
                 U H5Gget_comment
                 U H5Gget_create_plist
                 U H5Gget_info
                 U H5Gget_info_by_name
                 U H5Gget_linkval
                 U H5Gget_num_objs
                 U H5Gget_objinfo
                 U H5Gget_objname_by_idx
                 U H5Gget_objtype_by_idx
                 U H5Giterate
                 U H5Glink2
                 U H5Gmove2
                 U H5Gopen2
                 U H5Gset_comment
                 U H5Gunlink
                 U H5Idec_ref
                 U H5Iget_file_id
                 U H5Iget_name
                 U H5Iget_ref
                 U H5Iget_type
                 U H5Iinc_ref
                 U H5Iis_valid
                 U H5Lcopy
                 U H5Lcreate_external
                 U H5Lcreate_hard
                 U H5Lcreate_soft
                 U H5Ldelete
                 U H5Ldelete_by_idx
                 U H5Lexists
                 U H5Lget_info
                 U H5Lget_info_by_idx
                 U H5Lget_name_by_idx
                 U H5Lget_val
                 U H5Lget_val_by_idx
                 U H5Literate
                 U H5Literate_by_name
                 U H5Lmove
                 U H5LTopen_file_image
                 U H5Lunpack_elink_val
                 U H5Lvisit
                 U H5Lvisit_by_name
                 U H5Oclose
                 U H5Ocopy
                 U H5Odecr_refcount
                 U H5Oexists_by_name
                 U H5Oget_comment
                 U H5Oget_comment_by_name
                 U H5Oget_info
                 U H5Oget_info_by_idx
                 U H5Oget_info_by_name
                 U H5Oincr_refcount
                 U H5Olink
                 U H5Oopen
                 U H5Oopen_by_addr
                 U H5Oopen_by_idx
                 U H5open
                 U H5Oset_comment
                 U H5Oset_comment_by_name
                 U H5Ovisit
                 U H5Ovisit_by_name
                 U H5Pall_filters_avail
                 U H5Pclose
                 U H5Pclose_class
                 U H5Pcopy
                 U H5Pcreate
                 U H5Pequal
                 U H5Pfill_value_defined
                 U H5Pget_alignment
                 U H5Pget_alloc_time
                 U H5Pget_attr_creation_order
                 U H5Pget_attr_phase_change
                 U H5Pget_cache
                 U H5Pget_char_encoding
                 U H5Pget_chunk
                 U H5Pget_chunk_cache
                 U H5Pget_class
                 U H5Pget_copy_object
                 U H5Pget_create_intermediate_group
                 U H5Pget_driver
                 U H5Pget_driver_info
                 U H5Pget_edc_check
                 U H5Pget_elink_fapl
                 U H5Pget_elink_prefix
                 U H5Pget_est_link_info
                 U H5Pget_external
                 U H5Pget_external_count
                 U H5Pget_family_offset
                 U H5Pget_fapl_core
                 U H5Pget_fapl_family
                 U H5Pget_fclose_degree
                 U H5Pget_file_space_strategy
                 U H5Pget_fill_time
                 U H5Pget_fill_value
                 U H5Pget_filter2
                 U H5Pget_filter_by_id2
                 U H5Pget_istore_k
                 U H5Pget_layout
                 U H5Pget_libver_bounds
                 U H5Pget_link_creation_order
                 U H5Pget_link_phase_change
                 U H5Pget_local_heap_size_hint
                 U H5Pget_mdc_config
                 U H5Pget_nfilters
                 U H5Pget_nlinks
                 U H5Pget_obj_track_times
                 U H5Pget_sieve_buf_size
                 U H5Pget_sizes
                 U H5Pget_sym_k
                 U H5Pget_userblock
                 U H5Pget_version
                 U H5Pget_virtual_count
                 U H5Pget_virtual_dsetname
                 U H5Pget_virtual_filename
                 U H5Pget_virtual_prefix
                 U H5Pget_virtual_printf_gap
                 U H5Pget_virtual_srcspace
                 U H5Pget_virtual_view
                 U H5Pget_virtual_vspace
                 U H5PLappend
                 U H5PLget
                 U H5PLinsert
                 U H5PLprepend
                 U H5PLremove
                 U H5PLreplace
                 U H5PLsize
                 U H5Pmodify_filter
                 U H5Premove_filter
                 U H5Pset_alignment
                 U H5Pset_alloc_time
                 U H5Pset_attr_creation_order
                 U H5Pset_attr_phase_change
                 U H5Pset_cache
                 U H5Pset_char_encoding
                 U H5Pset_chunk
                 U H5Pset_chunk_cache
                 U H5Pset_copy_object
                 U H5Pset_create_intermediate_group
                 U H5Pset_deflate
                 U H5Pset_driver
                 U H5Pset_edc_check
                 U H5Pset_elink_fapl
                 U H5Pset_elink_prefix
                 U H5Pset_est_link_info
                 U H5Pset_external
                 U H5Pset_family_offset
                 U H5Pset_fapl_core
                 U H5Pset_fapl_family
                 U H5Pset_fapl_log
                 U H5Pset_fapl_multi
                 U H5Pset_fapl_sec2
                 U H5Pset_fapl_split
                 U H5Pset_fapl_stdio
                 U H5Pset_fclose_degree
                 U H5Pset_file_image
                 U H5Pset_file_space_strategy
                 U H5Pset_fill_time
                 U H5Pset_fill_value
                 U H5Pset_filter
                 U H5Pset_fletcher32
                 U H5Pset_istore_k
                 U H5Pset_layout
                 U H5Pset_libver_bounds
                 U H5Pset_link_creation_order
                 U H5Pset_link_phase_change
                 U H5Pset_local_heap_size_hint
                 U H5Pset_mdc_config
                 U H5Pset_nlinks
                 U H5Pset_obj_track_times
                 U H5Pset_scaleoffset
                 U H5Pset_shuffle
                 U H5Pset_sieve_buf_size
                 U H5Pset_sizes
                 U H5Pset_sym_k
                 U H5Pset_szip
                 U H5Pset_userblock
                 U H5Pset_virtual
                 U H5Pset_virtual_prefix
                 U H5Pset_virtual_printf_gap
                 U H5Pset_virtual_view
                 U H5Rcreate
                 U H5Rdereference1
                 U H5Rget_name
                 U H5Rget_obj_type2
                 U H5Rget_region
                 U H5Sclose
                 U H5Scopy
                 U H5Screate
                 U H5Screate_simple
                 U H5Sdecode
                 U H5Sencode
                 U H5Sextent_copy
                 U H5Sget_regular_hyperslab
                 U H5Sget_select_bounds
                 U H5Sget_select_elem_npoints
                 U H5Sget_select_elem_pointlist
                 U H5Sget_select_hyper_blocklist
                 U H5Sget_select_hyper_nblocks
                 U H5Sget_select_npoints
                 U H5Sget_select_type
                 U H5Sget_simple_extent_dims
                 U H5Sget_simple_extent_ndims
                 U H5Sget_simple_extent_npoints
                 U H5Sget_simple_extent_type
                 U H5Sis_regular_hyperslab
                 U H5Sis_simple
                 U H5Soffset_simple
                 U H5Sselect_all
                 U H5Sselect_elements
                 U H5Sselect_hyperslab
                 U H5Sselect_none
                 U H5Sselect_valid
                 U H5Sset_extent_none
                 U H5Sset_extent_simple
                 U H5Tarray_create2
                 U H5Tclose
                 U H5Tcommit2
                 U H5Tcommitted
                 U H5Tconvert
                 U H5Tcopy
                 U H5Tcreate
                 U H5Tdecode
                 U H5Tdetect_class
                 U H5Tencode
                 U H5Tenum_create
                 U H5Tenum_insert
                 U H5Tenum_nameof
                 U H5Tenum_valueof
                 U H5Tequal
                 U H5Tfind
                 U H5Tget_array_dims2
                 U H5Tget_array_ndims
                 U H5Tget_class
                 U H5Tget_create_plist
                 U H5Tget_cset
                 U H5Tget_ebias
                 U H5Tget_fields
                 U H5Tget_inpad
                 U H5Tget_member_class
                 U H5Tget_member_index
                 U H5Tget_member_name
                 U H5Tget_member_offset
                 U H5Tget_member_type
                 U H5Tget_member_value
                 U H5Tget_native_type
                 U H5Tget_nmembers
                 U H5Tget_norm
                 U H5Tget_offset
                 U H5Tget_order
                 U H5Tget_pad
                 U H5Tget_precision
                 U H5Tget_sign
                 U H5Tget_size
                 U H5Tget_strpad
                 U H5Tget_super
                 U H5Tget_tag
                 U H5Tinsert
                 U H5Tis_variable_str
                 U H5Tlock
                 U H5Topen2
                 U H5Tpack
                 U H5Tregister
                 U H5Tset_cset
                 U H5Tset_ebias
                 U H5Tset_fields
                 U H5Tset_inpad
                 U H5Tset_norm
                 U H5Tset_offset
                 U H5Tset_order
                 U H5Tset_pad
                 U H5Tset_precision
                 U H5Tset_sign
                 U H5Tset_size
                 U H5Tset_strpad
                 U H5Tset_tag
                 U H5Tunregister
                 U H5Tvlen_create
                 U H5Zfilter_avail
                 U H5Zget_filter_info
                 U H5Zunregister
000000000000e000 T _init
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U PyBytes_FromStringAndSize
                 U PyCapsule_GetName
                 U PyCapsule_GetPointer
                 U PyCapsule_IsValid
                 U PyCapsule_New
                 U _Py_CheckRecursionLimit
                 U _Py_CheckRecursiveCall
                 U PyCode_New
                 U _Py_Dealloc
                 U _PyDict_GetItem_KnownHash
                 U PyDict_GetItemString
                 U PyDict_New
                 U PyDict_SetItem
                 U PyDict_SetItemString
                 U PyErr_Clear
                 U PyErr_ExceptionMatches
                 U PyErr_Format
                 U PyErr_Occurred
                 U PyErr_SetObject
                 U PyErr_SetString
                 U PyErr_WarnEx
                 U PyEval_RestoreThread
                 U PyEval_SaveThread
                 U PyExc_AttributeError
                 U PyExc_ImportError
                 U PyExc_NameError
                 U PyExc_RuntimeError
                 U PyExc_SystemError
                 U PyExc_TypeError
                 U PyExc_ValueError
                 U _Py_FalseStruct
                 U PyFrame_New
                 U Py_GetVersion
                 U PyImport_AddModule
                 U PyImport_GetModuleDict
                 U PyImport_ImportModule
00000000000153bc T PyInit_defs
                 U PyInterpreterState_GetID
                 U PyMem_Malloc
                 U PyMem_Realloc
                 U PyModule_AddObject
                 U PyModuleDef_Init
                 U PyModule_GetDict
                 U PyModule_GetName
                 U PyModule_NewObject
                 U _Py_NoneStruct
                 U PyObject_Call
                 U PyObject_GetAttr
                 U PyObject_GetAttrString
                 U _PyObject_GetDictPtr
                 U PyObject_Hash
                 U PyObject_Not
                 U PyObject_SetAttr
                 U PyObject_SetAttrString
                 U PyOS_snprintf
                 U PyThreadState_Get
                 U _PyThreadState_UncheckedGet
                 U PyTraceBack_Here
                 U _Py_TrueStruct
                 U PyTuple_New
                 U PyTuple_Pack
                 U PyUnicode_Decode
                 U PyUnicode_FromFormat
                 U PyUnicode_FromString
                 U PyUnicode_FromStringAndSize
                 U PyUnicode_InternFromString
000000000004ae40 B __pyx_module_is_main_h5py(double,...)(float, short)
                 U __stack_chk_fail

Running conda install h5py ends up giving this:

conda install h5py
Collecting package metadata (current_repodata.json): done
Solving environment: -
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - bioconda/noarch::poppunk==2.2.0=py_0
done

## Package Plan ##

  environment location: /home/ubuntu/anaconda3/envs/punk

  added / updated specs:
    - h5py

The following NEW packages will be INSTALLED:

  pp-sketchlib       conda-forge/linux-64::pp-sketchlib-1.5.3-py38hffe8f08_0

which downgrades pp-sketchlib as well as changing the poppunk version for some reason - to 2.3.0. If I run it with --no-deps to prevent conda from installing the old pp-sketchlib, poppunk --easy-run --r-files reference_list.txt --output lm_example --threads 8 --plot-fit 5 --min-k 13 --full-db does not work anymore: poppunk: error: one of the arguments --create-db --fit-model --use-model is required

Running this instead then: poppunk --r-files reference_list.txt --output lm_example --threads 8 --plot-fit 5 --min-k 13 --create-db ends up giving me

Warning! ***HDF5 library version mismatched error***
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.10.4, library is 1.10.3

The installation point of hdf5 seems to be /home/linuxbrew/.linuxbrew/Cellar/hdf5/1.10.3, which could be what is causing the error. I'll try upgrading the system's hdf5 version to see if it fixes things (I'm unable to remove it, as some other tools installed by linuxbrew require it as a dependency).
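A version mismatch like this usually means a library from outside the conda environment is being found first. As a rough check, the sketch below lists the entries of `LD_LIBRARY_PATH` that do not live under the active environment. It assumes `CONDA_PREFIX` is set (conda sets it on `conda activate`). The function name `entries_outside_prefix` is hypothetical.

```python
import os


def entries_outside_prefix(search_path, prefix):
    """Return entries of a search-path string that are not inside prefix."""
    prefix = os.path.normpath(prefix)
    outside = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        norm = os.path.normpath(entry)
        if norm != prefix and not norm.startswith(prefix + os.sep):
            outside.append(entry)
    return outside


prefix = os.environ.get("CONDA_PREFIX", "")
if prefix:
    value = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH entries outside the env:", entries_outside_prefix(value, prefix))
else:
    print("CONDA_PREFIX is not set; is a conda environment active?")
```

An empty list only rules out `LD_LIBRARY_PATH`; a library can still be picked up through other routes (for example, an rpath baked into a locally compiled extension).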

johnlees commented 3 years ago

A few notes on the above:

dmgie commented 3 years ago

So after running brew upgrade and having that go - I ran conda create -n pp_py38 poppunk==2.2.0 pp-sketchlib==1.5.3 as mentioned. Afterwards just to test out, I ran the poppunk --easy-run --r-files reference_list.txt --output lm_example --threads 8 --plot-fit 2 --min-k 13 --full-db command just to check if it still gives me the same error. Surprisingly it did not give me the Illegal Instruction error but instead:

Graph-tools OpenMP parallelisation enabled: with 8 threads
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.5.3
         sketchlib: /home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Creating clusters from assemblies (create_db & fit_model)
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/pp_py38/bin/poppunk", line 10, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/PopPUNK/__main__.py", line 363, in main
    createDatabaseDir(args.output, kmers)
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/PopPUNK/sketchlib.py", line 82, in createDatabaseDir
    knum = ref_db['sketches/' + sample_name].attrs['kmers']
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/h5py/_hl/attrs.py", line 56, in __getitem__
    attr = h5a.open(self._id, self._e(name))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 80, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute: 'kmers')"

This is all without messing with pp-sketchlib (as I had to do before). In conda list, both hdf5 and h5py are present and installed from conda-forge - with PopPUNK 2.2.0 and poppunk_sketch 1.5.3. echo $LD_LIBRARY_PATH is indeed empty

Sorry for the many issues!

johnlees commented 3 years ago

Ok, hopefully looks like the environment is sorted now!

Feels like we're getting there. What's in lm_example? Could it be a stale/corrupt .h5 file from before giving the error?
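One quick way to answer that is to list every .h5 file in the output directory with its modification time, so that leftovers from an earlier failed run stand out. This is only a sketch: `list_h5_files` is a made-up helper, and `lm_example` is the output directory used in the commands above.

```python
import time
from pathlib import Path


def list_h5_files(output_dir):
    """Return (path, mtime) pairs for every .h5 file under output_dir, oldest first."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    files = [(p, p.stat().st_mtime) for p in root.rglob("*.h5")]
    return sorted(files, key=lambda item: item[1])


for path, mtime in list_h5_files("lm_example"):
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
    print(stamp, path)
```

A file that is older than the current run is a candidate for moving aside before re-running.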

dmgie commented 3 years ago

I might have spoken too early. It was indeed a stale/corrupt lm_example.h5 file, and after removing it, it went back to the Illegal Instruction error, so I compiled pp-sketchlib from scratch again.

Afterwards running the usual poppunk command I get:

Graph-tools OpenMP parallelisation enabled: with 8 threads
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.5.4
         sketchlib: /home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/pp_sketchlib-1.5.4-py3.8-linux-x86_64.egg/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)
Mode: Creating clusters from assemblies (create_db & fit_model)
Sketching 2 genomes using 2 thread(s)
Writing sketches to file
Problem processing h5 databases during QC - aborting
Unexpected error: <class 'KeyError'>
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/pp_py38/bin/poppunk", line 10, in <module>
    sys.exit(main())
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/PopPUNK/__main__.py", line 364, in main
    seq_names = constructDatabase(
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/PopPUNK/sketchlib.py", line 365, in constructDatabase
    filtered_names = sketchlib_assembly_qc(oPrefix,
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/PopPUNK/utils.py", line 462, in sketchlib_assembly_qc
    seq_length[dataset] = hdf_in['sketches'][dataset].attrs['length']
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/ubuntu/anaconda3/envs/pp_py38/lib/python3.8/site-packages/h5py/_hl/attrs.py", line 56, in __getitem__
    attr = h5a.open(self._id, self._e(name))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 80, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute: 'length')"

with a single file lm_example.h5 created in lm_example/. I tried using h5py==2.10 as well but I get the same error as above.


I just checked my own reference_list.txt file, and I had forgotten that I had modified it and removed some genomes so it wouldn't take as long to test. After redoing the paste <(ls *.fas) <(ls *.fas) > reference_list.txt it seems to be running with no errors for now! Just a small question: would you have an idea of the estimated time it would take to run PopPUNK on ~300 Staph genomes?
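For anyone hitting the same thing, here is a small Python sketch that rebuilds the list and checks that every file it names still exists. The two-column, tab-separated layout (name, then path) is inferred from the paste command above rather than taken from the PopPUNK docs, and both function names are made up for illustration.

```python
from pathlib import Path


def write_reference_list(fasta_dir, out_file, pattern="*.fas"):
    """Write one 'name<TAB>path' line per matching file, like paste <(ls *.fas) <(ls *.fas)."""
    names = sorted(p.name for p in Path(fasta_dir).glob(pattern))
    Path(out_file).write_text("".join(f"{n}\t{n}\n" for n in names))
    return names


def missing_files(list_file, base_dir="."):
    """Return paths from the last column of list_file that are not files under base_dir."""
    missing = []
    for line in Path(list_file).read_text().splitlines():
        if not line.strip():
            continue  # ignore blank lines
        path = line.split("\t")[-1]
        if not (Path(base_dir) / path).is_file():
            missing.append(path)
    return missing
```

Calling `missing_files("reference_list.txt")` from the directory holding the assemblies returns an empty list when the file and the directory contents agree.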

johnlees commented 3 years ago

Ah, great! Sorry it was a bit of a long route to get it working!

300 Staph, if you're using a few cores, should really only be of the order of a few minutes. The main computation step is --create-db, all the model fitting is very fast

dmgie commented 3 years ago

Absolutely no problem. Thank you so much for helping throughout the whole ordeal! Ah alright. I started running it 2 days ago (with 8 cores) and it still seems to be on the step Trying to optimise score globally, which I'm guessing might not be normal? The genomes are from ~5 different staph species.

johnlees commented 3 years ago

That's probably having trouble fitting the model in that case.

Can I suggest you start with two separate runs, one with --fit-model gmm --K 5 --output staph_gmm --ref-db <name> and one with --fit-model dbscan --output staph_dbscan --ref-db <name>, and look at the plots in the output before trying a refined fit?

(if on poppunk 2.2.0 or lower, the first command becomes --fit-model --K 5 --output staph_gmm --ref-db <name> --distances <name/name.dists>, the second --fit-model --dbscan --output staph_dbscan --ref-db <name> --distances <name/name.dists>)
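To keep the two runs consistent, the command lines can be generated from one place. The sketch below only builds and prints them, using the newer (post-2.2.0) flags quoted above. `fit_commands` is a hypothetical helper, and `lm_example` stands in for `<name>` on the assumption that it is the database directory from the earlier --create-db run.

```python
import shlex


def fit_commands(ref_db):
    """Build the GMM and DBSCAN model-fitting command lines for one reference database."""
    gmm = ["poppunk", "--fit-model", "gmm", "--K", "5",
           "--output", "staph_gmm", "--ref-db", ref_db]
    dbscan = ["poppunk", "--fit-model", "dbscan",
              "--output", "staph_dbscan", "--ref-db", ref_db]
    return gmm, dbscan


# Print the commands so they can be checked before being run by hand.
for argv in fit_commands("lm_example"):
    print(shlex.join(argv))
```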

It might be worth having a look through the (long) tutorial on the docs to get an idea of how to deal with different fits, but feel free to post some of your output plots here and I can take a look

dmgie commented 3 years ago

Sorry for taking some time to go through it!

I've tried running it with --K 5 as well as some other --K values (there were a total of 7/8 species instead*). It works great and all of them seem to give me somewhat similar plots with some slight variation. The results from --K 5 are below, with the other values giving similar numbers.

Network summary:
        Components      95
        Density 0.0501
        Transitivity    0.9941
        Score   0.9443

The DPGMM graph I got from --K 5 is below: [image: staph_gmm_DPGMM_fit]

And the DBSCAN fit: [image: staph_dbscan_dbscan]

From what I can tell the fit still needs to be improved (although the network summary does show good values?).
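On reading the network summary: as far as I understand the PopPUNK documentation, the score combines the other two numbers as transitivity × (1 − density), so a high score means a network that is both highly clustered and sparse. The values reported above are consistent with that definition, as this small check shows (the function name is just for illustration):

```python
def network_score(transitivity, density):
    """Score as transitivity * (1 - density); higher means tight but sparse clusters."""
    return transitivity * (1.0 - density)


# Values taken from the --K 5 network summary above.
score = network_score(transitivity=0.9941, density=0.0501)
print(round(score, 4))  # matches the reported Score of 0.9443
```

A good score does not by itself show the boundary is in the right place, so the plots still need looking at.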

I've tried to go through the tutorial and tune some parameters to see if I can get a better fit, but due to not much experience in it, it has been a bit more difficult for me to understand.

Thank you for all the help!

johnlees commented 3 years ago

Both of those fits look ok to me, the DBSCAN fit more specific, the GMM fit broader. I'd say the DBSCAN fit looks best, but I'd run with --microreact to check you're happy with your results on the tree.

You could also try running --refine-fit on your GMM model, which I'd expect to give something similar to the DBSCAN model.

johnlees commented 3 years ago

Closing as this appears to be solved. Please do open another issue, or email me, if you want to discuss your fitting further. The updated documentation may also be helpful: https://poppunk.readthedocs.io/en/latest/model_fitting.html