google / deepconsensus

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.
BSD 3-Clause "New" or "Revised" License

Running deepconsensus results in "free(): invalid pointer" error #48

Closed mjg0 closed 1 year ago

mjg0 commented 1 year ago

I installed deepconsensus via pip in a virtualenv like this:

virtualenv /apps/deepconsensus/1.0.0/python-3.8.2_cpu
source /apps/deepconsensus/1.0.0/python-3.8.2_cpu/bin/activate
pip install pyyaml==5.4.1 'deepconsensus[cpu]==1.0.0'

I used pyyaml==5.4.1 since tf-models-official 2.10.0 requires pyyaml<6.0,>=5.1. I'm using Python 3.8.2.

When I run deepconsensus, even just for the help message, it fails. Running deepconsensus -h produced this error message:

2022-11-10 12:51:17.016037: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
*** Error in `/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python': free(): invalid pointer: 0x00007f075c296c80 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7f078b91d329]
/lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f075c000192]
/lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f075c0005e3]
/lib64/libstdc++.so.6(+0x71555)[0x7f075c001555]
/lib64/libpthread.so.0(+0x620b)[0x7f078c37920b]
/lib64/libstdc++.so.6(+0x715a1)[0x7f075c0015a1]
/lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f075c0015e3]
/lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f075bffe43c]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so(+0xb1150)[0x7f075bdd0150]
/lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f078c7d59c3]
/lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f078c7da59e]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
/lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f078c7d9b8b]
/lib64/libdl.so.2(+0xfab)[0x7f078c16ffab]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f078c7d57d4]
/lib64/libdl.so.2(+0x15ad)[0x7f078c1705ad]
/lib64/libdl.so.2(dlopen+0x31)[0x7f078c170041]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_FindSharedFuncptr+0x16b)[0x539abb]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyImport_LoadDynamicModuleWithSpec+0x159)[0x503e69]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x501a23]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x5f91)[0x428bc1]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x1fb5)[0x424be5]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x3fd)[0x502c8d]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5ee426]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437c24]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x437f74]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x4e6)[0x502d76]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x6e78)[0x429aa8]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyEval_EvalCode+0x23)[0x4e1b43]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x5efe34]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python[0x46f563]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
======= Memory map: ========
00400000-006f3000 r-xp 00000000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
008f2000-008f3000 r--p 002f2000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
008f3000-0092b000 rw-p 002f3000 00:31 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
0092b000-0094c000 rw-p 00000000 00:00 0 
01dd2000-03209000 rw-p 00000000 00:00 0                                  [heap]
7f0754000000-7f0754021000 rw-p 00000000 00:00 0 
7f0754021000-7f0758000000 ---p 00000000 00:00 0 
7f075bd1f000-7f075bdbc000 r--p 00000000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f075bdbc000-7f075bf0b000 r-xp 0009d000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f075bf0b000-7f075bf7f000 r--p 001ec000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f075bf7f000-7f075bf80000 ---p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f075bf80000-7f075bf85000 r--p 00260000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f075bf85000-7f075bf8f000 rw-p 00265000 00:31 77736144                   /zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
zsh: abort      deepconsensus -h

Is there something different I should have done when installing?

akolesnikov commented 1 year ago

The DeepConsensus pip package installs all of its dependencies, including TensorFlow. Could you please try in a clean environment without a preinstalled TF?

mjg0 commented 1 year ago

This was in a clean virtualenv, and the Python that I used doesn't have TensorFlow--it was installed in the virtualenv by the same pip command that installed deepconsensus:

# The base python:
$ find /apps/python/3.8.2/gcc-9.2.0 -iname '*tensorflow*'
# The virtualenv:
$ find /apps/deepconsensus/1.0.0/python-3.8.2_cpu -iname '*tensorflow*' | head
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_addons
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_metadata
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_models
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_models/tensorflow_models_test.py
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_models/__pycache__/tensorflow_models_test.cpython-38.pyc
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_hub
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_hub-0.12.0.dist-info
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_io_gcs_filesystem
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_io_gcs_filesystem/core/python/ops/libtensorflow_io_gcs_filesystem.so
/apps/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/tensorflow_text-2.10.0.dist-info

akolesnikov commented 1 year ago

According to the stack trace, the error is raised inside the C++ standard library. The only suggestion I have is to try it in a clean environment. One option is to try the Quick Start on a cloud virtual machine.

mjg0 commented 1 year ago

The fact that the backtrace went into libc.so.6 and libstdc++.so.6 is kind of a red herring, no? The ultimate error was in free(), but it would take a lot of convincing to make me believe that was due to a problem in free() itself rather than in the calling code, especially considering that nobody has reported similar problems with the hundreds of other pieces of software we've built with the same compiler.
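To make the "calling code" point concrete, here is a minimal sketch that walks a glibc backtrace from the top and reports the first frame that is not a system library. The helper names and the list of "system" prefixes are assumptions made for illustration, and the sample lines are copied from the backtrace above. The result shows which module triggered the call chain; it does not by itself prove where the bug lives.

```python
import re

# Matches glibc backtrace lines such as:
#   /lib64/libc.so.6(+0x81329)[0x7f078b91d329]
#   /path/to/python[0x501a23]
FRAME_RE = re.compile(
    r"^(?P<path>/[^(\[]+)(?:\((?P<symbol>[^)]*)\))?\[(?P<addr>0x[0-9a-f]+)\]$"
)

# Assumption: anything under these prefixes belongs to the OS, not to the package.
SYSTEM_PREFIXES = ("/lib64/", "/lib/", "/usr/lib64/", "/usr/lib/")


def parse_frames(backtrace):
    """Return (path, symbol) pairs for every line that looks like a frame."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match["path"], match["symbol"] or ""))
    return frames


def first_non_system_frame(backtrace):
    """Return the first frame outside the system prefixes, or None."""
    for path, symbol in parse_frames(backtrace):
        if not path.startswith(SYSTEM_PREFIXES):
            return path, symbol
    return None


# A few frames copied from the backtrace in this thread.
SAMPLE = """\
/lib64/libc.so.6(+0x81329)[0x7f078b91d329]
/lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f075bffe43c]
/zapps7/deepconsensus/1.0.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so(+0xb1150)[0x7f075bdd0150]
/lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f078c7d59c3]
"""

if __name__ == "__main__":
    print(first_non_system_frame(SAMPLE))
```

On the sample, the first non-system frame is the protobuf extension module _message.cpython-38-x86_64-linux-gnu.so, which is being loaded through dlopen when the crash occurs.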

A cloud virtual machine isn't feasible--I'm installing on our supercomputer to provide a lot of computing power for the requesting user, and running on a cloud VM would be a lot more expensive than our free offering. Installing via the Docker container and Charliecloud or Apptainer (or whichever secure-enough container platform) is an option, but I'd prefer not to make users jump through that kind of hoop if possible.

In any case, I tried with a separate compiler and freshly installed Python, with no packages other than those that come out-of-the-box--and a nearly identical error occurs:

$ python3 -m venv python-3.10.8_cpu
$ source ./python-3.10.8_cpu/bin/activate
$ python3 -m pip install 'deepconsensus[cpu]==1.0.0'
$ deepconsensus --help
*** Error in `/zapps7/deepconsensus/1.0.0/python-3.10.8_cpu/bin/python3': free(): invalid pointer: 0x00007f140b616c80 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7f143a5e7329]
/lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f140b380192]
/lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f140b3805e3]
/lib64/libstdc++.so.6(+0x71555)[0x7f140b381555]
/lib64/libpthread.so.0(+0x620b)[0x7f143b04320b]
/lib64/libstdc++.so.6(+0x715a1)[0x7f140b3815a1]
/lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f140b3815e3]
/lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f140b37e43c]
/zapps7/deepconsensus/1.0.0/python-3.10.8_cpu/lib/python3.10/site-packages/google/protobuf/pyext/_message.cpython-310-x86_64-linux-gnu.so(+0xb1150)[0x7f140b150150]
/lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f143ba459c3]
/lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f143ba4a59e]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f143ba457d4]
/lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f143ba49b8b]
/lib64/libdl.so.2(+0xfab)[0x7f143ae39fab]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f143ba457d4]
/lib64/libdl.so.2(+0x15ad)[0x7f143ae3a5ad]
/lib64/libdl.so.2(dlopen+0x31)[0x7f143ae3a041]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x2276c8)[0x7f143b6b76c8]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x226e58)[0x7f143b6b6e58]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x151533)[0x7f143b5e1533]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6fb8)[0x7f143b5d5f58]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x1407)[0x7f143b5d03a7]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x8aa)[0x7f143b5cf84a]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x3c1)[0x7f143b5cf361]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x3c1)[0x7f143b5cf361]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x3c1)[0x7f143b5cf361]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x150ae0)[0x7f143b5e0ae0]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyObject_CallMethodIdObjArgs+0x145)[0x7f143b5efcf5]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x75531)[0x7f143b505531]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x16c364)[0x7f143b5fc364]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x150e84)[0x7f143b5e0e84]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(PyObject_Call+0x1f8)[0x7f143b5eebe8]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x6fb8)[0x7f143b5d5f58]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x3c1)[0x7f143b5cf361]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f143b5e1330]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x150ae0)[0x7f143b5e0ae0]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyObject_CallMethodIdObjArgs+0x145)[0x7f143b5efcf5]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x75fcb)[0x7f143b505fcb]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x45cc)[0x7f143b5d356c]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(+0x1ee10a)[0x7f143b67e10a]
/zapps7/python/3.10.8/gcc-12.1.0/bin/../lib/libpython3.10.so.1.0(PyEval_EvalCode+0x
zsh: abort      deepconsensus --help

That compiler is also well-vetted, with hundreds of software installations and no similar errors reported by any users.

pichuan commented 1 year ago

Thanks for reporting the error, @mjg0.

I just tried pip install 'deepconsensus[cpu]==1.0.0' in virtualenv.

I found that newer pysam releases break our pip install, though in a different way from what you reported. I'm going to continue testing, but I'll first need to change pysam>=0.19.0 to pysam==0.19.0. I'll report back later.

pichuan commented 1 year ago

Hi @mjg0 , While I wasn't able to reproduce the exact errors you've originally reported, I do think that our pip setup isn't ideal (and is currently broken) -- we should have more carefully pinned the versions of software we used!

In my test to pip install in virtualenv, I noticed that I needed to at least pin the following:

(github-test) pichuan@pichuan-cpu2:~/deepconsensus$ git diff
diff --git a/requirements.txt b/requirements.txt
index 5e7cf48..342180b 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -3,5 +3,5 @@ pandas>=1.1
 tf-models-official>=2.9.2
 ml_collections>=0.1.0
 absl-py>=0.13.0
-keras>=2.9
-pysam>=0.19.0
+keras==2.9
+pysam==0.19.0
diff --git a/setup.py b/setup.py
index 6ea20dd..670911b 100644
--- a/setup.py
+++ b/setup.py
@@ -42,8 +42,8 @@ long_description = (here / 'README_pip.md').read_text(encoding='utf-8')

 REQUIREMENTS = (here / 'requirements.txt').read_text().splitlines()
 EXTRA_REQUIREMENTS = {
-    'cpu': ['intel-tensorflow>=2.9.0'],
-    'gpu': ['tensorflow-gpu>=2.9.0']
+    'cpu': ['intel-tensorflow==2.9.1'],
+    'gpu': ['tensorflow-gpu==2.9.1']
 }

With this local change, I did pip install .[cpu].

With that, I'm seeing:

(github-test) pichuan@pichuan-cpu2:~/deepconsensus$ deepconsensus 
usage: DeepConsensus

Usage:
  deepconsensus <command> [optional arguments]

Commands:
  preprocess: Convert aligned subreads to tf.Example format.
  run: Run DeepConsenseus beginning with aligned subreads.
  calibrate: Calculate base-quality calibration.
deepconsensus: error: the following arguments are required: command

So, next steps for us:

  1. We'll need to update our GitHub repo so that we pin these versions properly.
  2. We'll need to update PyPI. Because we won't be able to reuse 1.0.0, we'll need to update the version there.

@mjg0 Let me know if you think this will likely solve your issue as well. If not, I'm happy to look further into your specific error.

pichuan commented 1 year ago

OK, one more update. Here is what my requirements.txt currently looks like. It's not finalized yet because I'm still testing, but I'm sharing it in advance in case you want to test it on your end, @mjg0:

numpy==1.20.3
pandas==1.5.1
tf-models-official==2.9.2
pyyaml==5.4.1  # because of tf-models-official
ml_collections==0.1.1
absl-py==1.0.0
keras==2.9.0
pysam==0.19.0

(and setup.py should also pin:

    'cpu': ['intel-tensorflow==2.9.1'],
    'gpu': ['tensorflow-gpu==2.9.1']

)
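To check whether a given virtualenv actually matches these pins, something like the following sketch could be run inside it. It assumes Python 3.8 or newer (for importlib.metadata), and the function names are made up for illustration. Versions are compared as exact strings, so an installed 2.9.0 would not match a pin written as 2.9.

```python
from importlib import metadata  # standard library since Python 3.8

# The pinned list from the comment above, including its inline comment.
PINS_TEXT = """\
numpy==1.20.3
pandas==1.5.1
tf-models-official==2.9.2
pyyaml==5.4.1  # because of tf-models-official
ml_collections==0.1.1
absl-py==1.0.0
keras==2.9.0
pysam==0.19.0
"""


def parse_pins(text):
    """Map package name to pinned version, ignoring comments and blank lines."""
    pins = {}
    for line in text.splitlines():
        requirement = line.split("#", 1)[0].strip()
        if not requirement:
            continue
        name, _, version = requirement.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for every package that differs.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(parse_pins(PINS_TEXT)).items():
        print(f"{name}: pinned {wanted}, installed {installed or 'not installed'}")
```

An empty output means every pinned package is installed at exactly the pinned version.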

mjg0 commented 1 year ago

I wasn't able to install with Python 3.10.8 using those pinned dependency versions because "pandas 1.5.1 depends on numpy>=1.21.0; python_version >= "3.10"", but I succeeded with Python 3.8.2. Alas, the same free(): invalid pointer error persists when I run deepconsensus in the new virtualenv. I still just installed with pip. Would it make a difference if I installed from source using another branch or something?

Which Python version are you using? I'm wondering if I can replicate your success with the same version; we have 3.6 through 3.11 available.

pichuan commented 1 year ago

@mjg0 I'm testing with Python 3.8.

(github-test) pichuan@pichuan-cpu2:~$ python --version
Python 3.8.10

You mentioned "pandas 1.5.1 depends on numpy>=1.21.0; python_version >= "3.10"" - that's interesting. I didn't seem to have any issues with using pandas==1.5.1 (didn't see any warning or error messages) but I can take a closer look.

You mentioned: "free(): invalid pointer error persists when I run deepconsensus in the new virtualenv." This part is hard for me, because I haven't been able to reproduce it yet.

Let me document step by step what I'm doing. Maybe that will help you spot the step where our setups start to diverge.

(I haven't made a branch yet, but that's a good idea. Let me do that after I share my current steps with you.)

pichuan commented 1 year ago

My current steps that work for me

Created a machine:

gcloud compute instances create "${USER}-cpu" --scopes "compute-rw,storage-full,cloud-platform" --image-family "ubuntu-2004-lts" --image-project "ubuntu-os-cloud" --machine-type "custom-64-131072" --boot-disk-size "300" --zone "us-west2-b" --min-cpu-platform "Intel Skylake"

ssh into the machine:

gcloud compute ssh pichuan-cpu --zone us-west2-b

On the machine, install virtualenv and pip

sudo apt update -y && sudo apt install -y python3-virtualenv && sudo apt install -y python3-pip

Versions:

pichuan@pichuan-cpu:~$ uname -a
Linux pichuan-cpu 5.15.0-1021-gcp #28~20.04.1-Ubuntu SMP Mon Oct 17 11:37:54 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
pichuan@pichuan-cpu:~$ python3 --version
Python 3.8.10
pichuan@pichuan-cpu:~$ pip --version
pip 20.0.2 from /usr/lib/python3/dist-packages/pip (python 3.8)
pichuan@pichuan-cpu:~$ virtualenv --version
virtualenv 20.0.17 from /usr/lib/python3/dist-packages/virtualenv/__init__.py

Starting a virtualenv

virtualenv github-test
source github-test/bin/activate

Check versions again:

(github-test) pichuan@pichuan-cpu:~$ python --version
Python 3.8.10
(github-test) pichuan@pichuan-cpu:~$ pip --version
pip 20.0.2 from /home/pichuan/github-test/lib/python3.8/site-packages/pip (python 3.8)

Get the code:

git clone https://github.com/google/deepconsensus.git
cd deepconsensus

This just gets the current default, which is r1.0

Then I manually updated requirements.txt and setup.py by running:

cat > requirements.txt  <<- EOM
numpy==1.20.3
pandas==1.5.1
tf-models-official==2.9.2
pyyaml==5.4.1  # because of tf-models-official
ml_collections==0.1.1
absl-py==1.0.0
keras==2.9.0
pysam==0.19.0
EOM
sed -i -e "s/>=2.9.0/==2.9.1/g" setup.py

Here is the diff:

(github-test) pichuan@pichuan-cpu:~/deepconsensus$ git diff
diff --git a/requirements.txt b/requirements.txt
index 5e7cf48..e0accbb 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,7 +1,8 @@
-numpy>=1.19
-pandas>=1.1
-tf-models-official>=2.9.2
-ml_collections>=0.1.0
-absl-py>=0.13.0
-keras>=2.9
-pysam>=0.19.0
+numpy==1.20.3
+pandas==1.5.1
+tf-models-official==2.9.2
+pyyaml==5.4.1  # because of tf-models-official
+ml_collections==0.1.1
+absl-py==1.0.0
+keras==2.9.0
+pysam==0.19.0
diff --git a/setup.py b/setup.py
index 6ea20dd..670911b 100644
--- a/setup.py
+++ b/setup.py
@@ -42,8 +42,8 @@ long_description = (here / 'README_pip.md').read_text(encoding='utf-8')

 REQUIREMENTS = (here / 'requirements.txt').read_text().splitlines()
 EXTRA_REQUIREMENTS = {
-    'cpu': ['intel-tensorflow>=2.9.0'],
-    'gpu': ['tensorflow-gpu>=2.9.0']
+    'cpu': ['intel-tensorflow==2.9.1'],
+    'gpu': ['tensorflow-gpu==2.9.1']
 }

Then I built from this modified working tree:

pip install .[cpu]

I looked at the log and didn't find any complaints about pandas:

Collecting pandas==1.5.1
  Downloading pandas-1.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)

After the installation was done, I checked that the CLI worked:

(github-test) pichuan@pichuan-cpu:~/deepconsensus$ deepconsensus
usage: DeepConsensus

Usage:
  deepconsensus <command> [optional arguments]

Commands:
  preprocess: Convert aligned subreads to tf.Example format.
  run: Run DeepConsenseus beginning with aligned subreads.
  calibrate: Calculate base-quality calibration.
deepconsensus: error: the following arguments are required: command

Running deepconsensus run shows the full usage.

./run_all_tests.sh worked too.

(github-test) pichuan@pichuan-cpu:~/deepconsensus$ ./run_all_tests.sh 

This step takes a while, but will run a few more checks.

Tested on Quick Start

I ran https://github.com/google/deepconsensus/blob/r1.0/docs/quick_start.md , and for the deepconsensus run step I used the pip-installed version to confirm it runs fine.


@mjg0 let me know if you spot anything different from your steps.

pichuan commented 1 year ago

Ah, @mjg0, to your point about "pandas 1.5.1 depends on numpy>=1.21.0; python_version >= "3.10"": I think I can parse it now. I believe this means that on Python 3.10 or newer, pandas 1.5.1 requires numpy>=1.21.0?

That reminded me - maybe we should update our own Python version to something like:

    python_requires='>=3.6, <3.10',

That said, I have not actually tested every version in this range. I wonder whether we should pin the Python version to only the one we tested, but that also seems quite restrictive. This is why we usually recommend the Docker solution. :-/
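Both the marker quoted above and the proposed python_requires reduce to numeric comparisons on version components. The sketch below illustrates that reading; the function names are invented for illustration, and real installers use full PEP 440 and PEP 508 logic rather than this simplified tuple comparison. The comparison has to be numeric because, as plain strings, "3.8" sorts after "3.10".

```python
import sys


def version_tuple(text):
    """Turn '3.10' into (3, 10) so that comparisons are numeric, not lexical."""
    return tuple(int(part) for part in text.split("."))


def numpy_floor_applies(python_version):
    """Evaluate the marker python_version >= "3.10" from the pandas 1.5.1 message."""
    return version_tuple(python_version) >= (3, 10)


def in_proposed_range(python_version):
    """Check a version against the proposed python_requires='>=3.6, <3.10'."""
    return (3, 6) <= version_tuple(python_version) < (3, 10)


if __name__ == "__main__":
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    for candidate in ("3.8", "3.10", current):
        print(candidate, numpy_floor_applies(candidate), in_proposed_range(candidate))
```

Under this reading, Python 3.8.2 falls inside the proposed range and is not subject to the numpy>=1.21.0 requirement, while Python 3.10.8 is outside the range and is subject to it, which matches the install failure reported earlier in the thread.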

Oh btw, @mjg0 : have you considered using Singularity?

pichuan commented 1 year ago

Hi @mjg0 , we've updated our GitHub page and PyPI. The latest version on PyPI is:

pip install deepconsensus[cpu]==1.0.1

Please try that instead, and note that even though we didn't clearly limit the Python version for now, we've only tested with Python 3.8. We'll try to be better with pip requirements and setup in the future. If you have more suggestions (or some best practices you've seen on your side), please don't hesitate to let us know.

And, I'm still curious what you think about Singularity. Let me know.

mjg0 commented 1 year ago

@pichuan it now works without a hitch:

$ python3 -m venv dcvenv
$ source dcvenv/bin/activate
$ pip install --upgrade pip # our version is too old for intel-tensorflow==2.9.1
...
$ pip install deepconsensus[cpu]==1.0.1
...
$ deepconsensus
2022-11-30 11:18:28.441580: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-11-30 11:18:28.446546: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-11-30 11:18:28.446574: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
usage: DeepConsensus

Usage:
  deepconsensus <command> [optional arguments]

Commands:
  preprocess: Convert aligned subreads to tf.Example format.
  run: Run DeepConsenseus beginning with aligned subreads.
  calibrate: Calculate base-quality calibration.
deepconsensus: error: the following arguments are required: command

The GPU version works as well, and solving the linking issue with libcudart.so.11.0 is easy enough.
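For anyone checking the same thing, a small sketch like the one below can confirm whether the dynamic loader is able to find the library named in the warning. The helper name is made up for illustration. It only tests whether the library can be opened from the current environment, not whether the GPU setup is otherwise correct.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libcudart.so.11.0"  # the library name from the TensorFlow warning above
    status = "found" if can_load(name) else "not found on the loader search path"
    print(f"{name}: {status}")
```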

Before trying that, I got Charliecloud containers set up with deepconsensus 1.0.0 for CPUs and GPUs; both seemed to work, but I didn't run any actual workloads and haven't heard anything back from the user who requested them--maybe no news is good news? I'll also let him know that these virtualenvs are now available and see if he has any issues.

Thank you for pinning package versions that work well together!

pichuan commented 1 year ago

Thanks for the updates @mjg0 ! Glad that it worked. I'll close this issue now. Feel free to open another one if you encounter any other issues.

mjg0 commented 1 year ago

This same issue has cropped up for me again with version 1.1.0:

$ python3 -m venv /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu
...
$ source /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/activate
$ pip install --upgrade pip
...
$ pip install deepconsensus[cpu]==1.1.0
...
$ ./deepconsensus -h
2023-01-27 16:32:20.852846: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
*** Error in `/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python': free(): invalid pointer: 0x00007f73f7ba7c80 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81329)[0x7f742894a329]
/lib64/libstdc++.so.6(_ZNSt6locale5_Impl16_M_install_facetEPKNS_2idEPKNS_5facetE+0x142)[0x7f73f7911192]
/lib64/libstdc++.so.6(_ZNSt6locale5_ImplC1Em+0x1e3)[0x7f73f79115e3]
/lib64/libstdc++.so.6(+0x71555)[0x7f73f7912555]
/lib64/libpthread.so.0(+0x620b)[0x7f74293a620b]
/lib64/libstdc++.so.6(+0x715a1)[0x7f73f79125a1]
/lib64/libstdc++.so.6(_ZNSt6localeC2Ev+0x13)[0x7f73f79125e3]
/lib64/libstdc++.so.6(_ZNSt8ios_base4InitC2Ev+0xbc)[0x7f73f790f43c]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so(+0xb1150)[0x7f73f76e1150]
/lib64/ld-linux-x86-64.so.2(+0xf9c3)[0x7f74298029c3]
/lib64/ld-linux-x86-64.so.2(+0x1459e)[0x7f742980759e]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f74298027d4]
/lib64/ld-linux-x86-64.so.2(+0x13b8b)[0x7f7429806b8b]
/lib64/libdl.so.2(+0xfab)[0x7f742919cfab]
/lib64/ld-linux-x86-64.so.2(+0xf7d4)[0x7f74298027d4]
/lib64/libdl.so.2(+0x15ad)[0x7f742919d5ad]
/lib64/libdl.so.2(dlopen+0x31)[0x7f742919d041]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyImport_FindSharedFuncptr+0x16b)[0x539abb]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyImport_LoadDynamicModuleWithSpec+0x159)[0x503e69]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x501a23]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x46f563]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x5f91)[0x428bc1]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x1fb5)[0x424be5]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x421571]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x437f74]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x3fd)[0x502c8d]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x5ee426]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x437c24]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x422821]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x15af)[0x4241df]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyFunction_Vectorcall+0x90)[0x438570]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x437f74]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyObject_CallMethodIdObjArgs+0xf1)[0x439831]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(PyImport_ImportModuleLevelObject+0x4e6)[0x502d76]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x6e78)[0x429aa8]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(PyEval_EvalCode+0x23)[0x4e1b43]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x5efe34]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python[0x46f563]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(PyVectorcall_Call+0x5c)[0x439d8c]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalFrameDefault+0x76d8)[0x42a308]
/zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/bin/python(_PyEval_EvalCodeWithName+0xadf)[0x4e171f]
======= Memory map: ========
00400000-006f3000 r-xp 00000000 00:32 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
008f2000-008f3000 r--p 002f2000 00:32 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
008f3000-0092b000 rw-p 002f3000 00:32 37327427                           /zapps7/python/3.8.2/gcc-9.2.0/bin/python3.8
0092b000-0094c000 rw-p 00000000 00:00 0 
00a61000-01e9b000 rw-p 00000000 00:00 0                                  [heap]
7f73f0000000-7f73f0021000 rw-p 00000000 00:00 0 
7f73f0021000-7f73f4000000 ---p 00000000 00:00 0 
7f73f7630000-7f73f76cd000 r--p 00000000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f73f76cd000-7f73f781c000 r-xp 0009d000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f73f781c000-7f73f7890000 r--p 001ec000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f73f7890000-7f73f7891000 ---p 00260000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f73f7891000-7f73f7896000 r--p 00260000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so
7f73f7896000-7f73f78a0000 rw-p 00265000 00:32 59849580                   /zapps7/deepconsensus/1.1.0/python-3.8.2_cpu/lib/python3.8/site-packages/google/protobuf/pyext/_message.cpython-38-x86_64-linux-gnu.so

I've messed with dependency versions but haven't found a combination that works.