Segmentation faults during isoseq3 cluster of many SMRT cells

ddpinto commented 4 years ago

Operating system: Linux

Package name: IsoSeq3, version 3.3.0

Describe the bug: Running isoseq3 cluster on data from 8 2M Sequel II SMRT cells consistently ends in a segmentation fault and a core dump. Running the cells individually, or in sets of e.g. 4, works well, but combining them into larger sets results in crashes. At the time of the crash the cluster job is using >400 GB of RAM, and the core dumps are of similar size.

Error message: line 23: 14606 Segmentation fault (core dumped)
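For anyone trying to reproduce a report like this, it can help to record both how the process died and how much memory it peaked at. The following is a minimal sketch, assuming Linux and Python 3; `describe_exit` and `run_and_report` are hypothetical helper names and are not part of isoseq3 or any PacBio tool.

```python
import resource
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable status."""
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


def run_and_report(cmd):
    """Run cmd (a list of arguments); return (status text, peak child RSS in GiB)."""
    proc = subprocess.run(cmd)
    # On Linux, ru_maxrss is reported in kilobytes. RUSAGE_CHILDREN covers the
    # largest child this Python process has waited for so far.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return describe_exit(proc.returncode), peak_kib / (1024 ** 2)


# Harmless demonstration command; substitute the real command line as a list.
status, peak_gib = run_and_report([sys.executable, "-c", "pass"])
print("%s, peak RSS %.3f GiB" % (status, peak_gib))
```

A process that dies the way the error message above describes would be reported as "killed by SIGSEGV", alongside the peak resident memory it reached before the crash.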

Conda environment

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
avro-python3              1.9.0                    py37_0    bioconda
biopython                 1.76             py37h516909a_0    conda-forge
brotlipy                  0.7.0           py37h8f50634_1000    conda-forge
bzip2                     1.0.8                h516909a_2    conda-forge
ca-certificates           2020.4.5.1           hecc5488_0    conda-forge
certifi                   2020.4.5.1       py37hc8dfbb8_0    conda-forge
cffi                      1.14.0           py37hd463f26_0    conda-forge
chardet                   3.0.4           py37hc8dfbb8_1006    conda-forge
cryptography              2.9.2            py37hb09aad4_0    conda-forge
curl                      7.69.1               h33f0ec9_0    conda-forge
idna                      2.9                        py_1    conda-forge
iso8601                   0.1.12                   pypi_0    pypi
isoseq3                   3.3.0                         0    bioconda
krb5                      1.17.1               h2fd8d38_0    conda-forge
ld_impl_linux-64          2.34                 h53a641e_0    conda-forge
libblas                   3.8.0               16_openblas    conda-forge
libcblas                  3.8.0               16_openblas    conda-forge
libcurl                   7.69.1               hf7181ac_0    conda-forge
libdeflate                1.5                  h516909a_0    conda-forge
libedit                   3.1.20170329      hf8c457e_1001    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 9.2.0                h24d8f2e_2    conda-forge
libgfortran-ng            7.3.0                hdf63c60_5    conda-forge
liblapack                 3.8.0               16_openblas    conda-forge
libopenblas               0.3.9                h5ec1e0e_0    conda-forge
libssh2                   1.8.2                h22169c7_2    conda-forge
libstdcxx-ng              9.2.0                hdf63c60_2    conda-forge
lima                      1.11.0                        0    bioconda
llvm-openmp               10.0.0               hc9558a2_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
numpy                     1.18.1           py37h8960a57_1    conda-forge
openssl                   1.1.1g               h516909a_0    conda-forge
pbccs                     4.2.0                         1    bioconda
pbcommand                 2.0.1                    pypi_0    pypi
pbcore                    2.0.8                    pypi_0    pypi
pbcoretools               0.7.5                    pypi_0    pypi
pip                       20.1               pyh9f0ad1d_0    conda-forge
pycparser                 2.20                       py_0    conda-forge
pyopenssl                 19.1.0                     py_1    conda-forge
pysam                     0.15.4           py37hbcae180_0    bioconda
pysocks                   1.7.1            py37hc8dfbb8_1    conda-forge
python                    3.7.6           h8356626_5_cpython    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pytz                      2020.1             pyh9f0ad1d_0    conda-forge
readline                  8.0                  hf8c457e_0    conda-forge
requests                  2.23.0             pyh8c360ce_2    conda-forge
setuptools                46.1.3           py37hc8dfbb8_0    conda-forge
six                       1.14.0                     py_1    conda-forge
sqlite                    3.30.1               hcee41ef_0    conda-forge
tk                        8.6.10               hed695b0_0    conda-forge
urllib3                   1.25.9                     py_0    conda-forge
wheel                     0.34.2                     py_1    conda-forge
xz                        5.2.5                h516909a_0    conda-forge
zlib                      1.2.11            h516909a_1006    conda-forge

armintoepfer commented 4 years ago

1) Is your sample targeted?

2) Running 8x 8M cells is really heavy.

3) Are you starting with HiFi or unpolished CCS? Maybe try running CCS with --min-passes 2 --min-rq 0.9 and then feed that into the Iso-Seq pipeline.

ddpinto commented 4 years ago

Samples are not targeted. I am starting from HiFi reads and running CCS with --min-passes 3 --min-rq 0.9. I could share the data files or a core dump (though it is very large) for troubleshooting.

armintoepfer commented 4 years ago

I suspect that you are simply running out of memory. So far, we have never run more than 3-4 8M cells at once. The only thing you can do for now is use a high-memory compute server. Reducing the memory footprint is not an easy task, and you are the first one to report issues. If that solution leaves you unhappy, feel free to report an issue with PacBio's tech support, but even if we were to implement changes, it won't be in the near term.
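Since sets of up to 4 cells were reported to work, one stopgap is to split the inputs into batches of that size and cluster each batch on its own. The sketch below only does the splitting; the file names and the `batch_cells` helper are hypothetical. Note that each batch would be clustered separately, so this is not equivalent to clustering all 8 cells together.

```python
def batch_cells(cell_files, max_per_batch=4):
    """Split input files into consecutive batches of at most max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    return [
        cell_files[i:i + max_per_batch]
        for i in range(0, len(cell_files), max_per_batch)
    ]


# Hypothetical file names, one per SMRT cell.
cells = ["cell%d.flnc.bam" % n for n in range(1, 9)]

for index, batch in enumerate(batch_cells(cells), start=1):
    print("batch %d: %s" % (index, ", ".join(batch)))
```

With 8 input files and the default batch size, this yields two batches of 4, matching the set size that was reported to run without crashing.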

ddpinto commented 4 years ago

I am already running on a 1 TB node, but when the crash occurs the process is using "only" 400 GB.
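Nothing in this thread establishes why the crash happens at about 400 GB on a 1 TB node. One possibility that is cheap to rule out, and that is purely an assumption here, is a per-process resource limit set by the shell or the job scheduler. A minimal sketch, assuming Linux and Python 3, that prints the limits seen by the current process (`format_limit` and `memory_limits` are hypothetical helper names):

```python
import resource


def format_limit(value):
    """Render a byte limit as GiB, or 'unlimited' when no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.1f GiB" % (value / 1024 ** 3)


def memory_limits():
    """Return {limit name: (soft, hard)} for the memory-related rlimits."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (format_limit(soft), format_limit(hard))
    return limits


for name, (soft, hard) in memory_limits().items():
    print("%s soft=%s hard=%s" % (name, soft, hard))
```

The values are only meaningful when the script runs in the same environment as the failing job (same shell, same scheduler allocation). Limits enforced by other mechanisms, such as cgroups, would not show up here.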

PacificBiosciences / pbbioconda, issue #310