segmentation fault - Githubissues

nvpatin commented 2 years ago

I am receiving a "segmentation fault" error when I try to run DEICODE auto-rpca. I've tried running it both in QIIME2 and standalone, with the standalone as the most recent version installed with conda ("conda install -c conda-forge deicode") both locally and on an HPC system, with the same outcome. My full command is

deicode auto-rpca --in-biom metaflye_hybrid_dfs.biom --output-dir metaflye-hybrid-deicode

And the error message is:

/var/spool/slurmd/job116796/slurm_script: line 22: 82401 Segmentation fault

It's not very informative. On the HPC I tried increasing the memory allocation to 500GB (10 nodes) with no success. I attached the tab-separated text version of my BIOM file here in case that is helpful.

Any suggestions are greatly appreciated.

metaflye_hybrid_dfs.txt https://github.com/biocore/DEICODE/files/9688049/metaflye_hybrid_dfs.txt

mortonjt commented 2 years ago

Hmm, weird -- I don't think it is a memory issue. More likely it is a software dependency issue. Could you provide the QIIME2 version and the conda environment? You can get the latter from the output of conda env export.

nvpatin commented 2 years ago

Thanks for the quick response! Here is the output of 'conda env export':

name: qiime2-2022.2
channels:
qiime2/label/r2022.2
conda-forge
bioconda
defaults
dependencies:
_r-mutex=1.0.1=anacondar_1
aioeasywebdav=2.4.0=py38h50d1736_1001
aiohttp=3.8.1=py38hed1de0f_1
aiosignal=1.2.0=pyhd8ed1ab_0
amply=0.1.5=pyhd8ed1ab_0
appdirs=1.4.4=pyh9f0ad1d_0
appnope=0.1.3=pyhd8ed1ab_0
argon2-cffi=21.3.0=pyhd8ed1ab_0
argon2-cffi-bindings=21.2.0=py38hed1de0f_2
argtable2=2.13=h1de35cc_1001
asttokens=2.0.5=pyhd8ed1ab_0
async-timeout=4.0.2=pyhd8ed1ab_0
attmap=0.13.2=pyhd8ed1ab_0
attrs=21.4.0=pyhd8ed1ab_0
backcall=0.2.0=pyh9f0ad1d_0
backports=1.0=py_2
backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
bcrypt=3.2.2=py38h0dd4459_0
beautifulsoup4=4.11.1=pyha770c72_0
bibtexparser=1.1.0=py_0
bioconductor-biobase=2.54.0=r41haba8685_1
bioconductor-biocgenerics=0.40.0=r41hdfd78af_0
bioconductor-biocparallel=1.28.3=r41h7cba510_0
bioconductor-biostrings=2.62.0=r41haba8685_1
bioconductor-dada2=1.22.0=r41h7cba510_1
bioconductor-delayedarray=0.20.0=r41haba8685_1
bioconductor-genomeinfodb=1.30.0=r41hdfd78af_0
bioconductor-genomeinfodbdata=1.2.7=r41hdfd78af_1
bioconductor-genomicalignments=1.30.0=r41haba8685_1
bioconductor-genomicranges=1.46.1=r41haba8685_0
bioconductor-iranges=2.28.0=r41haba8685_1
bioconductor-matrixgenerics=1.6.0=r41hdfd78af_0
bioconductor-rhtslib=1.26.0=r41haba8685_1
bioconductor-rsamtools=2.10.0=r41h7cba510_1
bioconductor-s4vectors=0.32.3=r41haba8685_0
bioconductor-shortread=1.52.0=r41h7cba510_1
bioconductor-summarizedexperiment=1.24.0=r41hdfd78af_0
bioconductor-xvector=0.34.0=r41haba8685_1
bioconductor-zlibbioc=1.40.0=r41haba8685_1
biom-format=2.1.10=py38hbe852b5_1
biopython=1.79=py38hed1de0f_2
blast=2.12.0=h0370960_3
bleach=5.0.0=pyhd8ed1ab_0
bokeh=2.4.3=py38h50d1736_0
boto3=1.24.18=pyhd8ed1ab_0
botocore=1.27.19=pyhd8ed1ab_0
bowtie2=2.4.5=py38h85b03c0_2
brotli=1.0.9=h5eb16cf_7
brotli-bin=1.0.9=h5eb16cf_7
brotlipy=0.7.0=py38hed1de0f_1004
bwidget=1.9.14=h694c41f_1
bzip2=1.0.8=h0d85af4_4
c-ares=1.18.1=h0d85af4_0
ca-certificates=2022.6.15=h033912b_0
cachecontrol=0.12.11=pyhd8ed1ab_0
cached-property=1.5.2=hd8ed1ab_1
cached_property=1.5.2=pyha770c72_1
cachetools=5.0.0=pyhd8ed1ab_0
cairo=1.16.0=h1680b09_1011
cctools_osx-64=973.0.1=h3eff9a4_10
certifi=2022.6.15=py38h50d1736_0
cffi=1.15.0=py38h1a44b6c_0
charset-normalizer=2.0.12=pyhd8ed1ab_0
clang=14.0.4=h694c41f_0
clang-14=14.0.4=default_h55ffa42_0
clang_osx-64=14.0.4=h3a95cd4_2
clangxx=14.0.4=default_h55ffa42_0
clangxx_osx-64=14.0.4=he1dbc44_2
click=7.1.2=pyh9f0ad1d_0
clustalo=1.2.4=h9722bc1_5
coin-or-cbc=2.10.8=hc8a182d_0
coin-or-cgl=0.60.6=hdb64514_0
coin-or-clp=1.17.6=hc022024_3
coin-or-osi=0.108.7=h009c923_0
coin-or-utils=2.11.6=h07ff368_0
coincbc=2.10.8=0_metapackage
colorama=0.4.4=pyh9f0ad1d_0
compiler-rt=14.0.4=h7fcd477_0
compiler-rt_osx-64=14.0.4=h6df654d_0
configargparse=1.5.3=pyhd8ed1ab_0
connection_pool=0.0.3=pyhd3deb0d_0
cryptography=37.0.1=py38hf6deb26_0
curl=7.83.1=h23f1065_0
cutadapt=4.0=py38h431a6f7_0
cycler=0.11.0=pyhd8ed1ab_0
cython=0.29.30=py38h1c67a95_0
datrie=0.8.2=py38h96a0964_3
deblur=1.1.0=py_2
debugpy=1.6.0=py38h038c8f4_0
decorator=4.4.2=py_0
defusedxml=0.7.1=pyhd8ed1ab_0
deicode=0.2.4=py38h50d1736_1
dendropy=4.5.2=pyh3252c3a_0
dill=0.3.5.1=pyhd8ed1ab_0
dnaio=0.9.0=py38h431a6f7_0
docutils=0.18.1=py38h50d1736_1
dpath=2.0.6=py38h50d1736_1
dropbox=11.32.0=pyhd8ed1ab_0
emperor=1.0.3=py38h50d1736_0
entrez-direct=16.2=h193322a_0
entrypoints=0.4=pyhd8ed1ab_0
executing=0.8.3=pyhd8ed1ab_0
expat=2.4.8=h96cf925_0
fastcluster=1.2.6=py38hb872667_1
fasttree=2.1.10=0
filechunkio=1.8=py_2
filelock=3.7.1=pyhd8ed1ab_0
flit-core=3.7.1=pyhd8ed1ab_0
font-ttf-dejavu-sans-mono=2.37=hab24e00_0
font-ttf-inconsolata=3.000=h77eed37_0
font-ttf-source-code-pro=2.038=h77eed37_0
font-ttf-ubuntu=0.83=hab24e00_0
fontconfig=2.14.0=h676cef8_0
fonts-conda-ecosystem=1=0
fonts-conda-forge=1=0
fonttools=4.33.3=py38h0dd4459_0
freetype=2.10.4=h4cff582_1
fribidi=1.0.10=hbcb3906_0
frozenlist=1.3.0=py38hed1de0f_1
ftputil=5.0.4=pyhd8ed1ab_0
future=0.18.2=py38h50d1736_5
gettext=0.19.8.1=hd1a6beb_1008
gfortran_impl_osx-64=9.3.0=h9cc0e5e_23
gfortran_osx-64=9.3.0=h18f7dce_15
giflib=5.2.1=hbcb3906_2
gitdb=4.0.9=pyhd8ed1ab_0
gitpython=3.1.27=pyhd8ed1ab_0
gmp=6.2.1=h2e338ed_0
gneiss=0.4.6=py_0
google-api-core=2.8.2=pyhd8ed1ab_0
google-api-python-client=2.51.0=pyhd8ed1ab_0
google-auth=2.8.0=pyh6c4a22f_0
google-auth-httplib2=0.1.0=pyhd8ed1ab_1
google-cloud-core=2.3.1=pyhd8ed1ab_0
google-cloud-storage=2.4.0=pyh6c4a22f_0
google-crc32c=1.1.2=py38h5f3b482_3
google-resumable-media=2.3.3=pyhd8ed1ab_0
googleapis-common-protos=1.56.3=py38h50d1736_0
graphite2=1.3.13=h2e338ed_1001
grpcio=1.46.3=py38h0fbd9b5_0
gsl=2.7=h93259b0_0
h5py=3.6.0=nompi_py38h9f21798_100
harfbuzz=4.3.0=h00bb2c2_0
hdf5=1.12.1=nompi_h0aa1fa2_104
hdmedians=0.14.2=py38hbe852b5_1
hmmer=3.1b2=3
htslib=1.15.1=hc057d7f_0
httplib2=0.20.4=pyhd8ed1ab_0
icu=70.1=h96cf925_0
idna=3.3=pyhd8ed1ab_0
ijson=3.1.3=pyhd3deb0d_0
importlib-metadata=4.11.4=py38h50d1736_0
importlib_resources=5.7.1=pyhd8ed1ab_1
iniconfig=1.1.1=pyh9f0ad1d_0
ipykernel=6.13.0=py38h60dac5d_0
ipython=8.4.0=py38h50d1736_0
ipython_genutils=0.2.0=py_1
ipywidgets=7.7.0=pyhd8ed1ab_0
iqtree=2.2.0_beta=h135ad0d_1
isa-l=2.30.0=h0d85af4_4
isl=0.22.1=hb1e8313_2
jedi=0.18.1=py38h50d1736_1
jinja2=3.1.2=pyhd8ed1ab_0
jmespath=1.0.1=pyhd8ed1ab_0
joblib=1.1.0=pyhd8ed1ab_0
jpeg=9e=h5eb16cf_1
jsonschema=4.5.1=pyhd8ed1ab_0
jupyter_client=7.3.1=pyhd8ed1ab_0
jupyter_core=4.10.0=py38h50d1736_0
jupyterlab_pygments=0.2.2=pyhd8ed1ab_0
jupyterlab_widgets=1.1.0=pyhd8ed1ab_0
kiwisolver=1.4.2=py38h8b7791e_1
krb5=1.19.3=hb98e516_0
lcms2=2.12=h577c468_0
ld64_osx-64=609=h1e06c2b_10
lerc=3.0=he49afe7_0
libblas=3.9.0=14_osx64_openblas
libbrotlicommon=1.0.9=h5eb16cf_7
libbrotlidec=1.0.9=h5eb16cf_7
libbrotlienc=1.0.9=h5eb16cf_7
libcblas=3.9.0=14_osx64_openblas
libclang-cpp14=14.0.4=default_h55ffa42_0
libcrc32c=1.1.2=he49afe7_0
libcurl=7.83.1=h23f1065_0
libcxx=14.0.4=hc203e6f_0
libdeflate=1.10=h0d85af4_0
libedit=3.1.20191231=h0678c8f_2
libev=4.33=haf1e3a3_1
libffi=3.4.2=h0d85af4_5
libgcc=4.8.5=1
libgfortran=5.0.0=9_3_0_h6c81a4c_23
libgfortran-devel_osx-64=9.3.0=h6c81a4c_23
libgfortran5=9.3.0=h6c81a4c_23
libglib=2.70.2=hf1fb8c0_4
libiconv=1.16=haf1e3a3_0
liblapack=3.9.0=14_osx64_openblas
liblapacke=3.9.0=14_osx64_openblas
libllvm10=10.0.1=h009f743_3
libllvm13=13.0.1=h64f94b2_2
libllvm14=14.0.4=h41df66c_0
libnghttp2=1.47.0=hca56917_0
libopenblas=0.3.20=openmp_hb3cd9ec_0
libpng=1.6.37=h7cec526_2
libprotobuf=3.20.1=h2292cb8_0
libsodium=1.0.18=hbcb3906_1
libssh2=1.10.0=hd3787cc_2
libtiff=4.3.0=hfca7e8f_4
libwebp=1.2.2=h28dabe5_0
libwebp-base=1.2.2=h0d85af4_1
libxcb=1.13=h0d85af4_1004
libxml2=2.9.14=h08a9926_0
libxslt=1.1.33=h5bff336_4
libzlib=1.2.11=h6c3fc93_1014
llvm-openmp=14.0.4=ha654fa7_0
llvm-tools=14.0.4=h41df66c_0
llvmlite=0.36.0=py38h872f124_0
lockfile=0.12.2=py_1
logmuse=0.2.6=pyh8c360ce_0
lxml=4.8.0=py38hed1de0f_3
lz4=4.0.0=py38h5cd37e2_2
lz4-c=1.9.3=he49afe7_1
mafft=7.505=ha5712d3_0
make=4.3=h22f3db7_1
markupsafe=2.1.1=py38hed1de0f_1
matplotlib=3.5.2=py38h50d1736_0
matplotlib-base=3.5.2=py38h1b6b9d1_0
matplotlib-inline=0.1.3=pyhd8ed1ab_0
mistune=0.8.4=py38h96a0964_1005
mpc=1.2.1=hbb51d92_0
mpfr=4.1.0=h0f52abe_1
msgpack-python=1.0.3=py38h8b7791e_1
multidict=6.0.2=py38hed1de0f_1
munkres=1.1.4=pyh9f0ad1d_0
muscle=5.1=hb339e23_1
natsort=8.1.0=pyhd8ed1ab_0
nbclient=0.6.3=pyhd8ed1ab_0
nbconvert=6.5.0=pyhd8ed1ab_0
nbconvert-core=6.5.0=pyhd8ed1ab_0
nbconvert-pandoc=6.5.0=pyhd8ed1ab_0
nbformat=5.4.0=pyhd8ed1ab_0
ncurses=6.3=h96cf925_1
nest-asyncio=1.5.5=pyhd8ed1ab_0
networkx=2.8.2=pyhd8ed1ab_0
nose=1.3.7=py_1006
notebook=6.4.11=pyha770c72_0
numba=0.53.1=py38h5b9a75a_1
numpy=1.22.4=py38h3ad0702_0
oauth2client=4.1.3=py_0
openjdk=11.0.9.1=h2292cb8_3
openjpeg=2.4.0=h6e7aa92_1
openssl=3.0.4=hfe4f2af_2
packaging=21.3=pyhd8ed1ab_0
pandas=1.2.5=py38h1f261ad_0
pandoc=2.18=h694c41f_0
pandocfilters=1.5.0=pyhd8ed1ab_0
pango=1.50.7=hc4a7b6d_0
paramiko=2.11.0=pyhd8ed1ab_0
parso=0.8.3=pyhd8ed1ab_0
patsy=0.5.2=pyhd8ed1ab_0
pbzip2=1.1.13=h9d27c22_1
pcre=8.45=he49afe7_0
pcre2=10.37=ha16e1b2_0
peppy=0.31.2=pyhd8ed1ab_2
perl=5.32.1=2_h0d85af4_perl5
perl-archive-tar=2.40=pl5321hdfd78af_0
perl-carp=1.50=pl5321hd8ed1ab_0
perl-common-sense=3.75=pl5321hdfd78af_0
perl-compress-raw-bzip2=2.103=pl5321h9722bc1_0
perl-compress-raw-zlib=2.105=pl5321h9722bc1_0
perl-encode=3.17=pl5321ha5712d3_0
perl-exporter=5.74=pl5321hd8ed1ab_0
perl-exporter-tiny=1.002002=pl5321hdfd78af_0
perl-extutils-makemaker=7.64=pl5321hd8ed1ab_0
perl-io-compress=2.106=pl5321h9722bc1_0
perl-io-zlib=1.11=pl5321hdfd78af_0
perl-json=4.06=pl5321hdfd78af_0
perl-json-xs=2.34=pl5321hcd10b59_5
perl-list-moreutils=0.430=pl5321hdfd78af_0
perl-list-moreutils-xs=0.430=pl5321ha5712d3_1
perl-parent=0.238=pl5321hd8ed1ab_0
perl-pathtools=3.75=pl5321ha5712d3_3
perl-scalar-list-utils=1.62=pl5321ha5712d3_0
perl-types-serialiser=1.01=pl5321hdfd78af_0
pexpect=4.8.0=pyh9f0ad1d_2
pickleshare=0.7.5=py_1003
pigz=2.6=h5dbffcc_0
pillow=9.1.1=py38h21af888_0
pip=22.1.1=pyhd8ed1ab_0
pixman=0.40.0=hbcb3906_0
plac=1.3.5=pyhd8ed1ab_0
pluggy=1.0.0=py38h50d1736_3
ply=3.11=py_1
prettytable=3.3.0=pyhd8ed1ab_0
prometheus_client=0.14.1=pyhd8ed1ab_0
prompt-toolkit=3.0.29=pyha770c72_0
protobuf=3.20.1=py38h1c67a95_0
psutil=5.9.1=py38h0dd4459_0
pthread-stubs=0.4=hc929b4f_1001
ptyprocess=0.7.0=pyhd3deb0d_0
pulp=2.6.0=py38h50d1736_1
pure_eval=0.2.2=pyhd8ed1ab_0
py=1.11.0=pyh6c4a22f_0
pyasn1=0.4.8=py_0
pyasn1-modules=0.2.8=py_0
pycparser=2.21=pyhd8ed1ab_0
pygments=2.12.0=pyhd8ed1ab_0
pynacl=1.5.0=py38hed1de0f_1
pynndescent=0.5.7=pyh6c4a22f_0
pyopenssl=22.0.0=pyhd8ed1ab_0
pyparsing=3.0.9=pyhd8ed1ab_0
pyrsistent=0.18.1=py38hed1de0f_1
pysftp=0.2.9=py_1
pysocks=1.7.1=py38h50d1736_5
pytest=7.1.2=py38h50d1736_0
python=3.8.13=h66c20e1_0_cpython
python-dateutil=2.8.2=pyhd8ed1ab_0
python-fastjsonschema=2.15.3=pyhd8ed1ab_0
python-irodsclient=1.1.3=pyhd8ed1ab_0
python-isal=0.11.1=py38h96a0964_1
python_abi=3.8=2_cp38
pytz=2022.1=pyhd8ed1ab_0
pyu2f=0.1.5=pyhd8ed1ab_0
pyyaml=6.0=py38hed1de0f_4
pyzmq=23.0.0=py38h34ba744_0
q2-alignment=2022.2.0=py38_0
q2-composition=2022.2.0=py38_0
q2-cutadapt=2022.2.0=py38_0
q2-dada2=2022.2.0=py38_0
q2-deblur=2022.2.0=py38_0
q2-demux=2022.2.0=py38_0
q2-diversity=2022.2.1=py38_0
q2-diversity-lib=2022.2.1=py38_0
q2-emperor=2022.2.0=py38_0
q2-feature-classifier=2022.2.0=py38_0
q2-feature-table=2022.2.0=py38_0
q2-fragment-insertion=2022.2.0=py38_0
q2-gneiss=2022.2.0=py38_0
q2-longitudinal=2022.2.0=py38_0
q2-metadata=2022.2.0=py38_0
q2-mystery-stew=2022.2.0=py38_0
q2-phylogeny=2022.2.0=py38_0
q2-quality-control=2022.2.0=py38_0
q2-quality-filter=2022.2.0=py38_0
q2-sample-classifier=2022.2.0=py38_0
q2-taxa=2022.2.0=py38_0
q2-types=2022.2.0=py38_0
q2-vsearch=2022.2.0=py38_0
q2cli=2022.2.0=py38_0
q2galaxy=2022.2.0=py38_0
q2templates=2022.2.0=py38_0
qiime2=2022.2.1=py38_0
r-backports=1.4.1=r41h28b5c78_0
r-base=4.1.3=h234e2ac_1
r-bh=1.78.0_0=r41hc72bb7e_0
r-bitops=1.0_7=r41h28b5c78_0
r-brio=1.1.3=r41h28b5c78_0
r-callr=3.7.0=r41hc72bb7e_0
r-cli=3.3.0=r41h8619c4b_0
r-cluster=2.1.3=r41h8e0a2a9_0
r-colorspace=2.0_3=r41h0f1d5c4_0
r-crayon=1.5.1=r41hc72bb7e_0
r-desc=1.4.1=r41hc72bb7e_0
r-diffobj=0.3.5=r41h28b5c78_0
r-digest=0.6.29=r41h9951f98_0
r-ellipsis=0.3.2=r41h28b5c78_0
r-evaluate=0.15=r41hc72bb7e_0
r-fansi=1.0.3=r41h0f1d5c4_0
r-farver=2.1.0=r41h9951f98_0
r-formatr=1.12=r41hc72bb7e_0
r-futile.logger=1.4.3=r41hc72bb7e_1003
r-futile.options=1.0.1=r41hc72bb7e_1002
r-ggplot2=3.3.6=r41hc72bb7e_0
r-glue=1.6.2=r41h0f1d5c4_0
r-gtable=0.3.0=r41hc72bb7e_3
r-hwriter=1.3.2.1=r41hc72bb7e_0
r-isoband=0.2.5=r41h9951f98_0
r-jpeg=0.1_9=r41h28b5c78_0
r-jsonlite=1.8.0=r41h0f1d5c4_0
r-labeling=0.4.2=r41hc72bb7e_1
r-lambda.r=1.2.4=r41hc72bb7e_1
r-lattice=0.20_45=r41h28b5c78_0
r-latticeextra=0.6_29=r41hc72bb7e_1
r-lifecycle=1.0.1=r41hc72bb7e_0
r-magrittr=2.0.3=r41h0f1d5c4_0
r-mass=7.3_57=r41h67d6963_0
r-matrix=1.4_1=r41ha2825d1_0
r-matrixstats=0.62.0=r41h0f1d5c4_0
r-mgcv=1.8_40=r41h60b693f_0
r-munsell=0.5.0=r41hc72bb7e_1004
r-nlme=3.1_157=r41h8e0a2a9_0
r-permute=0.9_7=r41hc72bb7e_0
r-pillar=1.7.0=r41hc72bb7e_0
r-pkgconfig=2.0.3=r41hc72bb7e_1
r-pkgload=1.2.4=r41h9951f98_0
r-plyr=1.8.7=r41hc4bb905_0
r-png=0.1_7=r41h28b5c78_1004
r-praise=1.0.0=r41hc72bb7e_1005
r-processx=3.5.3=r41h0f1d5c4_0
r-ps=1.7.0=r41h67d6963_0
r-r6=2.5.1=r41hc72bb7e_0
r-rcolorbrewer=1.1_3=r41h785f33e_0
r-rcpp=1.0.8.3=r41hc4bb905_0
r-rcppparallel=5.1.5=r41h9951f98_0
r-rcurl=1.98_1.6=r41h28b5c78_0
r-rematch2=2.1.2=r41hc72bb7e_1
r-reshape2=1.4.4=r41h9951f98_1
r-rlang=1.0.2=r41hc4bb905_0
r-rprojroot=2.0.3=r41hc72bb7e_0
r-rstudioapi=0.13=r41hc72bb7e_0
r-scales=1.2.0=r41hc72bb7e_0
r-snow=0.4_4=r41hc72bb7e_0
r-stringi=1.7.6=r41ha37f9d2_2
r-stringr=1.4.0=r41hc72bb7e_2
r-testthat=3.1.4=r41h8619c4b_0
r-tibble=3.1.7=r41h67d6963_0
r-utf8=1.2.2=r41h28b5c78_0
r-vctrs=0.4.1=r41hc4bb905_0
r-vegan=2.6_2=r41h6f100c1_0
r-viridislite=0.4.0=r41hc72bb7e_0
r-waldo=0.4.0=r41hc72bb7e_0
r-withr=2.5.0=r41hc72bb7e_0
ratelimiter=1.2.0=py38h32f6830_1001
raxml=8.2.12=ha5712d3_4
readline=8.1=h05e3726_0
requests=2.27.1=pyhd8ed1ab_0
retry=0.9.2=py_0
rsa=4.8=pyhd8ed1ab_0
s3transfer=0.6.0=pyhd8ed1ab_0
samtools=1.15.1=h9f30945_0
scikit-bio=0.5.6=py38hf3d72b9_4
scikit-learn=0.24.1=py38hfd19401_0
scipy=1.8.1=py38h2c99f22_0
seaborn=0.11.2=hd8ed1ab_0
seaborn-base=0.11.2=pyhd8ed1ab_0
send2trash=1.8.0=pyhd8ed1ab_0
sepp=4.3.10=py38h3252c3a_2
setuptools=59.8.0=py38h50d1736_1
sigtool=0.1.3=h88f4db0_0
six=1.16.0=pyh6c4a22f_0
slacker=0.14.0=py_0
smart_open=6.0.0=pyhd8ed1ab_0
smmap=3.0.5=pyh44b312d_0
snakemake=7.8.3=hdfd78af_0
snakemake-minimal=7.8.3=pyhdfd78af_0
sortmerna=2.0=h5c9b4e4_4
soupsieve=2.3.1=pyhd8ed1ab_0
sqlite=3.38.5=hd9f0692_0
stack_data=0.2.0=pyhd8ed1ab_0
statsmodels=0.13.2=py38hbe852b5_0
stone=3.3.1=pyhd8ed1ab_0
stopit=1.1.2=py_0
tabulate=0.8.10=pyhd8ed1ab_0
tapi=1100.0.11=h9ce4665_0
tbb=2020.2=h940c156_4
terminado=0.15.0=py38h50d1736_0
threadpoolctl=3.1.0=pyh8a188c0_0
tinycss2=1.1.1=pyhd8ed1ab_0
tk=8.6.12=h5dbffcc_0
tktable=2.10=h49f0cf7_3
tomli=2.0.1=pyhd8ed1ab_0
toposort=1.7=pyhd8ed1ab_0
tornado=6.1=py38hed1de0f_3
tqdm=4.64.0=pyhd8ed1ab_0
traitlets=5.2.1.post0=pyhd8ed1ab_0
typing-extensions=4.2.0=hd8ed1ab_1
typing_extensions=4.2.0=pyha770c72_1
tzlocal=2.1=pyh9f0ad1d_0
ubiquerg=0.6.1=pyh9f0ad1d_0
umap-learn=0.5.3=py38h50d1736_0
unicodedata2=14.0.0=py38hed1de0f_1
unifrac=0.20.3=py38hdf9d520_0
uritemplate=4.1.1=pyhd8ed1ab_0
urllib3=1.26.9=pyhd8ed1ab_0
veracitools=0.1.3=py_0
vsearch=2.7.0=1
wcwidth=0.2.5=pyh9f0ad1d_2
webencodings=0.5.1=py_1
wheel=0.37.1=pyhd8ed1ab_0
widgetsnbextension=3.6.0=py38h50d1736_0
wrapt=1.14.1=py38h0dd4459_0
xopen=1.5.0=py38h50d1736_0
xorg-libxau=1.0.9=h35c211d_0
xorg-libxdmcp=1.1.3=h35c211d_0
xz=5.2.5=haf1e3a3_1
yaml=0.2.5=h0d85af4_2
yarl=1.7.2=py38hed1de0f_2
yte=1.5.1=py38h50d1736_0
zeromq=4.3.4=he49afe7_1
zipp=3.8.0=pyhd8ed1ab_0
zlib=1.2.11=h6c3fc93_1014
zstd=1.5.2=ha9df2e0_1
pip:
- empress==1.2.0.dev0
- iow==0.1.3
prefix: /Users/nastassia.patin/miniconda3/envs/qiime2-2022.2

nvpatin commented 2 years ago

I resolved this problem by changing my parameters for --p-min-feature-count and --p-min-sample-count in QIIME2 (I changed both to 1). I'm not sure how it could be resolved in the standalone DEICODE. This issue can be considered resolved now.

nvpatin commented 2 years ago

Hi again, I am getting another segmentation fault after I increased the size of my data set. Once again it occurs with both the standalone and QIIME2 plug-in for DEICODE. I tried changing several parameters; here are my most recent:

in QIIME2

qiime deicode rpca --i-table metaflye_hybrid_illumina_dfs.qza --p-min-feature-count 1 --p-min-sample-count 500 --o-biplot metaflye_hybrid_illumina_deicode_ordination.qza --o-distance-matrix metaflye_hybrid_illumina_deicode_distance.qza --p-max-iterations 1 --p-n-components 2

Standalone DEICODE

deicode auto-rpca --in-biom metaflye_hybrid_illumina_dfs.biom --output-dir Lasker2019_ORFs_metaflye_illumina_hybrid-deicode

I attached the tab-separated text file of the table that I converted to BIOM and QIIME2 .qza formats. It is not a standard amplicon data set; rather, it reflects the number of taxonomically annotated ORFs for a set of shotgun metagenomes. I did successfully use a similar but smaller file earlier.

metaflye_hybrid_illumina_dfs.txt https://github.com/biocore/DEICODE/files/9700329/metaflye_hybrid_illumina_dfs.txt

mortonjt commented 2 years ago

Hi, it may be worthwhile to compute feature_count (i.e. the number of samples a given feature is observed in) as well as sample_count (i.e. the number of features, such as OTUs, observed in a given sample):

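import pandas as pd
import qiime2

# View the feature table as a DataFrame (rows are samples, columns are features).
df = qiime2.Artifact.load('metaflye_hybrid_illumina_dfs.qza').view(pd.DataFrame)
# Number of samples in which each feature is observed.
feature_count = (df > 0).sum(axis=0)
# Number of features observed in each sample.
sample_count = (df > 0).sum(axis=1)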

If you have any microbes that aren't observed in any samples, or samples with no microbes, that will lead to a segfault.
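The check described above can be sketched on a small stand-in table. The helper name `find_empty` and the toy data are illustrative assumptions; a real table would come from a QIIME2 artifact viewed as a pandas DataFrame, with samples as rows and features as columns.

```python
import pandas as pd


def find_empty(df: pd.DataFrame):
    """Return (empty_samples, empty_features) for a samples x features count table."""
    present = df > 0
    sample_count = present.sum(axis=1)   # features observed in each sample
    feature_count = present.sum(axis=0)  # samples each feature is observed in
    empty_samples = sample_count[sample_count == 0].index.tolist()
    empty_features = feature_count[feature_count == 0].index.tolist()
    return empty_samples, empty_features


# Toy table (hypothetical data): sample "s3" and feature "f2" contain only zeros.
toy = pd.DataFrame(
    [[5, 0, 1], [2, 0, 0], [0, 0, 0]],
    index=["s1", "s2", "s3"],
    columns=["f1", "f2", "f3"],
)
empty_samples, empty_features = find_empty(toy)
print(empty_samples, empty_features)
```

Any IDs reported by a check like this would be candidates for removal before running RPCA.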

nvpatin commented 2 years ago

I did as you suggested. My min(feature_count) is 1 and my min(sample_count) is 121.

nvpatin commented 2 years ago

Are there ways to modify the default parameters in the standalone DEICODE command? I know how to do it in QIIME2 but I would like to try similar modifications in the standalone tool.

cameronmartino commented 2 years ago

Yes and thanks for posting the issue and the data. I was able to replicate the issue. I will need to dig into the error because it is not immediately apparent why it is happening. I have run much bigger and more square datasets without an issue, so I am not convinced it is entirely memory related but it does seem related to the feature space size.

A temporary fix is to reduce the feature space with a frequency filter. On my laptop I had to use a pretty extreme filter, removing any features present in fewer than 50% of the samples. On a compute cluster you may be able to use a lower threshold (e.g. 10).

The following commands both worked on my laptop:

deicode rpca --in-biom metaflye_hybrid_illumina_dfs.biom --output-dir metaflye_hybrid_illumina_dfs_test --min-feature-frequency 50

qiime deicode rpca --i-table metaflye_hybrid_illumina_dfs.qza --output-dir metaflye_hybrid_illumina_dfs_test --p-min-feature-frequency 50

and here are all the parameters in the standalone command:

Usage: deicode rpca [OPTIONS]

  Runs RPCA with an rclr preprocessing step.

Options:
  --in-biom TEXT                  Input table in biom format.  [required]
  --output-dir TEXT               Location of output files.  [required]
  --n_components INTEGER          The underlying low-rank structure. The input
                                  can be an integer (suggested: 1 < rank < 10)
                                  [minimum 2]. Note: as the rank increases the
                                  runtime will increase dramatically.
                                  [default: 3]
  --min-sample-count INTEGER      Minimum sum cutoff of sample across all
                                  features. The value can be at minimum zero
                                  and must be an whole integer. It is
                                  suggested to be greater than or equal to
                                  500.  [default: 500]
  --min-feature-count INTEGER     Minimum sum cutoff of features across all
                                  samples. The value can be at minimum zero
                                  and must be an whole integer  [default: 10]
  --min-feature-frequency INTEGER
                                  Minimum percentage of samples a feature must
                                  appear with a value greater than zero. This
                                  value can range from 0 to 100 with decimal
                                  values allowed.  [default: 0]
  --max_iterations INTEGER        The number of iterations to optimize the
                                  solution (suggested to be below 100; beware
                                  of overfitting) [minimum 1]  [default: 5]
  --help                          Show this message and exit.
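
As a rough illustration of what the three filtering options describe, the following pandas sketch applies them to a samples x features table. It is written from the help text above, not from DEICODE's source, so the order of the steps, the use of `>=` rather than `>`, and the function name `filter_table` are all assumptions.

```python
import pandas as pd


def filter_table(df, min_sample_count=500, min_feature_count=10,
                 min_feature_frequency=0.0):
    """Filter a samples x features count table (rows are samples)."""
    # Drop features whose total count across all samples is below the cutoff.
    df = df.loc[:, df.sum(axis=0) >= min_feature_count]
    # Drop features observed (count > 0) in too small a percentage of samples.
    frequency = (df > 0).mean(axis=0) * 100
    df = df.loc[:, frequency >= min_feature_frequency]
    # Drop samples whose total count over the remaining features is below the cutoff.
    df = df.loc[df.sum(axis=1) >= min_sample_count]
    return df


# Toy table (hypothetical data): "f2" appears in 25% of samples, "s4" has 3 counts.
toy = pd.DataFrame(
    {"f1": [400, 300, 500, 2], "f2": [200, 0, 0, 0], "f3": [150, 400, 300, 1]},
    index=["s1", "s2", "s3", "s4"],
)
filtered = filter_table(toy, min_feature_frequency=50)
print(filtered)
```

In this toy case the 50% frequency filter removes `f2` and the sample-count filter removes `s4`, which shows how quickly a high `--min-feature-frequency` can shrink a sparse table.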

nvpatin commented 2 years ago

Thanks for this information. I was able to run it on a cluster with --min-feature-frequency 40, which is higher than I would like but will be ok for now.

If you can identify the problem please let me know! I appreciate the responses and effort.

nvpatin commented 2 years ago

To follow up on this puzzle: I thought the problem might be that one set of samples (~1/3 of the whole data set) is extremely sparse in its composition, with lots of zeros and otherwise generally low count values compared to the other 2/3 of samples. So I converted the count table to a presence/absence matrix and tried to run it again, but I STILL got a segfault with the default parameters in standalone deicode! I attached the presence/absence matrix as a text file here. It won't let me upload the BIOM table but I can email it to you if you would like.

metaflye_hybrid_illumina_dfs-presabs.txt https://github.com/biocore/DEICODE/files/9729350/metaflye_hybrid_illumina_dfs-presabs.txt

mortonjt commented 2 years ago

This won't work with presence/absence matrices (zeros are treated as missing).
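
To see why a presence/absence table carries no signal here, consider a minimal sketch of a robust centered log-ratio (rclr) style transform, the preprocessing step named in the `deicode rpca` help text. This is a simplified NumPy illustration, not DEICODE's own code: it assumes zeros are masked as missing and that each sample's log values are centered on the mean of its observed entries. The function name `rclr_sketch` and the toy counts are hypothetical.

```python
import numpy as np


def rclr_sketch(counts):
    """Log-transform observed (nonzero) entries and center each sample (row)."""
    counts = np.asarray(counts, dtype=float)
    observed = counts > 0
    # Zeros become NaN so they are treated as missing rather than as values.
    logged = np.full(counts.shape, np.nan)
    logged[observed] = np.log(counts[observed])
    # Center each row on the mean of its observed log values.
    row_means = np.nanmean(logged, axis=1, keepdims=True)
    return logged - row_means


counts = np.array([[10, 0, 90], [5, 20, 0]])
presence = (counts > 0).astype(int)

transformed_counts = rclr_sketch(counts)
transformed_presence = rclr_sketch(presence)
print(transformed_counts)    # observed entries vary within each sample
print(transformed_presence)  # observed entries are all 0, since log(1) = 0
```

Under these assumptions, every observed entry of a presence/absence table transforms to zero and every absent entry is missing, so there is no variation left for the low-rank fit.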

nvpatin commented 2 years ago

Ok, so would you agree that is probably the original source of the problem? Seems like I may need to transform the data somehow (maybe just add a count of 1 to every value).

biocore / DEICODE

segmentation fault #65
