Closed eboileau closed 1 year ago
@eboileau
The `/tmp` partition is relatively small, but there is often another scratch partition on the node offering temporary storage that you can use:

```shell
export TMPDIR=/path/to/local/scratch
```

before running. (`df -ih` shows disk and inode usage.)
We also need to fix the cmdstanpy verbose output...
Does this work?
```python
import logging

cmdstanpy_logger = logging.getLogger("cmdstanpy")
cmdstanpy_logger.disabled = True
```
https://mc-stan.org/cmdstanpy/users-guide/outputs.html#logging
Or is the verbose output coming directly from the stan executable rather than from CmdStanPy?
@lkeegan thanks for your feedback.
Yes, for the first issue, I'm currently looking into this. CmdStanPy uses Python's standard-library `tempfile` module, and there is also an `output_dir` argument; I'm not sure whether that could be used to point to a `/scratch` partition. I need to check the documentation. I'm running again, monitoring disk usage, etc.
For the verbose output, I will try this, thanks.
I'm not sure whether setting the `output_dir` argument would affect where any intermediate temporary files go, but you can directly set which tmp dir Python's `tempfile` module uses via environment variables, e.g.

```shell
export TMPDIR=/path/to/local/scratch
```

see https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp (this also applies to `NamedTemporaryFile`, `TemporaryFile`, etc.)
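As a quick sanity check (a minimal, generic sketch, nothing project-specific), you can verify which directory `tempfile` will actually use. Note that `tempfile` caches its choice the first time it is needed, so the cache must be cleared if `TMPDIR` changes mid-process:

```python
import os
import tempfile

# Use /tmp here as a stand-in for /path/to/local/scratch.
os.environ["TMPDIR"] = "/tmp"

# tempfile caches its default directory; reset the cache so the
# TMPDIR environment variable is re-read by gettempdir().
tempfile.tempdir = None
print(tempfile.gettempdir())
```

In a batch job you would normally export `TMPDIR` before Python starts, in which case no cache reset is needed.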
Ok, still running, but the conclusions are pretty clear... and they are "independent" of our own parallelisation implementation. Each time a model is sampled, a directory is created, e.g. `/tmp/tmprq4hm10f/periodic-gaussian-mixturezzj4r4wr`. Here we sample two models (`periodic-gaussian-mixture` and `gaussian-naive-bayes`), but for each ORF, so for ~500,000 ORFs this makes ~1M sub-directories... and each one contains 2 files per chain (we have 4 chains). The files are small, ~50K each, but for 1M directories we quickly reach over 50GB... The tmp directory is only cleaned on exit, via `atexit.register(_cleanup_tmpdir)`.
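Taking the figures above at face value (~500,000 ORFs, 2 models per ORF, a handful of ~50K files per sampler directory; all numbers come from the observation above, the file count per directory is an assumption), a back-of-the-envelope estimate shows why the partition fills up:

```python
# Rough footprint of cmdstanpy's per-sample temp directories, using the
# numbers reported above (assumed, not measured here).
n_orfs = 500_000
models_per_orf = 2   # periodic-gaussian-mixture + gaussian-naive-bayes
files_per_dir = 8    # assumption: 2 files per chain x 4 chains
file_kb = 50         # ~50K each

n_dirs = n_orfs * models_per_orf
total_gb = n_dirs * files_per_dir * file_kb / 1024**2
print(f"{n_dirs:,} directories, ~{total_gb:.0f} GB")
```

Even with conservative assumptions this blows well past 50GB, and ~1M directories of small files can also exhaust the inode table (hence `df -ih`).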
...
One thing we could try is to specify `output_dir` and implement some cleanup routine... but this is not ideal, and might not even work, i.e. I don't know when/if the files are used by cmdstanpy...

I don't really know what to do... We could probably try setting `TMPDIR`, but this needs to be easy to handle for users and platform-independent. Users with limited knowledge might not even know which directory to use... and in the worst case, they might not have the required space...
So `get_bayes_factor` is called in parallel, once for each ORF.
This function does a bunch of cmdstanpy stuff (which generates some files that get used by cmdstanpy) and returns a result.
You only care about the return value, not the generated files.
Is that correct?
If so, it seems like it should be fine to set `output_dir` within `get_bayes_factor` to some known tmp dir that gets cleaned up when the function returns, e.g.

```python
def get_bayes_factor(profile, translated_models, untranslated_models, args):
    with tempfile.TemporaryDirectory() as tmpdirname:
        ...
        *.sample(..., output_dir=tmpdirname)
        ...
```
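For reference, a self-contained illustration of the cleanup semantics this relies on (plain `tempfile` behaviour, no cmdstanpy involved): everything written under the context manager's directory is removed when the `with` block exits, so per-ORF sampler output never accumulates:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    # stand-in for the CSV files the sampler would write
    path = os.path.join(tmpdirname, "chain-1.csv")
    with open(path, "w") as f:
        f.write("dummy sampler output")
    assert os.path.exists(path)

# the directory and all of its contents are gone once the block exits
print(os.path.exists(tmpdirname))
```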
Does that make sense, or am I missing something?
This could work, I will try!
Thanks, this seems to work for the small example. I will try with the larger dataset before marking this issue as resolved.
Maybe I should also check `estimate-metagene-profile-bayes-factors`. Although I haven't had errors there, it could be that we were just at the limit... but the way we call cmdstanpy is different there, so I need to see how many files are actually being written...
For the logging issue, `cmdstanpy_logger.disabled = True` works fine, but I would like to keep the option to log for debugging. Unless the logger is disabled, the verbose output pollutes all log files...
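One way to expose this as an option (a sketch only; `configure_cmdstanpy_logging` is a hypothetical helper name, not part of the codebase):

```python
import logging

def configure_cmdstanpy_logging(debug: bool) -> None:
    # hypothetical helper: silence cmdstanpy's logger unless
    # debugging output was explicitly requested
    logging.getLogger("cmdstanpy").disabled = not debug

configure_cmdstanpy_logging(debug=False)
print(logging.getLogger("cmdstanpy").disabled)
```

The flag can then simply be wired to a command-line argument.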
For `estimate_metagene_profile_bayes_factors.py`, profiles are grouped by length, so e.g. if we have 35 lengths, 4 models, and profiles of length 21, this makes ~3,000 /tmp files, and sampling is quicker, which is why we do not see any significant disk and/or load footprint. I would leave it as is.
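The count above works out as follows (numbers taken from the sentence above; the multiplication is an assumption about how the files scale):

```python
n_lengths = 35       # read lengths
n_models = 4
profile_length = 21  # positions per profile

approx_tmp_files = n_lengths * n_models * profile_length
print(approx_tmp_files)
```

At ~3,000 files this is three orders of magnitude below the ~1M directories seen in the per-ORF case.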
For the logging issue, I just left it off by default and added a flag to turn it on for debugging. I'm not sure why, but in `estimate_metagene_profile_bayes_factors.py` this has to be done inside `estimate_profile_bayes_factors`; if I leave it in the main function, it doesn't seem to work...
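A plausible explanation (an assumption on my part, not verified against the codebase): if the parallel workers are started with the "spawn" start method rather than "fork", logger state set in the parent process is not inherited by the children, so the logger has to be disabled inside the worker function itself. A minimal sketch:

```python
import logging
import multiprocessing as mp

def worker(_):
    # assumption: the disable must happen here, inside the worker process,
    # because spawned children do not inherit the parent's logger state
    logging.getLogger("cmdstanpy").disabled = True
    return logging.getLogger("cmdstanpy").disabled

if __name__ == "__main__":
    # disabling in the parent alone would not reach spawned workers
    logging.getLogger("cmdstanpy").disabled = True
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(all(pool.map(worker, range(2))))
```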
Description

So far, we have run the tests on the small c-elegans dataset. I ran the pipeline on a larger dataset, and it ran until

For another sample, the error is a little different:
I don't know whether this is related to some interaction between cmdstanpy and parallel? I doubt there is no space left on the cluster... unless maybe this is related to the number of open files...?
We also need to fix the cmdstanpy verbose output...
To Reproduce

Run the pipeline:

```shell
run-all-rpbp-instances ...
```
Environment