JEFworks-Lab / STalign

Python tool for alignment of spatial transcriptomics (ST) data using diffeomorphic metric mapping
https://jef.works/STalign/
GNU General Public License v3.0

Out of memory error #26

Closed (ebolotin02 closed this issue 6 months ago)

ebolotin02 commented 8 months ago

In the MERFISH tutorial, STalign runs out of RAM and crashes. I have a Linux machine with 500 GB of RAM running Ubuntu 22. I am installing STalign in a conda environment like this:

conda create -p "/root/tools/st_align_contain" python=3.9
conda activate "/root/tools/st_align_contain"
pip install --upgrade "git+https://github.com/JEFworks-Lab/STalign.git"

Starting the environment:

python: /opt/mamba/envs/stalign_conda_v3/bin/python
libpython: /opt/mamba/envs/stalign_conda_v3/lib/libpython3.10.so
pythonhome: /opt/mamba/envs/stalign_conda_v3:/opt/mamba/envs/stalign_conda_v3
version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0]
numpy: /opt/mamba/envs/stalign_conda_v3/lib/python3.10/site-packages/numpy
numpy_version: 1.23.4

NOTE: Python version was forced by use_python function

Listing packages:

Here is a list of packages in the environment:

# packages in environment at /root/tools/st_align_contain:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
alabaster 0.7.13 pypi_0 pypi
attrs 23.2.0 pypi_0 pypi
autodocsumm 0.2.10 pypi_0 pypi
babel 2.12.1 pypi_0 pypi
beautifulsoup4 4.12.2 pypi_0 pypi
bleach 6.1.0 pypi_0 pypi
bzip2 1.0.8 hd590300_5 conda-forge
c-ares 1.24.0 hd590300_0 conda-forge
ca-certificates 2023.11.17 hbcca054_0 conda-forge
certifi 2023.11.17 pypi_0 pypi
charset-normalizer 2.1.1 pypi_0 pypi
cmake 3.28.1 pypi_0 pypi
contourpy 1.0.7 pypi_0 pypi
curl 8.5.0 hca28451_0 conda-forge
cycler 0.11.0 pypi_0 pypi
defusedxml 0.7.1 pypi_0 pypi
docopt 0.6.2 pypi_0 pypi
docutils 0.18.1 pypi_0 pypi
et-xmlfile 1.1.0 pypi_0 pypi
fastjsonschema 2.19.1 pypi_0 pypi
filelock 3.11.0 pypi_0 pypi
fonttools 4.39.3 pypi_0 pypi
gettext 0.21.1 h27087fc_0 conda-forge
git 2.43.0 pl5321h7bc287a_0 conda-forge
idna 3.6 pypi_0 pypi
imageio 2.33.1 pypi_0 pypi
imagesize 1.4.1 pypi_0 pypi
importlib-metadata 7.0.1 pypi_0 pypi
importlib-resources 6.1.1 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
jsonschema 4.20.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
jupyter-client 8.6.0 pypi_0 pypi
jupyter-core 5.7.0 pypi_0 pypi
jupyterlab-pygments 0.3.0 pypi_0 pypi
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.4 pypi_0 pypi
krb5 1.21.2 h659d440_0 conda-forge
lazy-loader 0.3 pypi_0 pypi
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libcurl 8.5.0 hca28451_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 hd590300_2 conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.2.0 h807b86a_3 conda-forge
libgomp 13.2.0 h807b86a_3 conda-forge
libiconv 1.17 hd590300_2 conda-forge
libnghttp2 1.58.0 h47da74e_1 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libsqlite 3.44.2 h2797004_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_3 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
lit 17.0.6 pypi_0 pypi
markdown-it-py 2.2.0 pypi_0 pypi
markupsafe 2.1.3 pypi_0 pypi
matplotlib 3.8.2 pypi_0 pypi
mdit-py-plugins 0.3.5 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mistune 3.0.2 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
myst-parser 1.0.0 pypi_0 pypi
nbclient 0.9.0 pypi_0 pypi
nbconvert 7.14.0 pypi_0 pypi
nbformat 5.9.2 pypi_0 pypi
nbsphinx 0.9.1 pypi_0 pypi
ncurses 6.4 h59595ed_2 conda-forge
networkx 3.1 pypi_0 pypi
nptyping 2.5.0 pypi_0 pypi
numpy 1.23.4 pypi_0 pypi
nvidia-cublas-cu11 11.10.3.66 pypi_0 pypi
nvidia-cuda-cupti-cu11 11.7.101 pypi_0 pypi
nvidia-cuda-nvrtc-cu11 11.7.99 pypi_0 pypi
nvidia-cuda-runtime-cu11 11.7.99 pypi_0 pypi
nvidia-cudnn-cu11 8.5.0.96 pypi_0 pypi
nvidia-cufft-cu11 10.9.0.58 pypi_0 pypi
nvidia-curand-cu11 10.2.10.91 pypi_0 pypi
nvidia-cusolver-cu11 11.4.0.1 pypi_0 pypi
nvidia-cusparse-cu11 11.7.4.91 pypi_0 pypi
nvidia-nccl-cu11 2.14.3 pypi_0 pypi
nvidia-nvtx-cu11 11.7.91 pypi_0 pypi
openpyxl 3.1.1 pypi_0 pypi
openssl 3.2.0 hd590300_1 conda-forge
packaging 23.2 pypi_0 pypi
pandas 2.0.0 pypi_0 pypi
pandocfilters 1.5.0 pypi_0 pypi
pcre2 10.42 hcad00b1_0 conda-forge
perl 5.32.1 7_hd590300_perl5 conda-forge
pillow 9.5.0 pypi_0 pypi
pims 0.3.0 pypi_0 pypi
pip 23.3.2 pyhd8ed1ab_0 conda-forge
pipreqs 0.4.13 pypi_0 pypi
platformdirs 4.1.0 pypi_0 pypi
plotly 5.14.1 pypi_0 pypi
pygments 2.15.0 pypi_0 pypi
pynrrd 1.0.0 pypi_0 pypi
pypandoc 1.11 pypi_0 pypi
pyparsing 3.1.1 pypi_0 pypi
python 3.9.18 h0755675_1_cpython conda-forge
python-dateutil 2.8.2 pypi_0 pypi
pytz 2023.3 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 25.1.2 pypi_0 pypi
readline 8.2 h8228510_1 conda-forge
readthedocs-sphinx-search 0.3.1 pypi_0 pypi
referencing 0.32.0 pypi_0 pypi
requests 2.28.1 pypi_0 pypi
rpds-py 0.16.2 pypi_0 pypi
scikit-image 0.22.0 pypi_0 pypi
scipy 1.11.4 pypi_0 pypi
setuptools 69.0.3 pyhd8ed1ab_0 conda-forge
six 1.16.0 pypi_0 pypi
snowballstemmer 2.2.0 pypi_0 pypi
soupsieve 2.5 pypi_0 pypi
sphinx 6.1.3 pypi_0 pypi
sphinx-rtd-theme 1.2.0 pypi_0 pypi
sphinxcontrib-applehelp 1.0.4 pypi_0 pypi
sphinxcontrib-devhelp 1.0.2 pypi_0 pypi
sphinxcontrib-htmlhelp 2.0.1 pypi_0 pypi
sphinxcontrib-jquery 4.1 pypi_0 pypi
sphinxcontrib-jsmath 1.0.1 pypi_0 pypi
sphinxcontrib-qthelp 1.0.3 pypi_0 pypi
sphinxcontrib-serializinghtml 1.1.5 pypi_0 pypi
stalign 1.0 pypi_0 pypi
sympy 1.11.1 pypi_0 pypi
tenacity 8.2.2 pypi_0 pypi
tifffile 2023.12.9 pypi_0 pypi
tinycss2 1.2.1 pypi_0 pypi
tk 8.6.13 noxft_h4845f30_101 conda-forge
torch 2.0.0 pypi_0 pypi
tornado 6.2 pypi_0 pypi
traitlets 5.14.1 pypi_0 pypi
triton 2.0.0 pypi_0 pypi
typing-extensions 4.9.0 pypi_0 pypi
tzdata 2023.4 pypi_0 pypi
urllib3 1.26.18 pypi_0 pypi
webencodings 0.5.1 pypi_0 pypi
wheel 0.42.0 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yarg 0.1.9 pypi_0 pypi
zipp 3.17.0 pypi_0 pypi
zstd 1.5.5 hfc55251_0 conda-forge

Error message:

/root/tools/st_align_contain/lib/python3.9/site-packages/STalign/STalign.py:1043: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  L = torch.tensor(L,device=device,dtype=dtype,requires_grad=True)
/root/tools/st_align_contain/lib/python3.9/site-packages/STalign/STalign.py:1044: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  T = torch.tensor(T,device=device,dtype=dtype,requires_grad=True)
/root/tools/st_align_contain/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]

ebolotin02 commented 8 months ago

Forgot to mention, rasterize works fine. The crash happens here:

out = STalign.LDDMM([xI,yI],I,[xJ,yJ],J,device=device,niter=1000,epV=50)

JEFworks commented 8 months ago

Dear Eugene,

Thanks for sharing your session info. Since the rasterization worked fine, you have enough memory to store the tensor. So my suspicion is that this is related to this particular installation of torch, and perhaps there is a memory leak in the iterative gradient descent. To check this, could you please try running the LDDMM alignment with a very small number of iterations? For example:

out = STalign.LDDMM([xI,yI],I,[xJ,yJ],J,device=device,niter=5,epV=50)

If it no longer crashes, then we've confirmed there's a memory leak.
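To make that check more quantitative, one option is to record the process's peak memory after each of a few short runs and see whether it keeps climbing. The following is only a minimal sketch using the standard-library `resource` module (Unix only); the helper name `track_peak_rss` is hypothetical and not part of STalign, and the demo workload is a stand-in that deliberately holds on to memory.

```python
import resource
import sys


def track_peak_rss(step, n_iter):
    """Call `step` n_iter times; return the peak RSS (MiB) seen after each call."""
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    scale = 1024 * 1024 if sys.platform == "darwin" else 1024
    peaks = []
    for _ in range(n_iter):
        step()
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        peaks.append(peak / scale)
    return peaks


if __name__ == "__main__":
    # Stand-in workload (an assumption for illustration): keeps 10 MiB per call alive.
    held = []
    peaks = track_peak_rss(lambda: held.append(bytearray(b"x" * (10 * 1024 * 1024))), 5)
    for i, p in enumerate(peaks, 1):
        print(f"call {i}: peak RSS {p:.1f} MiB")

    # With STalign, the step could instead be a short alignment, e.g.
    # lambda: STalign.LDDMM([xI,yI],I,[xJ,yJ],J,device=device,niter=5,epV=50)
```

Because `ru_maxrss` is a high-water mark, the values never decrease; a peak that rises by a similar amount on every call would be consistent with memory not being released between iterations, while a peak that levels off after the first call would point elsewhere.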

Another potential solution is to switch to a different version of torch, as these errors can be machine- or version-specific (https://discuss.pytorch.org/t/tips-tricks-on-finding-cpu-memory-leaks/115971/3), though I am also using torch 2.0.0 and am not able to reproduce this on my machine with Python 3.9.10.
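When comparing versions across machines, it may help to print them from inside the Python session that actually runs the alignment, since the report above lists an interpreter under /opt/mamba/envs/stalign_conda_v3 and a package list from /root/tools/st_align_contain. A small sketch using only the standard library follows; the helper name `report_versions` is hypothetical, and the distribution names are taken from the package list above.

```python
import platform
import sys
from importlib import metadata


def report_versions(packages=("torch", "numpy", "stalign")):
    """Return the interpreter path, Python version, and installed package versions."""
    info = {
        "executable": sys.executable,
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the environment of this interpreter.
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in report_versions().items():
        print(f"{key}: {value}")
```

A value of `None` means the package is not visible to the running interpreter, which would suggest the session is using a different environment from the one where STalign was installed.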

Best, Jean

ebolotin02 commented 8 months ago

Hi Jean, somehow the memory leak was resolved. Thank you.