simonzhang00 / ripser-plusplus

Ripser++: GPU-accelerated computation of Vietoris–Rips persistence barcodes
MIT License
101 stars · 15 forks

error: command 'cmake' failed with exit status 2 #9

Open IbtihalFerwana opened 3 years ago

IbtihalFerwana commented 3 years ago

I used this command: `!pip3 install ripserplusplus`, and I received this error: `error: command 'cmake' failed with exit status 2`

CUDA = 7.6.5_10.2, GCC = 4.8.5

simonzhang00 commented 3 years ago

Where are you running this?

IbtihalFerwana commented 3 years ago

on this: https://wiki.ncsa.illinois.edu/display/ISL20/HAL+cluster

simonzhang00 commented 3 years ago

This looks like an issue to raise with the HAL cluster maintainers. In my experience, try `module load cmake`. You can also try setting up a fresh conda environment with Python >= 3.6.

btw, you can use `pip3 install -vvv ripserplusplus` to see the full error messages.

IbtihalFerwana commented 3 years ago

Ok, thanks. `module load cmake` did not work.

IbtihalFerwana commented 3 years ago

Hello, what else could cause the library to fail to complete processing? The process is "Killed". Any thoughts?

simonzhang00 commented 3 years ago

Most likely out of memory on the CPU side. What kind of data are you running on?

IbtihalFerwana commented 3 years ago

The data is huge: a distance matrix of size ~66K x ~66K.

Any thoughts on that?
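[A quick back-of-the-envelope calculation makes the "Killed" outcome unsurprising: a dense matrix that size does not fit in typical RAM.]

```python
# Approximate memory footprint of a dense 66,000 x 66,000 distance matrix
# stored as numpy float64 (the default dtype).
n = 66_000
bytes_per_entry = 8  # float64
total_bytes = n * n * bytes_per_entry
print(f"~{total_bytes / 1024**3:.1f} GiB")  # ~32.5 GiB
```

That is before any workspace the reduction algorithm itself needs, so sparsification (or at least float32) is essentially mandatory at this scale.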

simonzhang00 commented 3 years ago

You might want to sparsify the matrix: https://ripser.scikit-tda.org/en/latest/notebooks/Approximate%20Sparse%20Filtrations.html ; after forming the COO matrix, you should probably try the `--sparse` option.
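[As a minimal sketch of the idea, the snippet below thresholds a dense distance matrix into a scipy COO matrix. This is a cruder scheme than the approximate sparse filtration in the linked tutorial, and the helper name `threshold_to_coo` and the `eps` value are illustrative, not from the library.]

```python
import numpy as np
from scipy import sparse

def threshold_to_coo(D, eps):
    """Keep only off-diagonal distances <= eps as a COO matrix.
    (A crude sketch; the linked tutorial uses a more principled filtration.)"""
    D = np.asarray(D, dtype=np.float32)
    mask = D <= eps
    np.fill_diagonal(mask, False)  # drop the zero diagonal
    rows, cols = np.nonzero(mask)
    return sparse.coo_matrix((D[rows, cols], (rows, cols)), shape=D.shape)

# Toy example: pairwise distances of 5 random points in the unit square.
rng = np.random.default_rng(0)
pts = rng.random((5, 2))
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
DSparse = threshold_to_coo(D, eps=0.5)
print(DSparse.nnz, "entries kept out of", D.size)
```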

IbtihalFerwana commented 3 years ago

Thanks. I tried the sparsifying with ripser-plusplus on the cluster and received this: `Segmentation fault (core dumped)`. Any thoughts? Does that mean the memory issue still exists?

I'm now running it with ripser alone and will see.

simonzhang00 commented 3 years ago

Please see the following gist that shows how to use the distance matrix sparsification algorithm with ripserplusplus:

https://colab.research.google.com/gist/simonzhang00/5b34155b41edc27aa5e47100bda1b2a5/ripserplusplus-distancematrix-sparsification.ipynb

This shouldn't be that hard to do yourself ;)

IbtihalFerwana commented 3 years ago

Thanks,

That did not work for me; generating the sparse matrix alone returns `Segmentation fault (core dumped)` after killing the process.

I reduced the data to a very small size (which works fine with ripser), but the processing still fails and is killed with ripserplusplus.

Here, `rpp_py.run("--format point-cloud --dim 2", A)`: are there other formatting options I might try for the smaller dataset?

simonzhang00 commented 3 years ago

Please look at the gist, and notice the line: `resultsparse = rpp_py.run("--format sparse", DSparse)`. This is how you can read COO matrices into ripserplusplus. `resultsparse = rpp_py.run("--format sparse --sparse", DSparse)` should run faster. Please read the README.md documentation for ripserplusplus, especially the "The ripserplusplus Python API" section.
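[Putting the pieces together, a sketch of the workflow: build a COO matrix with scipy, then hand it to `rpp_py.run` with `--format sparse`. The import is guarded because ripserplusplus needs a CUDA-capable GPU; the tiny matrix and `--dim 1` are illustrative choices, not from the thread.]

```python
import numpy as np
from scipy import sparse

# Toy dense distance matrix -> COO, the input the sparse reader expects.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.5],
              [2.0, 1.5, 0.0]], dtype=np.float32)
# Upper triangle only (no diagonal); coo_matrix keeps just the nonzeros.
DSparse = sparse.coo_matrix(np.triu(D, k=1))

try:
    import ripserplusplus as rpp_py  # requires a CUDA GPU at runtime
    result = rpp_py.run("--format sparse --sparse --dim 1", DSparse)
    print(result)
except ImportError:
    print("ripserplusplus not installed; DSparse has", DSparse.nnz, "entries")
```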

IbtihalFerwana commented 3 years ago

Yes, I followed the gist's Colab steps; my processing stops at `DSparse = getApproxSparseDM(lambdas, eps, D)` and returns this message: `Segmentation fault (core dumped)`.

simonzhang00 commented 3 years ago

What does that have to do with ripserplusplus? It appears you are having system trouble with your cluster, which I am not responsible for. Why not just work in Colab? You are welcome to ask questions.

IbtihalFerwana commented 3 years ago

I would need Colab Pro; the RAM is being fully occupied. I'm not familiar with that, so if you have any recommendations I can try them.

simonzhang00 commented 3 years ago

Go to the menu bar and click Runtime -> Change runtime type -> set Runtime shape to Standard. You shouldn't need Colab Pro unless you need to train for longer times or run multiple instances.

IbtihalFerwana commented 3 years ago

Thanks, I really appreciate your help and hope to get it working.

Now, when I open Google Colab and follow your code with a Standard runtime, I receive this error after pip install: `ERROR: Failed building wheel for ripserplusplus`.

Any recommendations?

This is the full output:

```
Processing /content/ripser-plusplus
Requirement already satisfied: cmake in /usr/local/lib/python3.7/dist-packages (from ripserplusplus==1.1.1) (3.12.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from ripserplusplus==1.1.1) (1.19.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from ripserplusplus==1.1.1) (1.4.1)
Building wheels for collected packages: ripserplusplus
  Building wheel for ripserplusplus (setup.py) ... error
  ERROR: Failed building wheel for ripserplusplus
  Running setup.py clean for ripserplusplus
Failed to build ripserplusplus
Installing collected packages: ripserplusplus
    Running setup.py install for ripserplusplus ... error
    ERROR: Command errored out with exit status 1: /usr/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-se8novv6/setup.py'"'"'; file='"'"'/tmp/pip-req-build-se8novv6/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-0jkwj2gq/install-record.txt --single-version-externally-managed --compile Check the logs for full command output.
```

IbtihalFerwana commented 3 years ago

When I change it to GPU, the installation succeeds, but the RAM gets fully occupied and the processing fails.

simonzhang00 commented 3 years ago

After enough sparsification (a large enough epsilon) there should rarely be memory issues; usually you would only run out of RAM after hours of computation. Do not forget to use the `--sparse` option.

IbtihalFerwana commented 3 years ago

Thanks @simonzhang00. I realized the main memory issue comes from the way I construct the distance matrix from my data: before the sparsification step even runs, building my large ndarray in Colab crashes the memory.
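[One way around building the full dense ndarray, sketched below: compute distances chunk by chunk and keep only entries under a threshold, so the dense n x n array is never materialized. The helper name `chunked_sparse_distances` and the `eps`/`chunk` values are illustrative, not from the thread.]

```python
import numpy as np
from scipy import sparse

def chunked_sparse_distances(X, eps, chunk=1024):
    """Build a thresholded COO distance matrix for point cloud X without
    ever allocating the full dense n x n array (a sketch; tune eps/chunk)."""
    n = X.shape[0]
    rows, cols, vals = [], [], []
    for start in range(0, n, chunk):
        block = X[start:start + chunk]
        # Distances from this chunk to all points: shape (chunk, n).
        d = np.linalg.norm(block[:, None, :] - X[None, :, :], axis=-1)
        r, c = np.nonzero((d <= eps) & (d > 0))  # keep small, nonzero dists
        rows.append(r + start)
        cols.append(c)
        vals.append(d[r, c])
    return sparse.coo_matrix(
        (np.concatenate(vals), (np.concatenate(rows), np.concatenate(cols))),
        shape=(n, n))

# Toy run: 200 random 3D points, threshold 0.3.
X = np.random.default_rng(1).random((200, 3)).astype(np.float32)
S = chunked_sparse_distances(X, eps=0.3)
print(S.nnz, "sparse entries instead of", X.shape[0] ** 2)
```

The resulting COO matrix can then go straight into the `--format sparse` path discussed above, and for 66K points the peak memory is one chunk-by-n block rather than the whole matrix.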