andreasMazur / geoconv

A Python library for end-to-end learning on surfaces. It implements pre-processing functions that include geodesic algorithms, neural network layers that operate on surfaces, visualization tools and benchmarking functionalities.
GNU General Public License v3.0

Normalizing meshes takes a long time (8.87 it/s) #1

Closed pieris98 closed 1 year ago

pieris98 commented 1 year ago

Hey Andreas, Thank you so much for implementing the paper in code! I'm trying to run the training_demo for FAUST. The "normalizing meshes" step in preprocess_faust.py takes about 14 minutes per mesh .ply file. I was wondering if I'm doing something wrong, since the code is not running on the GPU (possibly related to this protobuf Python/C++ implementation issue?), or if this is just a normal normalization duration with the code running on the CPU, so I just need to be patient :)

By the way, the environment didn't work with the provided instructions. I got it to work with the following steps (I can create a PR if you want to update the README):

  1. Downloaded the cuda-nvcc, cuda-cupti, cuda-libraries, and cuda-nvtx packages from the nvidia conda channel.
  2. Set the protocol flag export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python, since I'm running the newer, incompatible protobuf version 4.21, which was installed automatically with this conda env and python=3.10.11 (see the short Python sketch after this list).
  3. Ran pip install pyshot==0.0.2, since it wasn't included in requirements.txt (an easy fix: just add it to the requirements file).
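A minimal sketch of the protobuf workaround from step 2, applied inside Python rather than the shell (an assumption about usage, not GeoConv code; the variable only takes effect if it is set before TensorFlow, and therefore protobuf, is imported):

# Workaround sketch (assumption, not GeoConv code): select the pure-Python protobuf
# implementation before TensorFlow (and thus protobuf) is imported.
import os
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"

import tensorflow as tf  # import only after the variable has been set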

Thank you!

pieris98 commented 1 year ago

Update: Even after downgrading to protobuf==3.20.3 (TensorFlow 2.12, which this project uses, needs protobuf>=3.20.3) and installing libatlas-base-dev, the mesh normalization step still takes long (~8-9 minutes per mesh * 100 training registration meshes ≈ 14 hours). I also put the new protobuf version in my requirements.txt, and I could upload the preprocessed zip in a PR once I have it. Here's the error I got before downgrading solved it: `TypeError: Descriptors cannot not be created directly. If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0. If you cannot immediately regenerate your protos, some other possible workarounds are:

  1. Downgrade the protobuf package to 3.20.x or lower.
  2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates`

andreasMazur commented 1 year ago

Hi Pieris, First of all, thank you for trying out GeoConv! I am always glad to get feedback. Let me address your issues:

[...] I'm trying to run the training_demo for FAUST. The "normalizing meshes" step in preprocess_faust.py takes about 14 minutes per mesh .ply file. I was wondering if I'm doing something wrong [...]

Normalizing the meshes includes calculating geodesic diameters. GeoConv does this by calculating geodesic distances between all pairs of vertices of each triangle mesh, using a library called pygeodesic. To the best of my knowledge, this library does not provide any functionality for outsourcing the computation to a GPU. So it simply takes a while.
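For illustration, a rough sketch of what such an all-pairs computation looks like with pygeodesic (this is not GeoConv's exact routine; the file name is a placeholder and trimesh is only assumed here for loading the .ply):

# Rough sketch (not GeoConv's exact routine): estimate the geodesic diameter of a mesh
# by solving an exact geodesic problem from every vertex and keeping the largest distance.
import numpy as np
import trimesh                                  # assumed here only for loading the .ply
import pygeodesic.geodesic as geodesic

mesh = trimesh.load("tr_reg_000.ply")           # placeholder FAUST registration mesh
points = np.asarray(mesh.vertices)
faces = np.asarray(mesh.faces)
solver = geodesic.PygeodesicAlgorithmExact(points, faces)

diameter = 0.0
for source in range(points.shape[0]):           # one exact geodesic solve per vertex
    distances, _ = solver.geodesicDistances(np.array([source]))
    diameter = max(diameter, distances.max())
print("Geodesic diameter:", diameter)

This brute-force loop is why normalization scales poorly with the number of vertices, and why computing the diameters once and re-using them saves so much time.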

To spare you from computing these geodesic distances yourself, and thereby reduce the time that's necessary to pre-process the meshes, I've provided geodesic_diameters.npy. The (new) README.md in the mpi-faust examples folder shows how to make use of it.

[...] By the way, the environment didn't work with the provided instructions [...]

I have updated the install instructions and tested them with a completely new conda environment. They should work now.

[...] Run pip install pyshot==0.0.2 since it wasn't included in requirements.txt [...]

The pyshot package you get via pip install pyshot==0.0.2 (from PyPI) is not the one that is required. Thank you for the hint; I emphasized that in the new README.md in the mpi-faust examples.

Even after downgrading to protobuf==3.20.3 (TensorFlow 2.12 used in this project needs protobuf>=3.20.3) and installing libatlas-base-dev, the mesh normalization step still takes long

See geodesic_diameters.npy and the new README.md in the mpi-faust examples.

[...] I also put the new protobuf version in my requirements.txt [...]

Putting the required protobuf version into requirements.txt causes a dependency error, which is why I list it as a separate installation step. I admit the current installation instructions are a bit lengthy; I will look into simplifying them as soon as I find the time. Until then, the new instructions should do the trick.

[...] could upload the preprocessed zip in a PR once I have it [...]

That's a very kind offer! However, I am not allowed to redistribute the dataset in any form. What I can do is provide the pre-processing pipeline and the necessary pre-processing values, e.g. geodesic_diameters.npy, to make it as fast as possible.

Nevertheless, keep in mind that pre-processing the meshes remains a very involved process. If you are planning experiments with meshes, allow enough time for it. If you have access to a GPU cluster or any other form of compute cluster, use it.

Once you have pre-processed the data, you will not need to pre-process it again. You will see a zip file that contains all pre-processed data; training_demo.py, for example, will recognize it and work with it if the paths are set correctly. However, if you plan on working with different data or want to change pre-processing hyperparameters, you will need to pre-process the meshes once more.

All in all, thank you for your feedback. I hope this helps with the issues you raised.

With best regards, Andreas

pieris98 commented 1 year ago

Thank you for your fast response, @andreasMazur! I figured out that pyshot was causing the problem; here are the commands that made it possible to build pyshot in my conda env:

#BEFORE pip install -r requirements.txt
conda install flann -c conda-forge
conda install -c anaconda lz4
conda install eigenpy -c conda-forge
apt install libeigen3-dev

And in requirements.txt:

...(other packages)
pyshot @ git+https://github.com/uhlmanngroup/pyshot@18ff0f2

It's indeed quite the headache to produce the preprocessed.zip for all 100 FAUST training meshes. I got the preprocess_faust.py pipeline to work, but I simply don't have the disk space (~724 MB per GPC_tr_reg_xxx.npy, so about 70 GB before creating the preprocessed zip) or the time (it would take days to complete on my 12-core Intel Alder Lake CPU) to wait for the pre-processing to finish, let alone the training and evaluation wall-clock time after that.

However, I'm still trying to wrap my head around the evaluation process, specifically the Princeton benchmark code you provided in measures.py, and would appreciate your help with that. Is it inspired by code from an existing library? Which mesh should I provide as the reference mesh path in the geoconv case? Does it only accept an intrinsic mesh CNN (in the imcnn argument), or do other models work too? I'm doing this for a course project, so I've switched to reproducing a similar paper's (MoNet) results for now, but I still need to implement the Princeton evaluation that all of Bronstein's papers use.

andreasMazur commented 1 year ago

[...] I figured out that pyshot was causing the problem, here are the lines that made it possible to build pyshot in my conda env [...]

I have updated the mpi-faust example README.md to emphasize the dependencies of pyshot.

[...] I'm still trying to wrap my head around the evaluation process [...] princeton benchmark code you provided in measures.py [...]

The Princeton benchmark is a widely used benchmark for the shape correspondence problem. If you want to learn more about it, I suggest you take a look at Section 8.2 of the following paper:

Kim, Vladimir G., Yaron Lipman, and Thomas Funkhouser. "Blended intrinsic maps." ACM Transactions on Graphics (TOG) 30.4 (2011): 1-12.

If you want to learn about the shape correspondence problem, take a look into the following survey:

Van Kaick, Oliver, et al. "A survey on shape correspondence." Computer Graphics Forum 30.6 (2011).

Generally speaking, the Princeton benchmark depicts the accuracy (y-axis) of predicted point correspondences for gradually increasing geodesic error thresholds (x-axis).

GeoConv's implementation of it (princeton_benchmark) expects a test set that provides query meshes, a trained IMCNN, and the path to your reference mesh. The IMCNN takes the query meshes and, for each point of a query mesh, predicts a point on the reference mesh; these are the predicted point correspondences. Geodesic errors are then calculated on the reference mesh between the ground-truth correspondences and the predicted ones.

Say we deem a predicted vertex to be correct if it is closer than a geodesic error of x to the ground-truth vertex (setting the threshold). Let N be the total number of vertices and n the number of correct predictions; the plot then shows n / N (y-axis) at x (x-axis).
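As a rough illustration of that computation (this is not GeoConv's princeton_benchmark; the arrays named below, in particular a precomputed geodesic distance matrix of the reference mesh, are assumed inputs):

# Minimal sketch (not GeoConv's implementation) of the Princeton-style accuracy curve.
# Assumed inputs:
#   ref_geodesics: (M, M) geodesic distance matrix on the reference mesh
#   predictions:   (N,) predicted reference-mesh vertex index per query vertex
#   ground_truth:  (N,) ground-truth reference-mesh vertex index per query vertex
import numpy as np

def princeton_curve(ref_geodesics, predictions, ground_truth, thresholds):
    # Geodesic error of each prediction, measured on the reference mesh.
    errors = ref_geodesics[predictions, ground_truth]
    # For every threshold x: fraction n / N of predictions with error <= x.
    return np.array([(errors <= x).mean() for x in thresholds])

# Example: accuracies = princeton_curve(ref_geodesics, predictions, ground_truth,
#                                       np.linspace(0.0, 0.3, 100))
# Plotting thresholds (x-axis) against accuracies (y-axis) yields the benchmark curve.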

Is that inspired by a library code?

While writing the code, I only focused on implementing the theory correctly. For this, I did not consider the code of any other library.

Which mesh should I provide as a reference mesh path in the geoconv case?

Take the one that you referred to when calling the training demo (reference_mesh_path).

Does it only accept intrinsic mesh CNN (in the imcnn argument) or do other modes work too?

As princeton_benchmark is written, it only accepts the IMCNNs and datasets from GeoConv. If you use other people's code, a good starting point would be to adjust princeton_benchmark's loop over the dataset and the way the data is fed into the model.

I'm doing this for a course project, so I've switched to reproducing a similar paper's (MoNet) results

Be aware that MoNet applies further post-processing to the network's output with Bayesian filters ('CTRL+F: Bayesian filter' in the paper). This is useful because the IMCNN's predictions can be a bit noisy: outliers can simply be "averaged out" if all their neighbors are predicted correctly.
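Purely to illustrate that "averaging out" idea, here is a toy sketch (this is not MoNet's Bayesian filter and not GeoConv code; the neighborhood lists and the reference geodesic matrix are assumed inputs):

# Toy sketch (not MoNet's Bayesian filter): treat a prediction as an outlier when it lies
# geodesically far from what its one-ring neighbors predict on the reference mesh, and
# replace it by the most central neighbor prediction.
import numpy as np

def smooth_outliers(predictions, neighbors, ref_geodesics, threshold):
    # predictions:   (N,) predicted reference-vertex index per query vertex
    # neighbors:     list of one-ring neighbor index lists, one per query vertex
    # ref_geodesics: (M, M) geodesic distance matrix on the reference mesh
    smoothed = predictions.copy()
    for vertex, ring in enumerate(neighbors):
        if len(ring) == 0:
            continue
        ring_predictions = predictions[ring]
        # Average geodesic distance between this prediction and the neighbors' predictions.
        if ref_geodesics[predictions[vertex], ring_predictions].mean() > threshold:
            # Pick the neighbor prediction with the smallest total distance to the others.
            central = np.argmin(ref_geodesics[np.ix_(ring_predictions, ring_predictions)].sum(axis=1))
            smoothed[vertex] = ring_predictions[central]
    return smoothed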

I hope I could help you!

Best regards, Andreas