drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"
MIT License
508 stars 65 forks source link

Install script makes seemingly incorrect assumptions about PGEOF's environment #102

Closed gspr-sintef closed 1 week ago

gspr-sintef commented 2 months ago

In an effort to see if issue #101 is caused by incompatible Python versions or not, I have set up a clean environment that uses the included install.sh verbatim. Running python src/train.py experiment=semantic/kitti360 then results in

Traceback (most recent call last):
  File "src/train.py", line 49, in <module>
    from src import utils
  File "/home/somebody/superpoint-transformer/src/__init__.py", line 3, in <module>
    import src.datasets
  File "/home/somebody/superpoint-transformer/src/datasets/__init__.py", line 2, in <module>
    from .base import *
  File "/home/somebody/superpoint-transformer/src/datasets/base.py", line 20, in <module>
    from src.transforms import NAGSelectByKey, NAGRemoveKeys, SampleXYTiling, \
  File "/home/somebody/superpoint-transformer/src/transforms/__init__.py", line 7, in <module>
    from .point import *
  File "/home/somebody/superpoint-transformer/src/transforms/point.py", line 4, in <module>
    from pgeof import pgeof
  File "/home/somebody/miniconda3/envs/spt/lib/python3.8/site-packages/pgeof/__init__.py", line 1, in <module>
    from .pgeof_ext import (
ImportError: /home/somebody/miniconda3/envs/spt/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /home/somebody/miniconda3/envs/spt/lib/python3.8/site-packages/pgeof/pgeof_ext.cpython-38-x86_64-linux-gnu.so)

I'm using Miniconda 23.3.1 in a clean environment with Python 3.8.19, running the provided install script as is. The script runs successfully. I've tried both the current latest commit (bb31689b1bf76d62112dc4d2094cb82dae3773ac) and one from before the latest PGEOF changes (6b11e8e1cfc07a54dc34442e9948ed88fe1126af), both with the same result.

Rebuilding PGEOF from source results in the same problem.

gspr-sintef commented 2 months ago

It seems that PGEOF's (terribly opaque) build system uses the system's libstdc++ without consideration of the conda environment. This is of course a PGEOF issue, but probably also an SPT issue since its install script makes assumptions about PGEOF's build system.

gspr-sintef commented 2 months ago

Workaround for people facing the same issue: Separately build PGEOF in an environment with a sufficiently old libstdc++ (i.e. one at least as old as whatever is provided in the conda environment used by SPT), and change SPT's install script to install that wheel instead of the default one.

drprojects commented 2 months ago

Could you please create an issue on https://github.com/drprojects/point_geometric_features so we can address this over there ?

drprojects commented 2 months ago

A simpler workaround that compiling pgeof: https://github.com/rstudio/reticulate/issues/1282#issuecomment-1272473007

This is a bit hacky, but can be fixed by manually modifying where your /home/somebody/miniconda3/envs/spt/bin/../lib/libstdc++* symlinks point.

But I agree this is not a final solution, we may need to fix something on https://github.com/drprojects/point_geometric_features side...

drprojects commented 2 months ago

@rjanvier do you know how to make pgeof more aware of the conda environment's libstdc++* ?

rjanvier commented 2 months ago

I will try to reproduce it and debug this but a priori I feel weird that the scikit build system used by pgeof does not handle that out of the box...

drprojects commented 2 months ago

Thanks for giving it a try. I confirm that I could reproduce the same error on my end, when installing on a fresh conda env with our latest pgeof.

rjanvier commented 2 months ago

Confirmed too, it slip off the radar because I use SPT on a virtualenv on my side. it's ok with precompiled package (i.e.pip install pgeof), but I generated them for python 3.9+. I will try to fix this anyway.

gspr-sintef commented 2 months ago

Could you please create an issue on https://github.com/drprojects/point_geometric_features so we can address this over there ?

Yeah I hope to get around to that soon. Ideally I'd also propose a fix, but I have no knowledge on the build system used, so I found it very hard to control how it in turn invokes CMake.

gspr-sintef commented 2 months ago

I will try to reproduce it and debug this but a priori I feel weird that the scikit build system used by pgeof does not handle that out of the box...

If you are interested, I could take a stab at attempting to convert pgeof to a more flexible plain CMake-based build (rather than CMake wrapped in scikit-build)?

@drprojects : I'd just like to also say thanks for your extremely quick responses to both my recent bug reports. It's very impressive, and much appreciated!

drprojects commented 2 months ago

@gspr-sintef to be honest, although I did create thepgeof library, I would say as of late @rjanvier has become the expert regarding its compilation recipe. So I will wait a bit and see if he can find an elegant solution 😆

In the meantime if you need a dirty workaround, I managed to fix the issue with my above-mentionned hack: adjusting your conda symlinks to point to the correct libstdc++* on your systems.

PS: @gspr-sintef if you ❤️ or use this project, don't forget to give it a ⭐, it means a lot to us !

rjanvier commented 2 months ago

Hi @gspr-sintef. CMake used by sk-build-tool is pure CMake, so I guess you don't have to change anything apart in the CMakeLists.txt. I think it's flexible enough :D

After investigation I would say in some aspects this behavior is normal because nothing tells us in conda env variables that we have to pick the libs in /condenv/lib (LD_LIBRARY_PATH is not redefined for example), and you will have to handle this with any CMake based file, no matter if it's called by sk-build-tools... I have posted a question in sk-build-core issue tracker in order to ask if there are some good practices in our specific case. Meanwhile, a quick alternative is to repair the wheel with auditwheels or to install a more recent libstc++ lib in your environement from conda-forge (libstdc++ is forward compatible) . Please open an issue in pgeof issue tracker in order to continue this discussion, thanks!

gspr-sintef commented 2 months ago

Hi @gspr-sintef. CMake used by sk-build-tool is pure CMake, so I guess you don't have to change anything apart in the CMakeLists.txt. I think it's flexible enough :D

Sure, but as far as I understand, it encapsulates CMake without making the full flexibility of CMake available to the user (i.e. to the person building the package).

After investigation I would say in some aspects this behavior is normal because nothing tells us in conda env variables that we have to pick the libs in /condenv/lib (LD_LIBRARY_PATH is not redefined for example), and you will have to handle this with any CMake based file,

CMake will typically let the user (i.e. the person calling the cmake tool during build, who doesn't have to be the author of the package or of its CMakeLists.txt) specify where to look for libraries. As far as I can tell, sk-build-tools hides away this functionality.

rjanvier commented 2 months ago

of course you can pass option to encapsulated cmake, please read this https://github.com/scikit-build/scikit-build-core?tab=readme-ov-file#configuration

gspr-sintef commented 2 months ago

of course you can pass option to encapsulated cmake, please read this https://github.com/scikit-build/scikit-build-core?tab=readme-ov-file#configuration

Aha! I only saw the part that required modifying pyproject.toml (so that wouldn't be user-controllable at build-time) – I see now that sk-build does also consume a CMAKE_ARGS environment variable. Thanks for pointing that out!

biophase commented 2 months ago

A simpler workaround that compiling pgeof: rstudio/reticulate#1282 (comment)

This is a bit hacky, but can be fixed by manually modifying where your /home/somebody/miniconda3/envs/spt/bin/../lib/libstdc++* symlinks point.

But I agree this is not a final solution, we may need to fix something on https://github.com/drprojects/point_geometric_features side...

Hello, I got the same error but could not quite understand where the 'correct' version of libstdc++ should point to. For now the following solution worked for me:

conda install -c conda-forge libstdcxx-ng
drprojects commented 2 months ago

@biophase indeed the workaround you used indeed seems to the simplest solution for now.

We are waiting to see if @rjanvier's question sk-build-core issue tracker allows a better way for pgeof to compile using the libstdc++ within an activated conda environment.

Depending on this, I may or may not integrate a conda install -c conda-forge libstdcxx-ng in the install.sh script in the future.

drprojects commented 1 week ago

I decided to opt for integrating the conda-install fix in install.sh so we can close this issue: https://github.com/drprojects/superpoint_transformer/commit/d940e8bc4be8a79afb0ab0d5255693ef2c5f1df8