reymond-group / tmap

A very fast visualization library for large, high-dimensional data sets.
http://tmap.gdb.tools

Installation from pip #23

Open cthoyt opened 3 years ago

cthoyt commented 3 years ago

It looks like there are some C++ dependencies in tmap. Would it be possible to build some wheels and distribute them on PyPI? I'm personally not very happy to use conda as a crutch for installing things, as it introduces a manual installation step that works quite differently from system to system.

kr-hansen commented 2 years ago

Wanted to post a follow-up to this. What is the gap preventing this from getting put on PyPI? Is it knowledge? Would you be open to a PR adding that?

daenuprobst commented 2 years ago

Dear @kr-hansen and others, I will have time to work on the project towards the end of May for a couple of weeks. Getting it to be redistributable using pip is one of the things I want to look into.

The main issues are mostly time and getting it to build for different platforms (manylinux, macOS, Windows).

kr-hansen commented 2 years ago

Sounds good. As an FYI, I was able to get it pip installed using the following commands. I imagine the hardest part will be working out the OGDF installation.

# Pip install requirements for building ogdf
pip install cmake "pybind11[global]" ogdf-python

# Install/Compile ogdf
git clone https://github.com/ogdf/ogdf.git
mkdir ogdf/ogdf_out_of_source_build
mkdir ogdf_install
(cd ogdf/ogdf_out_of_source_build && cmake ../ -DCMAKE_INSTALL_PREFIX=/hex/ogdf_install -DCMAKE_CXX_FLAGS=-fPIC)
(cd ogdf/ogdf_out_of_source_build && cmake --build . --target install)

# Pip install TMap from current repo
pip install git+https://github.com/reymond-group/tmap.git#subdirectory=tmap

After doing that, I'm able to import tmap and use it out of the box. This is all on Linux, but I imagine it should be pretty portable once the OGDF portion is worked out.
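
A quick way to confirm the result of the steps above is to check that the module can be located by the interpreter you intend to use. This is only a minimal sketch: `is_importable` is a hypothetical helper name, and it checks that the package is discoverable, not that the compiled extension actually loads.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "tmap" is the import name used in this thread; "json" is a stdlib control.
    for name in ("tmap", "json"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

If `tmap` is reported as found but `import tmap` still fails, the problem is more likely in the compiled part (for example a missing shared library) than in the pip installation itself.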

N-Coder commented 2 years ago

I was working on getting the OGDF itself to Conda (see here), but I guess if one can get the packaging right, pip would also be a good place for distribution. I'd be happy to work with you on getting this to work and also make this an official distribution for the OGDF. Once that is done, building tmap and pulling in the OGDF as a dependency should be comparatively easy. The CMake-based build process for the OGDF should also be easy to handle. My problem is currently how to get this into a setup.py-controlled package and how to provide correct binary builds for all systems with manylinux and the like.

PS: If you are using the OGDF from Python, ogdf-python might be interesting for you.

daenuprobst commented 2 years ago

I managed to get both ogdf and tmap into a single wheel (see the development branch action artifacts). The only issue I have to work out is a SIGILL error on some Linux environments (e.g. on Google Colab). I hope to get this fixed later this week. The packages are deployed on test.pypi if anyone wants to give it a go.

Sorry for not adding links, I'm writing from the github mobile app and it's a bit of a pain 😊

daenuprobst commented 2 years ago

Update: Had to build OGDF with -DOGDF_MEMORY_MANAGER=MALLOC_TS to get it working (@N-Coder, see pyproject.toml and wheels.yml in actions to see how to get packages for manylinux, win and macos).

pip install -i https://test.pypi.org/simple/ tmap==1.0.8 now works everywhere I've tested it.

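As I've deployed it from the development branch, it already has the new API, e.g.:

import tmap as tm

# random_vectors is a helper from the original example; it is not defined in this snippet
data = random_vectors(1000, dims = 2048)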

te = tm.embed(
    data,
    layout_generator=tm.layout_generators.BuiltinLayoutGenerator(),
    keep_knn=True,
)

tm.plot(
    te,
    show=True,
    line_kws={"linestyle": "--", "color": "gray"},
    scatter_kws={"s": 5},
)

N-Coder commented 2 years ago

The SIGILL probably indicates some CPU instruction that is available on the PC you built the wheel on, but not on the cloud machines. This is because the OGDF release build uses some CPU-specific instructions and might also pass parameters like -march=native, but I'm not 100% sure about this. Changing the memory manager to a thread-safe locking one might prevent some compiler optimizations that use these instructions (or are you using any form of multi-threading?). I'll have a look into how we could support portable OGDF builds. To get the most performance, locally building the source distribution instead of using prebuilt wheels would still be better, but also a lot more set-up work, so wheels would still be nice. Thanks a lot for the references / templates!

daenuprobst commented 2 years ago

I did some more debugging on the SIGILL issue and it seems like the missing instruction set extensions are the AVX-512 ones (VL and/or F, as SIGILL is thrown on VPBROADCASTQ). This made me wonder what kind of CPUs they are running on Colab, and it turns out they are AMD EPYCs based on architectures older than Zen 4, which don't support AVX-512. So I guess a -mno-avx512f would do the trick. I'll do some building over the weekend...
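
For anyone who wants to check a target machine before installing a wheel, the diagnosis above can be approximated by looking for the `avx512f` feature flag. This is a minimal sketch that assumes an x86 Linux host exposing `/proc/cpuinfo` with a `flags` line; the helper names `cpu_flags` and `supports_avx512f` are made up for illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from /proc/cpuinfo-style text (x86 layout)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx512f(cpuinfo_text: str) -> bool:
    """Return True if the AVX-512 Foundation flag is listed."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # Linux only
    if cpuinfo.exists():
        print("avx512f supported:", supports_avx512f(cpuinfo.read_text()))
    else:
        print("no /proc/cpuinfo on this platform")
```

A `False` result on the deployment machine combined with a wheel built on an AVX-512 capable machine would be consistent with the SIGILL described here.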

daenuprobst commented 2 years ago

Quick update: Adding -mno-avx512f to the build flags fixes the SIGILL issue on the cloud services. I will check the performance impacts on TMAP next.
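
To make the flags mentioned so far easier to reuse, here is a rough sketch that assembles a CMake configure command from them. The function name `ogdf_cmake_args` and the example install prefix are hypothetical; the options themselves (`-DCMAKE_INSTALL_PREFIX`, `-DCMAKE_CXX_FLAGS=-fPIC`, `-DOGDF_MEMORY_MANAGER=MALLOC_TS`, `-mno-avx512f`) are the ones quoted in this thread. Whether the published wheels combine all of them is not stated here, so treat the defaults as an assumption.

```python
import shlex


def ogdf_cmake_args(install_prefix, memory_manager="MALLOC_TS", disable_avx512=True):
    """Build the argument list for configuring OGDF from a build subdirectory."""
    cxx_flags = ["-fPIC"]
    if disable_avx512:
        # Avoid emitting AVX-512 instructions that some cloud CPUs lack.
        cxx_flags.append("-mno-avx512f")
    return [
        "cmake",
        "..",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
        f"-DCMAKE_CXX_FLAGS={' '.join(cxx_flags)}",
        f"-DOGDF_MEMORY_MANAGER={memory_manager}",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command; the prefix is only a placeholder.
    print(shlex.join(ogdf_cmake_args("/tmp/ogdf_install")))
```

The list form can be passed directly to `subprocess.run(..., cwd=build_dir)`, which keeps the space-separated compiler flags inside a single argument.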

N-Coder commented 2 years ago

But that probably still only fixes the issues for some systems, right? So in a generic ogdf-python package I assume we should go for -march=x86-64 / aarch64 / ppc64le according to the available manylinux platforms, as this also makes the binaries generic for those targeted platforms.
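
As a sketch of what "generic per platform" could look like in a build script: the mapping below picks a baseline flag from the machine name. The concrete spellings for aarch64 (`-march=armv8-a`) and ppc64le (`-mcpu=power8`) are my assumption of the usual GCC/Clang baselines, not something settled in this thread, and `baseline_flags` is a made-up helper name.

```python
import platform

# Assumed baseline compiler flags per manylinux architecture.
BASELINE_FLAGS = {
    "x86_64": ["-march=x86-64"],
    "aarch64": ["-march=armv8-a"],
    "ppc64le": ["-mcpu=power8"],
}

# Names reported by some operating systems for the same architectures.
ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def baseline_flags(machine=None):
    """Return generic flags for the given (or current) machine, or [] if unknown."""
    name = (machine or platform.machine()).lower()
    name = ALIASES.get(name, name)
    return list(BASELINE_FLAGS.get(name, []))


if __name__ == "__main__":
    print(platform.machine(), "->", baseline_flags())
```

Returning an empty list for unknown machines leaves the compiler defaults untouched, which seems safer than guessing a flag.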

daenuprobst commented 2 years ago

Yes, especially with M1 et al becoming more common. I didn't have time yet to look at the performance--what do you reckon the impact is?

N-Coder commented 2 years ago

Honestly, I have no clue there. 😅 I guess except for SSE3 we are not actively using any specific optimizations, but I also don't know where SSE3 is actually used. Still, the biggest difference is probably due to the fewer optimizations the compiler can make, although I also don't know how bad that is. For best production performance, one should probably build a single statically-linked and local-system-optimized binary with link-time optimizations turned on, but a system-independent prebuilt shared library should hopefully also do the thing.

N-Coder commented 2 years ago

Note: I'm now also building wheels with just the OGDF shared library here. Unfortunately, there is still some weird unrelated issue with our C++ bindings on Windows, so the tests keep failing. Still, the wheels produced there should be good, but I'm waiting for all tests to pass before moving the release from TestPyPI to the actual PyPI.

I'm using PyPA's new Hatch build system, which makes for a very nice and clean pyproject.toml file without the (at least for me) hard-to-understand setuptools code in setup.py (that file is no longer required for Hatch). The actual cmake build happens, nicely isolated, in a dedicated build hook, so this build also works outside of cibuildwheel (even when doing pip install . or python -m build). The -march=native problems are circumvented by simply commenting out the line that sets this flag here, so I could stick with the more performant default memory manager POOL_TS.

Still, I'm not sure whether you want to depend on this, as static linking is probably preferable if you are after the most performance.

daenuprobst commented 2 years ago

Thanks for the info, I have been a bit busy preparing papers for conferences, so I'll take a look once I can get back to maintaining TMAP.

Hatch looks really cool--I haven't heard about it. I'll give it a go for my next project. Thanks for identifying the line, I'll use this in the next version of TMAP. I'll stick to static linking for the time being (depending on whether I can find a Master's student for a project idea I have, this might change though--btw, you don't have a surplus of students interested in graph-viz by any chance? 😉 )

thegodone commented 2 years ago

Hi guys, any update on building the tmap development branch in a Docker Ubuntu image running on an M1 machine?

#38 [molmap-master-backend 25/27] RUN pip install git+https://github.com/reymond-group/tmap.git@development
#38 41.43   Running command git clone --filter=blob:none --quiet https://github.com/reymond-group/tmap.git /tmp/pip-req-build-dnky6pqa
#38 41.43   Running command git checkout -b development --track origin/development
#38 41.43   Switched to a new branch 'development'
#38 41.43   Branch 'development' set up to track remote branch 'development' from 'origin'.
#38 41.43   error: subprocess-exited-with-error
#38 41.43   
#38 41.43   × Building wheel for tmap-viz (pyproject.toml) did not run successfully.
#38 41.43   │ exit code: 1
#38 41.43   ╰─> [112 lines of output]
#38 41.43       running bdist_wheel
#38 41.43       running build
#38 41.43       running build_py
#38 41.43       creating build
#38 41.43       creating build/lib.linux-aarch64-cpython-38
#38 41.43       creating build/lib.linux-aarch64-cpython-38/tmap
#38 41.43       copying src/tmap/plotting.py -> build/lib.linux-aarch64-cpython-38/tmap
#38 41.43       copying src/tmap/embedding.py -> build/lib.linux-aarch64-cpython-38/tmap
#38 41.43       copying src/tmap/__init__.py -> build/lib.linux-aarch64-cpython-38/tmap
#38 41.43       creating build/lib.linux-aarch64-cpython-38/tmap/layout_generators
#38 41.43       copying src/tmap/layout_generators/annoy_layout_generator.py -> build/lib.linux-aarch64-cpython-38/tmap/layout_generators
#38 41.43       copying src/tmap/layout_generators/base_layout_generator.py -> build/lib.linux-aarch64-cpython-38/tmap/layout_generators
#38 41.43       copying src/tmap/layout_generators/__init__.py -> build/lib.linux-aarch64-cpython-38/tmap/layout_generators
#38 41.43       copying src/tmap/layout_generators/builtin_layout_generator.py -> build/lib.linux-aarch64-cpython-38/tmap/layout_generators
#38 41.43       creating build/lib.linux-aarch64-cpython-38/tmap/helpers
#38 41.43       copying src/tmap/helpers/set_defaults.py -> build/lib.linux-aarch64-cpython-38/tmap/helpers
#38 41.43       copying src/tmap/helpers/__init__.py -> build/lib.linux-aarch64-cpython-38/tmap/helpers
#38 41.43       creating build/lib.linux-aarch64-cpython-38/tmap/core
#38 41.43       copying src/tmap/core/tmap_embedding.py -> build/lib.linux-aarch64-cpython-38/tmap/core
#38 41.43       copying src/tmap/core/__init__.py -> build/lib.linux-aarch64-cpython-38/tmap/core
#38 41.43       copying src/tmap/core/line.py -> build/lib.linux-aarch64-cpython-38/tmap/core
#38 41.43       running egg_info
#38 41.43       writing src/tmap_viz.egg-info/PKG-INFO
#38 41.43       writing dependency_links to src/tmap_viz.egg-info/dependency_links.txt
#38 41.43       writing requirements to src/tmap_viz.egg-info/requires.txt
#38 41.43       writing top-level names to src/tmap_viz.egg-info/top_level.txt
#38 41.43       reading manifest file 'src/tmap_viz.egg-info/SOURCES.txt'
#38 41.43       writing manifest file 'src/tmap_viz.egg-info/SOURCES.txt'
#38 41.43       running build_ext
#38 41.43       Traceback (most recent call last):
#38 41.43         File "/opt/conda/envs/molmap-backend/bin/cmake", line 5, in <module>
#38 41.43           from cmake import cmake
#38 41.43       ModuleNotFoundError: No module named 'cmake'
#38 41.43       /tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/config/_apply_pyprojecttoml.py:103: _WouldIgnoreField: 'authors' defined outside of `pyproject.toml` would be ignored.
#38 41.43           !!
#38 41.43       
#38 41.43       
#38 41.43           ##########################################################################
#38 41.43           # configuration would be ignored/result in error due to `pyproject.toml` #
#38 41.43           ##########################################################################
#38 41.43       
#38 41.43           The following seems to be defined outside of `pyproject.toml`:
#38 41.43       
#38 41.43           `authors = 'Daniel Probst'`
#38 41.43       
#38 41.43           According to the spec (see the link below), however, setuptools CANNOT
#38 41.43           consider this value unless 'authors' is listed as `dynamic`.
#38 41.43       
#38 41.43           https://packaging.python.org/en/latest/specifications/declaring-project-metadata/
#38 41.43       
#38 41.43           For the time being, `setuptools` will still consider the given value (as a
#38 41.43           **transitional** measure), but please note that future releases of setuptools will
#38 41.43           follow strictly the standard.
#38 41.43       
#38 41.43           To prevent this warning, you can list 'authors' under `dynamic` or alternatively
#38 41.43           remove the `[project]` table from your file and rely entirely on other means of
#38 41.43           configuration.
#38 41.43       
#38 41.43       
#38 41.43       !!
#38 41.43       
#38 41.43         warnings.warn(msg, _WouldIgnoreField)
#38 41.43       Traceback (most recent call last):
#38 41.43         File "/opt/conda/envs/molmap-backend/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 363, in <module>
#38 41.43           main()
#38 41.43         File "/opt/conda/envs/molmap-backend/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 345, in main
#38 41.43           json_out['return_val'] = hook(**hook_input['kwargs'])
#38 41.43         File "/opt/conda/envs/molmap-backend/lib/python3.8/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 261, in build_wheel
#38 41.43           return _build_backend().build_wheel(wheel_directory, config_settings,
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 412, in build_wheel
#38 41.43           return self._build_with_temp_dir(['bdist_wheel'], '.whl',
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
#38 41.43           self.run_setup()
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 483, in run_setup
#38 41.43           super(_BuildMetaLegacyBackend,
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 335, in run_setup
#38 41.43           exec(code, locals())
#38 41.43         File "<string>", line 91, in <module>
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/__init__.py", line 87, in setup
#38 41.43           return distutils.core.setup(**attrs)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
#38 41.43           return run_commands(dist)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
#38 41.43           dist.run_commands()
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
#38 41.43           self.run_command(cmd)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
#38 41.43           super().run_command(command)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
#38 41.43           cmd_obj.run()
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 299, in run
#38 41.43           self.run_command('build')
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
#38 41.43           self.distribution.run_command(command)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
#38 41.43           super().run_command(command)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
#38 41.43           cmd_obj.run()
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 132, in run
#38 41.43           self.run_command(cmd_name)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
#38 41.43           self.distribution.run_command(command)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/dist.py", line 1217, in run_command
#38 41.43           super().run_command(command)
#38 41.43         File "/tmp/pip-build-env-vuui6xy_/overlay/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
#38 41.43           cmd_obj.run()
#38 41.43         File "<string>", line 26, in run
#38 41.43         File "/opt/conda/envs/molmap-backend/lib/python3.8/subprocess.py", line 415, in check_output
#38 41.43           return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
#38 41.43         File "/opt/conda/envs/molmap-backend/lib/python3.8/subprocess.py", line 516, in run
#38 41.43           raise CalledProcessError(retcode, process.args,
#38 41.43       subprocess.CalledProcessError: Command '['cmake', '--version']' returned non-zero exit status 1.
#38 41.43       [end of output]
#38 41.43   
#38 41.43   note: This error originates from a subprocess, and is likely not a problem with pip.
#38 41.43   ERROR: Failed building wheel for tmap-viz
#38 41.43 ERROR: Could not build wheels for tmap-viz, which is required to install pyproject.toml-based projects
#38 41.43 
#38 41.43 ERROR conda.cli.main_run:execute(49): `conda run /bin/bash -c pip install git+https://github.com/reymond-group/tmap.git@development` failed. (See above for error)
#38 41.43 Collecting git+https://github.com/reymond-group/tmap.git@development
#38 41.43   Cloning https://github.com/reymond-group/tmap.git (to revision development) to /tmp/pip-req-build-dnky6pqa
#38 41.43   Resolved https://github.com/reymond-group/tmap.git to commit 90befdf339660e2af323c48fb55d232265ee2f8b
#38 41.43   Installing build dependencies: started
#38 41.43   Installing build dependencies: finished with status 'done'
#38 41.43   Getting requirements to build wheel: started
#38 41.43   Getting requirements to build wheel: finished with status 'done'
#38 41.43   Preparing metadata (pyproject.toml): started
#38 41.43   Preparing metadata (pyproject.toml): finished with status 'done'
#38 41.43 Requirement already satisfied: matplotlib in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from tmap-viz==1.0.17) (3.2.2)
#38 41.43 Requirement already satisfied: annoy~=1.17.0 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from tmap-viz==1.0.17) (1.17.1)
#38 41.43 Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from matplotlib->tmap-viz==1.0.17) (1.4.2)
#38 41.43 Requirement already satisfied: numpy>=1.11 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from matplotlib->tmap-viz==1.0.17) (1.21.3)
#38 41.43 Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from matplotlib->tmap-viz==1.0.17) (3.0.9)
#38 41.43 Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from matplotlib->tmap-viz==1.0.17) (2.8.2)
#38 41.43 Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from matplotlib->tmap-viz==1.0.17) (0.11.0)
#38 41.43 Requirement already satisfied: six>=1.5 in /opt/conda/envs/molmap-backend/lib/python3.8/site-packages (from python-dateutil>=2.1->matplotlib->tmap-viz==1.0.17) (1.16.0)
#38 41.43 Building wheels for collected packages: tmap-viz
#38 41.43   Building wheel for tmap-viz (pyproject.toml): started
#38 41.43   Building wheel for tmap-viz (pyproject.toml): finished with status 'error'
#38 41.43 Failed to build tmap-viz
#38 41.43 
#38 ERROR: executor failed running [conda run -n molmap-backend /bin/bash -c pip install git+https://github.com/reymond-group/tmap.git@development]: exit code: 1
failed to solve: executor failed running [conda run -n molmap-backend /bin/bash -c pip install git+https://github.com/reymond-group/tmap.git@development]: exit code: 1

doublethefish commented 2 years ago

See #33 for a temporary workaround.
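
For narrowing down the failure in the build log above: the traceback ends with `cmake --version` returning a non-zero exit status, because the `cmake` launcher in the conda environment could not import its Python module. A minimal sketch for repeating that probe by hand follows; `probe_command` is a hypothetical helper. One plausible cause is pip's isolated build environment hiding the environment's site-packages, in which case this check can pass when run outside of pip and still fail during the build.

```python
import subprocess
import sys


def probe_command(argv):
    """Run argv and return (succeeded, combined stdout/stderr text)."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
    except OSError as exc:  # e.g. the executable is not on PATH
        return False, str(exc)
    return result.returncode == 0, result.stdout.strip()


if __name__ == "__main__":
    # Same probe the traceback shows failing, plus a check for the Python module.
    for argv in (["cmake", "--version"], [sys.executable, "-c", "import cmake"]):
        ok, output = probe_command(argv)
        print(" ".join(argv), "->", "ok" if ok else "FAILED")
        if output:
            print(output)
```

If the first probe fails while a system-level cmake (for example from the distribution's package manager) works, pointing the build at that binary may be worth trying.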