SINTEF / dlite

DLite - a lightweight data-centric framework for semantic interoperability
https://SINTEF.github.io/dlite
MIT License

No respect for virtual environment system paths #178

Closed CasperWA closed 2 years ago

CasperWA commented 2 years ago

It seems dlite does not respect the lib paths within a virtual environment, forcing loading from ~/.local on UNIX systems instead of retrieving from where the virtual environment has its lib, bin, share, etc. folders.

I've tried amending the python/setup.py:CMAKE_ARGS variable for Linux systems without luck.

Steps to reproduce:

The last step fails, complaining that the 'yaml' api cannot be found. It then references the ~/.local/... folders where it is trying, and failing, to find the yaml driver.

Expected behaviour: After installation the read CSV example should work.

pscff commented 2 years ago

Not sure what is wrong here, this works for me:

$ conda create --name=py37dlite_issue178 python=3.7
$ conda activate py37dlite_issue178
$ pip install git+https://github.com/SINTEF/dlite.git#subdirectory=python

Go to the directory where the readcsv.py example is and run python readcsv.py. It complains only about missing tables. The additional CMake variables should be unnecessary, I guess, since dlite looks for Python via find_package. Have you activated your environment?

CasperWA commented 2 years ago

I am using virtualenv in a Linux distro (Ubuntu running via Windows Subsystem for Linux (WSL) in this case), not conda on Windows. We cannot/should not use conda at SINTEF since it is proprietary software and they may at any time set up a paywall, etc.

I need a development environment set up. So I have cloned the repository locally and want to have it installable via pip so I can test new plugin developments and such. An installation straight from github.com is therefore not desirable.

A normal workflow for me looks like (using virtualenvwrapper):

$ mkproject --python pythonX.Y my_virtualenv
$ workon my_virtualenv
(my_virtualenv) $ git clone git@github.com:SINTEF/dlite.git
(my_virtualenv) $ cd dlite
(my_virtualenv) dlite$ pip install -e .
...

And then develop, test and run. DLite is not built for editable installations, which is fair enough. I'm not yet sure how plugins are loaded/tested other than having to pip install the package locally over and over to "install" the plugins under the system's/virtual environment's share/dlite folder, or whether one can "cheat" by running from the repository directory and using relative paths to load the plugins (I think I saw something that suggests this).

In any case - I can make it install from the local clone, but I cannot make it use the virtual environment's system paths to load the plugins - essentially.

Also - concerning the tables, I took that into account in my OP as well, installing it before running the example...

pscff commented 2 years ago

OK, I see. The install via pip should not depend on the type of virtual Python environment being used (conda, venv, virtualenvwrapper). Good to see that it works in your case.

Plugins can be loaded from other locations by setting environment variables, as described here:

https://github.com/SINTEF/dlite/blob/master/doc/environment-variables.md#path-handling-when-using-the-pre-packaged-wheel-linux-windows

(... need to fix the formatting in that one ...)

Or in Python, using dlite.python_mapping_plugin_path.append(...) and related methods (see the tests for examples).
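For illustration, a minimal sketch of both approaches; the plugin directory below is a hypothetical path, DLITE_STORAGES is the variable mentioned in this thread (see doc/environment-variables.md for the full list and exact semantics), and which variable or attribute applies depends on the plugin type:

```python
import os

# Hypothetical directory containing extra dlite plugins.
extra_plugins = os.path.expanduser("~/my-project/share/dlite/storage-plugins")

# Approach 1: point dlite at the directory via an environment variable,
# set *before* importing dlite.
os.environ["DLITE_STORAGES"] = extra_plugins

# Approach 2: append to a search path from Python after importing dlite:
#   import dlite
#   dlite.python_mapping_plugin_path.append(extra_plugins)

print(os.environ["DLITE_STORAGES"])
```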

If the package is installed via pip, a few plugins are bundled and are located in the lib/site-packages/dlite/share/dlite folders at the moment. If you want an editable install (where you also re-build the C-code from time to time) and test things in Python, I would

CasperWA commented 2 years ago

This is not a bad idea - indeed it (and the Dockerfile) gave me an idea on how to have a workable installation straight after doing pip install. I tested it locally and it worked! :)

The main point is to put the share and .so files in their proper place, relative to the current system's prefix (be it the root system or a virtual environment). Now the challenge is to come up with a good solution, since two things need to happen:

  1. Update LD_LIBRARY_PATH and set DLITE_ROOT.
  2. Handle .so files and share directory.

Finally, if it is in a virtual environment (using virtualenv or similar), one wants to revert the changes to LD_LIBRARY_PATH and unset DLITE_ROOT when deactivating the environment. Concerning DLITE_ROOT, @jesper-friis is convincing me that passing the -DCMAKE_INSTALL_PREFIX option is the better way, since it should (in theory) affect the same parts of the code as setting DLITE_ROOT, only at build time instead of at run time. The main question, however, is not whether to do one or the other, but where to point it: at lib/pythonX.Y/site-packages/dlite, or should we move the *.so files and the share folder to the location returned in Python by sys.prefix and point it there?
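For concreteness, the two candidate locations discussed above can be computed in Python; this is a sketch only, and the dlite package directory under site-packages is an assumed target:

```python
import sys
import sysconfig
from pathlib import Path

# Candidate 1: the environment prefix itself. Inside a virtualenv,
# sys.prefix is the environment's root; outside, it is e.g. /usr.
env_prefix = Path(sys.prefix)

# Candidate 2: the installed package directory under site-packages,
# i.e. <prefix>/lib/pythonX.Y/site-packages/dlite on Linux.
site_packages = Path(sysconfig.get_paths()["purelib"])
dlite_pkg_dir = site_packages / "dlite"

print(env_prefix)
print(dlite_pkg_dir)
```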

pscff commented 2 years ago

Say the .so file we want to place into the filesystem is for a new C-compiled storage plugin to be accessed from Python. We could follow the same route as already done for the pre-packaged plugins (such as dlite-plugins-json.so).

For the pre-packaged plugins, there is no need to adjust LD_LIBRARY_PATH or DLITE_ROOT. The .so file is placed by CMake (or setup.py using CMake) into the directory ./share/dlite/storage-plugins below the dlite Python package directory. This path is guaranteed to exist on both Windows and Linux and is independent of which virtual environment we are using. We should maybe use another naming convention for the plugin directory, since a path like .../site-packages/dlite/share/dlite/storage-plugins is a bit awkward.

dlite-plugins-json.so must now be discoverable by dlite so it can be loaded. dlite uses the env-variable DLITE_STORAGES to facilitate searching; this variable is, for example, set when testing the dlite json storage plugin via CTest in the build directory. When pip-installing, the paths where dlite searches for its plugins are set in the __init__.py of the package, and by default this includes the ./share/dlite/storage-plugins directory relative to the package install directory. No need to set any DLITE_STORAGES path: just pip install and it should find the plugin.
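In spirit, the default search path set up in __init__.py can be sketched as follows; the install location below is hypothetical, and the layout is the one described above (not the real package code):

```python
from pathlib import Path

# Hypothetical package install location (site-packages/dlite).
package_dir = Path("/usr/lib/python3.10/site-packages/dlite")

# Default storage-plugin directory, derived relative to the package
# itself, so it is valid regardless of which environment is active.
default_plugin_dir = package_dir / "share" / "dlite" / "storage-plugins"

print(default_plugin_dir)
```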

dlite can now discover and load the .so plugin, but the next complication is that the plugin also depends on dlite.so and dlite-utils.so. These files are placed by CMake directly into dlite's package root directory because they need to be picked up once `import dlite` loads _dlite.pyd, which expects to find dlite.dll and dlite-utils.dll (using SWIG extension names on Windows here). At the moment this is solved by baking the shared-library search path directly into dlite-plugins-json.so using RPATH, instead of setting LD_LIBRARY_PATH, because the latter doesn't work cross-platform.

This has become a long answer, but to summarize for your use-case: I would implement the plugin using the same machinery as for the pre-packaged plugins. I would take the ./storages/json plugin as a template and build the new extension from source. No need to use pip install or modify setup.py at this point. Once the plugin works, go to ./python and use setup.py as it is to build a wheel with your new plugin included, then pip install it in a virtual environment or whatever you wish.

The other use-case is that we have a local dlite install (whether via pip, from source, from a distro package, or otherwise) and then want to compile a new C-extension. I guess this would best be done in a separate CMake project which uses the FindDlite.cmake file provided in the dlite repo to find all the paths needed (headers, libs) for compiling, linking and installing.

jesper-friis commented 2 years ago

@CasperWA, I tried to reproduce your issue with plugins using virtualenv on linux. For me

$ mkvirtualenv dlite
$ pip install --upgrade pip
$ cd dlite/python
$ pip install .
$ pip install tables
$ cd ../examples/read-csv
$ python readcsv.py

works fine. I think the RPATH approach mentioned by @pscff does the trick for me. Have I missed your point, or does it work differently on my native Fedora Linux than on your Ubuntu on WSL?