arq5x / gemini

a lightweight db framework for exploring genetic variation.
http://gemini.readthedocs.org
MIT License

GEMINI not picking up PYTHONPATH #448

Open georg-rath opened 9 years ago

georg-rath commented 9 years ago

When installing GEMINI into a custom location using setup.py, I ran into the problem that it could not find itself, because it apparently ignores PYTHONPATH (see commits b504d5a1dcc1ffedc2844f251b016c52c54696ad and 5ad343ef81f3c81f723e2a346b72575f472e85d7). Is there a reason for doing that? It would be great if this behaviour were configurable when installing with setup.py.

chapmanb commented 9 years ago

Georg; Thanks for the question and sorry for any confusion. By default, GEMINI intentionally ignores PYTHONPATH. The goal is to use only the libraries installed with the Python you're running GEMINI with. Without this in place, you can accidentally pull in libraries from other Python installations or local site-libraries, which are often incompatible. This has been the source of a number of confusing bugs.

This helps us create an isolated GEMINI install that will work reliably across systems with a wide variety of system pythons and other local tweaks.

Even for a setup.py install in a standard Python, I'm a bit confused as to why you need to rely on the PYTHONPATH rather than installing the requirements within your system Python. Would you be able to explain more about your setup?

georg-rath commented 9 years ago

Sure, we are using Environment Modules (a nice implementation is Lmod) to manage software on our systems. It lets us easily load and unload software in different versions by modifying the environment. Unfortunately it is not well known outside of HPC/cluster computing. For Python packages, it works by modifying the PYTHONPATH environment variable. We try to keep the base Python installation light and load additional Python packages as separate modules.

chapmanb commented 9 years ago

Georg; Thanks for the additional details. I'm definitely familiar with modules, and ignoring PYTHONPATH is a workaround for issues caused by those setups. It's pretty easy to end up with an incompatible set of modules depending on what people load, which causes the errors mentioned above.

The automated install (http://gemini.readthedocs.org/en/latest/content/installation.html#automated-installation) is well set up to deal with modules. It gives you an isolated Python installation plus all the dependencies in one place, which you can inject with just a PATH change and no PYTHONPATH hacking. Would that work for what you need?

If not, you can always remove the -Es flag from your GEMINI install as a custom workaround but I'm not sure this would be a good solution for most installs due to the potential for getting an inconsistent environment.

Sorry to not have a great solution but hope this helps some.
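
For readers unfamiliar with the flags discussed above: in CPython, `-E` makes the interpreter ignore `PYTHON*` environment variables such as PYTHONPATH, and `-s` keeps the per-user site-packages directory off `sys.path`. The following is a minimal sketch of that effect, not GEMINI's own code. The helper names `child_sys_path` and `on_path` are made up for illustration, and the temporary directory merely stands in for a path that an environment module would add.

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(flags, pythonpath):
    """Run a child interpreter with `flags` and return its sys.path as a list."""
    # PYTHONPATH is set only for the child process, as a module system would.
    env = dict(os.environ, PYTHONPATH=pythonpath)
    out = subprocess.check_output(
        [sys.executable, *flags, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        text=True,
    )
    return out.splitlines()


def on_path(directory, paths):
    """Return True if `directory` is one of `paths` (compared as resolved paths)."""
    target = os.path.realpath(directory)
    return any(os.path.realpath(p) == target for p in paths if p)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as extra:
        # Without flags, the PYTHONPATH entry is picked up by the child.
        print("default:   ", on_path(extra, child_sys_path([], extra)))
        # With -E -s, the same entry is ignored.
        print("with -E -s:", on_path(extra, child_sys_path(["-E", "-s"], extra)))
```

Run as a script, this reports whether the extra directory reached the child's `sys.path` in each case, which is the difference a module-based setup runs into.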

georg-rath commented 9 years ago

The reason for setting PYTHONPATH is so that we can use the dependencies from our module system. For now we solved it by removing the -Es flag. It would be great for us if that were an option in the setup.py installation (e.g. something like --no-ignore-pythonpath). If you think that is a viable solution, I'd gladly open a PR for it.
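
A rough sketch of what such a switch could boil down to, assuming the `-Es` flag sits on the interpreter (`#!`) line of the installed script. The thread does not spell out where the flag lives, so this is an assumption. `interpreter_line`, `rewrite_interpreter_line` and the `ignore_pythonpath` parameter are hypothetical names, not part of GEMINI or setuptools.

```python
def interpreter_line(python_path, ignore_pythonpath=True):
    """Build the '#!' line for an installed script (hypothetical helper).

    With ignore_pythonpath=True the line carries -Es, so PYTHONPATH and the
    user site directory are ignored; with False the flags are left out.
    """
    flags = " -Es" if ignore_pythonpath else ""
    return "#!" + python_path + flags


def rewrite_interpreter_line(script_text, python_path, ignore_pythonpath=True):
    """Replace the first line of `script_text` if it is an interpreter line."""
    first, sep, rest = script_text.partition("\n")
    if not first.startswith("#!"):
        # Not a script with an interpreter line; leave it untouched.
        return script_text
    return interpreter_line(python_path, ignore_pythonpath) + sep + rest
```

Keeping `ignore_pythonpath=True` as the default would preserve the isolated behaviour described earlier in the thread, while sites that build their environment from modules could opt out explicitly.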

dtrudg commented 8 years ago

:+1: for an option to not ignore PYTHONPATH. We just hit the same issue installing on our HPC cluster, where we build up the environment from modules. If the requirements are well specified, we can ensure our modules provide them without polluting the environment.