CoExpNetViz / coexpnetviz-python

Internal coexpnetviz CLI used in Cytoscape app
GNU Lesser General Public License v3.0

It should just work everywhere, easily #3

Open timdiels opened 3 years ago

timdiels commented 3 years ago

In GitLab by @timdiels on Apr 3, 2019, 16:24

It's currently easy enough, but not entirely straightforward to install CoExpNetViz. There are still many things that could go wrong:

Conda and/or Docker may solve many of these issues. Docker is a bit of a fuss for the user, but installation documentation exists, so they should be able to make it work; it could be offered as a last resort. Also, always set the matplotlib backend to Agg.
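A minimal sketch of forcing the Agg backend, assuming the setting is applied at program start-up before anything else imports matplotlib. The function name force_headless_matplotlib is hypothetical; MPLBACKEND and matplotlib.use are matplotlib's own mechanisms.

```python
import os


def force_headless_matplotlib():
    """Select matplotlib's non-interactive Agg backend.

    MPLBACKEND is read when matplotlib is first imported, so this should
    run before any other `import matplotlib` in the program.
    """
    os.environ["MPLBACKEND"] = "Agg"
    try:
        import matplotlib
    except ImportError:
        # matplotlib is not installed here; the variable is still set for
        # any child process that does have it.
        return None
    # Also switch explicitly, in case matplotlib was imported earlier.
    matplotlib.use("Agg")
    return matplotlib.get_backend()


if __name__ == "__main__":
    print(force_headless_matplotlib())
```

Setting the environment variable as well as calling matplotlib.use covers both this process and any subprocess it starts.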

timdiels commented 3 years ago

In GitLab by @timdiels on Apr 3, 2019, 16:37

Another issue we came across using venv is that when the venv is installed at a deeply nested location, the shebang line of bin/coexpnetviz exceeds the max length for a shebang. Would that be fixed when using conda? Or should we just warn not to install it with too deep nesting? The error you would get in zsh is ...bad interpreter: /truncated/path/to/venv/bin/python.
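If we go the "warn" route, the check could look roughly like this. It is only a sketch: the 127-byte threshold is an assumption based on older Linux kernels (newer kernels and other platforms allow longer lines), and shebang_too_long is a hypothetical helper name.

```python
# Assumption: 127 bytes is the limit on older Linux kernels; treat it as a
# conservative threshold rather than an exact value for every platform.
MAX_SHEBANG_BYTES = 127


def shebang_too_long(script_path, limit=MAX_SHEBANG_BYTES):
    """Return True if the script's shebang line is longer than `limit` bytes."""
    with open(script_path, "rb") as f:
        first_line = f.readline().rstrip(b"\r\n")
    if not first_line.startswith(b"#!"):
        return False  # no shebang, so nothing can be truncated
    return len(first_line) > limit
```

It could be run against bin/coexpnetviz right after installation to print a warning instead of letting the user hit the bad interpreter error.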

timdiels commented 3 years ago

In GitLab by @timdiels on Apr 4, 2019, 12:47

We shouldn't rely on shell commands or on tools such as sed. E.g. sed behaves differently on Mac.
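For example, an in-place substitution that would otherwise be a sed -i call can be done in Python, which behaves the same on every platform. This is a sketch; replace_in_file is a hypothetical helper, not something in the code base.

```python
import re
from pathlib import Path


def replace_in_file(path, pattern, replacement, encoding="utf-8"):
    """Replace regex matches in the file at `path`; return the match count."""
    path = Path(path)
    text = path.read_text(encoding=encoding)
    new_text, count = re.subn(pattern, replacement, text)
    if count:
        # Only rewrite the file when something actually changed.
        path.write_text(new_text, encoding=encoding)
    return count
```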

timdiels commented 3 years ago

In GitLab by @timdiels on Apr 4, 2019, 14:37

conda can take care of installing the Python version we prefer. Miniconda offers nice installers for all the major platforms (Linux, Windows, Mac OS X) and provides a nice shell to work in. If they already have another *conda (e.g. Anaconda), they can simply use that too of course. The recommended way to install in our single-user use case is to install as a user with the provided script, not with the package manager (which would install it system-wide).

An environment.yml file is similar to a pip requirements.txt file, so with this CENV should install and work as intended in 99.99% of cases (granted we also make the code itself cross-platform). Docker is quite a heavy install just to install a tool, but it could be provided as a last resort; I'm not making a Docker image until it turns out that environment.yml is insufficient (conda offers a Miniconda Docker image). Conda spec files aren't an option as they aren't cross-platform. Try to specify the desired Python version in the environment.yml.

Conda has a Travis example, so we could learn from that to write a GitLab runner script to build conda packages with. Apparently it is possible to build a conda package from a setup.py, more or less; see also the conda cookiecutter for Python projects, which may hint at how to do so. It may also be a nicer base for our projects than simple-project.

timdiels commented 3 years ago

In GitLab by @timdiels on Apr 4, 2019, 14:44

When switching to conda, do we keep supporting pip too? I'd rather avoid having to deal with pip installation problems. If we can hide the previous pip versions without them becoming outright unavailable (pip install pkg==1.2.3 should still work despite 1.2.3 being hidden), that would be a good idea too, just to clean up. For the sake of Google results, though, we may want to leave a message pointing to conda (but without pip install pkg installing a package containing just that message)?

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 1, 2020, 22:59

In the end you'll only ever want to use this in cytoscape so really the only pkg we need to provide is the cytoscape plugin to the cytoscape store (or whatever it's called). CoExpNetViz can run client side just fine, it's not heavy on compute and the data could be included in the pkg (or downloaded on the side if need be). We can probably drop support for the website and simply make it redirect to the cytoscape plugin page. Users have not had access to a CLI or library for years so just a plugin is probably good enough.

Docker

In addition to whatever else we provide, we should provide a docker with cytoscape, with the plugin and any of its dependencies installed. The main point of the docker is that it will always work, on any platform, even 10 years later in case they want to try an old version for reproducibility.

If we offer a library, then they'll be able to find that library in here too.

Run python in Java approach

Cytoscape is java, but maybe we could include a cross-platform version of python and run our code with it in cytoscape... It would in fact have to include the equivalent of a venv as we need dependencies too, and if any deps are platform dependent then we should include them for each platform too. In fact, if a dep is too platform dependent (e.g. a glibc dependency), then we can't simply include binaries for every possible system out there; in that case we would instead need to ask them to install miniconda/anaconda and conda install as we planned to offer originally.

Running Python in Java would mean using Jython. Jython is just Java, so it can be included in a Cytoscape plugin easily. Alas, Jython is the equivalent of Python 2.7.2; they started work on Python 3 at some point but are nowhere near finishing it, if they ever will. Python 2 will soon be phased out, but we can make do with old Python 2 versions of our dependencies.

Jython finds python code on sys.path as usual, it also searches the java classpath. So all we need is to include the equivalent of a virtualenv for each potential platform and pick the right one with sys.path at runtime (I don't think you can provide a jar per platform as cytoscape app). There are tools for creating standalone python applications which include all dependencies and usually also python itself, the tricky part would be generating them for each platform and grabbing just the libraries as we include python ourselves, as jython.

Include an executable in the jar

Instead of jython we could include an executable (coexpnetviz cli) containing python, our library and its dependencies in the jar. Then hopefully we would be able to call the executable in the jar. This way we can use any python version. But:

Can we call that executable inside the jar? E.g. it might depend on other files which probably wouldn't work without first unpacking the jar somewhere, which we could do to a tmp directory.

Installers

Regardless of whether we use Jython or include an executable, we need something to package coexpnetviz with for inclusion in the jar. PyInstaller is the popular cross-platform one. Old favourites were py2exe (Windows), freeze (Linux), py2app (Mac); but people really recommend PyInstaller instead of e.g. py2exe. cx_freeze was also mentioned by someone, but its last commit was 3 years ago on SourceForge while PyInstaller's last commit was today; it also suffers from the same limitation as PyInstaller, namely that you need to run the build on each platform you want to create an executable for.

PyInstaller supports Python 3.5 - 3.7. You need to run it on Windows, Mac and Linux to create an executable for each; you end up with 3 executables, all of which we can include, and finally our Java code will just have to pick the right executable to call. It supports including dependencies, e.g. they mention matplotlib, which depends on numpy.

PyInstaller could be used for either approach. If you run it with Jython, you could tell it to create a directory with all the files it needs to run and then remove anything Jython-related because the jar already has that. But I'd probably go with Python 3.7 or so and bundle it into an executable per platform instead of taking the Jython approach.

PyInstaller does not generate an installer; the executable will just run coexpnetviz-cli, which is good. By default it will open a console window, which we don't want. Instead we want to output multiple files and log to a directory (provided in the GUI); we don't need stdout/stderr.
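Logging to a directory rather than to stdout/stderr could be set up roughly as follows. This is a sketch under assumptions: configure_file_logging, the logger name and the coexpnetviz.log file name are all hypothetical, and the output directory is whatever the GUI passes in.

```python
import logging
from pathlib import Path


def configure_file_logging(output_dir, name="coexpnetviz"):
    """Send log records to <output_dir>/<name>.log instead of the console."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    log_file = output_dir / f"{name}.log"

    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    # Do not pass records on to the root logger, which may write to stderr.
    logger.propagate = False
    return logger, log_file
```

With propagate turned off, nothing reaches the console even if some dependency configures the root logger.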

Conda

If the above does not work, then our plugin should instruct users to install anaconda/miniconda and conda install -n coexpnetviz coexpnetviz or whatever it was called again. We can then conda run -n coexpnetviz coexpnetviz $args or so.

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 1, 2020, 23:03

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 1, 2020, 23:07

If the pyinstaller executable is too slow to start, you could let it run a simple http server to which you can make api calls on localhost. Then you can leave it running and occasionally ask it to compute something. Careful though, the port may already be in use, maybe by another cytoscape instance with a coexpnetviz plugin of the same or a different version; so you'd have to find an available port.
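A minimal sketch of the available-port part, using only the standard library. Binding to port 0 lets the OS pick a free port, which avoids clashing with another instance. PingHandler and start_local_server are hypothetical names, and the handler only echoes the request path where a real one would run a computation.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler; a real one would run a computation instead."""

    def do_GET(self):
        body = json.dumps({"status": "ok", "path": self.path}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep stderr quiet


def start_local_server():
    """Start a server on localhost on a port chosen by the OS."""
    # Port 0 asks the OS for any currently free port, so two instances
    # never compete for a hard-coded one.
    server = ThreadingHTTPServer(("127.0.0.1", 0), PingHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, port
```

The chosen port would still have to be reported back to the plugin somehow; how to do that is left open here.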

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 1, 2020, 23:08

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 17, 2020, 20:01

Edit: use conda, but not conda envs

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 17, 2020, 23:14

Setting up and running pyinstaller:

conda create -n coexpnetviz-pyinstaller python=3.8
conda activate coexpnetviz-pyinstaller
# Install pytil, varbio and coexpnetviz, but not editably! Better not to include dev deps.
pip install pytil varbio coexpnetviz
pip install pyinstaller
cd coexpnetviz  # root of the coexpnetviz repo; adjust the path as needed
./pyinstaller.sh

Pyinstaller:

Edit: I abandoned the pyinstaller approach, we'd have to distribute way too many versions. Conda is much easier to distribute, more likely to work on different platforms and likely not too difficult for users as conda is popular in research anyway.

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 18, 2020, 05:11

GitHub Actions offers 64-bit Windows, OS X and Linux VMs to build on. This would mean 32-bit users would be out of luck with the pyinstaller approach. I also still worry that e.g. a glibc incompatibility could cause the executable to fail to run. Each exe comes with just 1 version of numpy for 1 specific platform after all. The numpy PyPI page lists downloads for 32/64-bit Windows/Linux, 64-bit OS X and even multiple versions of Linux; I'm guessing there are multiple depending on how ancient the Linux distro is.

An executable built on Debian 10 (2019) still works on Fedora 25 (2016), tested with the example input of 2 species. So admittedly it works pretty well, just not on 32-bit or ancient Linux (we could point those users to the Docker image instead). The build is also a bit more complex because it needs to be repeated on each target platform.

With conda we can download the python version we want and numpy is also available for all the platforms. Given that conda install is well documented we may get away with asking users to install it; after all they didn't mind installing cytoscape.

With the conda approach cytoscape could manage a coexpnetviz env for the user. Check coexpnetviz --version and if it's not compatible or it does not exist, downgrade/upgrade/install coexpnetviz from an env file.
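The decision logic for that check could be sketched as below. Several things are assumptions: that coexpnetviz --version prints a dotted numeric version as its last word, that only an exact match counts as compatible, and the helper names parse_version and decide_action. The plugin itself is Java, so this only illustrates the logic.

```python
def parse_version(text):
    """Turn '1.2.3' (or e.g. 'coexpnetviz 1.2.3') into the tuple (1, 2, 3)."""
    version = text.strip().split()[-1]
    return tuple(int(part) for part in version.split("."))


def decide_action(version_output, required):
    """Return 'install', 'upgrade', 'downgrade' or 'ok'.

    `version_output` is the stdout of `coexpnetviz --version`, or None if
    the command could not be run at all.
    Assumption: only an exact version match is considered compatible.
    """
    if version_output is None or not version_output.strip():
        return "install"
    installed = parse_version(version_output)
    wanted = parse_version(required)
    if installed < wanted:
        return "upgrade"
    if installed > wanted:
        return "downgrade"
    return "ok"
```

Versions with non-numeric parts (such as a dev suffix) would need a more forgiving parser than this one.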

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 24, 2020, 18:06

mentioned in commit f47648349c4697bcabe1208c7135a369439e9553

timdiels commented 3 years ago

In GitLab by @timdiels on Oct 24, 2020, 18:06

mentioned in commit 5ffe13d4f52064f69e2865370ac2beae3e8635ea