Support for unmanaged, pre-existing virtualenvs?

gregsymons commented 6 years ago

On our agents, we have several Python-based tools that we install in virtualenvs while building the agent image. It would be nice to be able to activate one of these virtualenvs using the withPythonEnv step.

I haven't fully grokked how this plugin works. It seems like this should be possible, but it doesn't currently work out of the box.

cstarner commented 6 years ago

It should be possible to accomplish this. As it is, the plugin already attempts to determine the significance of the string argument passed to it (right now, it attempts to match it to a ShiningPanda tool installation, and otherwise assumes it is a literal path to a particular executable).

The only issue would be how we signify whether the string argument is a virtualenv or not. In order for this to work, we would need to pass it the base folder of the virtualenv; we could just look for a trailing slash (OS-appropriate), and then verify that the directory is a virtualenv?
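For illustration, the trailing-slash check followed by a virtualenv check could be sketched as below. This is only a sketch in Python, not the plugin's code: the function names `is_virtualenv` and `classify_argument` are hypothetical, and the check relies on the files that `python -m venv` and virtualenv are known to write (`pyvenv.cfg`, or the activation scripts for older virtualenv releases).

```python
import os


def is_virtualenv(path):
    """Return True if `path` looks like the base folder of a virtualenv."""
    # Environments created by `python -m venv` have a pyvenv.cfg at their root.
    if os.path.isfile(os.path.join(path, "pyvenv.cfg")):
        return True
    # Older virtualenv releases write no pyvenv.cfg, so fall back to the
    # activation scripts (bin/ on POSIX, Scripts/ on Windows).
    candidates = [
        os.path.join(path, "bin", "activate"),
        os.path.join(path, "Scripts", "activate.bat"),
    ]
    return any(os.path.isfile(candidate) for candidate in candidates)


def classify_argument(arg):
    """Hypothetical dispatch: a trailing separator marks a virtualenv folder."""
    separators = tuple(sep for sep in (os.sep, os.altsep) if sep)
    if arg.endswith(separators) and is_virtualenv(arg):
        return "virtualenv"
    # Anything else would go through the existing handling
    # (ShiningPanda tool name or literal path).
    return "other"
```

With this convention, `/var/lib/jenkins/venv-autoupdate/` would be treated as a virtualenv only if the directory actually contains one, while the same string without the trailing slash would keep its current meaning.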

larsskj commented 6 years ago

We need this as well. Much to my surprise, whilst the ShiningPanda plugin verifies that the virtualenv directory exists, withPythonEnv actually creates a new virtualenv in the job's workspace. We, too, have preconfigured environments, and we need environments built with --system-site-packages, as we use special Python bindings for the Ceph disk system, for VMware, etc.

cstarner commented 6 years ago

withPythonEnv does create a virtualenv in the job's workspace if one is not found. For this functionality, we would override that behavior.

I can also add an argument to withPythonEnv that would allow for --system-site-packages. It would be false by default, but could be set to true.
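As a rough illustration of what such a flag controls, here is a sketch using Python's standard `venv` module rather than the plugin itself. The helper names `create_env` and `sees_system_site_packages` are hypothetical; `venv.create` and the `include-system-site-packages` key in `pyvenv.cfg` are part of the standard library's behavior.

```python
import os
import venv


def create_env(env_dir, system_site_packages=False):
    """Create a virtualenv, optionally exposing the system site-packages."""
    # with_pip=False keeps the sketch fast and avoids depending on ensurepip.
    venv.create(env_dir, system_site_packages=system_site_packages, with_pip=False)


def sees_system_site_packages(env_dir):
    """Read pyvenv.cfg to see whether the environment includes system packages."""
    with open(os.path.join(env_dir, "pyvenv.cfg")) as cfg:
        for line in cfg:
            key, _, value = line.partition("=")
            if key.strip() == "include-system-site-packages":
                return value.strip().lower() == "true"
    return False
```

Defaulting the flag to false mirrors the isolation a fresh virtualenv normally gives; setting it to true is what would make system-wide bindings importable inside the environment.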

larsskj commented 6 years ago

@cstarner Whilst that would be a great enhancement, it would be even better if the plugin would reuse a virtualenv created outside the workspace. In this case, it should simply activate the virtualenv, not create a new one. I use this functionality a lot in order to maintain and share a virtualenv across a lot of similar jobs. As an example, I have set up an autoupdate system that uses Jenkins to maintain hundreds of Debian servers. It would take a lot of time to rebuild the virtualenv every time a job runs, so I use a shared environment and maintain it outside the Jenkinsfile. The location of such an environment could be /var/lib/jenkins/venv-autoupdate.
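To make "simply activate" concrete: activating an existing virtualenv essentially means adjusting a few environment variables, which is what the `activate` script does. A minimal Python sketch, assuming a hypothetical helper named `activated_environ` (this is not how the plugin is implemented):

```python
import os


def activated_environ(venv_dir, base_env=None):
    """Return a copy of the environment with `venv_dir` activated."""
    env = dict(os.environ if base_env is None else base_env)
    # Executables live in bin/ on POSIX and Scripts/ on Windows.
    bin_dir = os.path.join(venv_dir, "Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = venv_dir
    # Put the environment's executables ahead of everything else on PATH.
    existing_path = env.get("PATH")
    env["PATH"] = bin_dir + os.pathsep + existing_path if existing_path else bin_dir
    # The activate script also unsets PYTHONHOME so it cannot override the env.
    env.pop("PYTHONHOME", None)
    return env
```

Nothing is created or modified on disk here, so a shared environment such as /var/lib/jenkins/venv-autoupdate could be used by many jobs without being rebuilt.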

cstarner commented 6 years ago

@larsskj That is absolutely the plan for this feature. What I am thinking is that we will be able to specify either arbitrary paths or paths relative to a virtualenv storage folder (like /var/lib/jenkins/venv-autoupdate in your proposal). For the storage folder, I can either hard-code it or make it configurable through a global plugin setting. I'm leaning towards making it configurable, since that could help with issues that arise from virtualenv paths that contain too many characters.
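A sketch of how the two kinds of path could be told apart, written in Python for brevity. The function name `resolve_venv_path` and the `venv_root` parameter (standing in for the global plugin setting) are hypothetical and only illustrate the idea:

```python
import os


def resolve_venv_path(arg, venv_root):
    """Resolve `arg` to a virtualenv folder.

    Absolute paths are used as given; anything else is looked up
    relative to `venv_root`, the configured storage folder.
    """
    if os.path.isabs(arg):
        return os.path.normpath(arg)
    return os.path.normpath(os.path.join(venv_root, arg))
```

A configurable root placed close to the filesystem root would keep the resulting paths short, which is the path-length concern mentioned above.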

And to be clear, the plugin as it stands does reuse virtualenvs, even between builds. It only generates a new virtualenv if the folder it was looking for doesn't exist. That is why they are kept in the workspace, which, by default, persists between builds and is intended to house "any files generated by the build itself", among other things (reference here: https://jenkins-le-guide-complet.github.io/html/sec-hudson-home-directory-contents.html). In practice, this workspace is often cleared in Jenkinsfiles (with the cleanWs() command), but standard practice is to use the workspace to house these kinds of files.

larsskj commented 6 years ago

Do you seriously mean that the plugin always reuses the virtualenv? What if I move the job execution from one node to another? What if I have a group of nodes that can execute the job? If these nodes are configured differently, then it would be dangerous and very error-prone to reuse the same virtualenv. And, even more complicated: all our build nodes run as Docker containers that are created and destroyed again after each build. They always start with a naked workspace. This means that we would constantly have to rebuild the virtualenv whenever we want to build something. This is sometimes desirable, but in other situations a waste of time.

cstarner commented 6 years ago

When the plugin encounters a withPythonEnv block, it generates a virtualenv within the workspace, with a name derived from the string argument to withPythonEnv, unless it sees that such a folder has already been generated, in which case it does not recreate the virtualenv and simply reactivates it. A different string argument will result in creating or activating a different virtualenv.
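The create-or-reuse behavior described here can be sketched as follows. This is an illustration in Python, not the plugin's code: the naming scheme in `env_dir_name` is an assumption (the thread only says the name is derived from the string argument), and `create` is a caller-supplied stand-in for whatever actually builds the virtualenv.

```python
import os
import re


def env_dir_name(arg):
    """Hypothetical naming scheme: derive a folder name from the argument."""
    return ".pyenv-" + re.sub(r"[^A-Za-z0-9_.-]+", "-", arg).strip("-")


def ensure_env(workspace, arg, create):
    """Return (env_dir, created); build the env only if its folder is missing."""
    env_dir = os.path.join(workspace, env_dir_name(arg))
    if os.path.isdir(env_dir):
        # The folder already exists, so reuse it as-is.
        return env_dir, False
    create(env_dir)
    return env_dir, True
```

Because the folder name depends only on the argument, two builds in the same workspace with the same argument share one virtualenv, while a wiped workspace (for example after cleanWs(), or on a fresh Docker agent) triggers a rebuild.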

If the job executes on different nodes, then the workspace folder for each node is different, as I understand it. Following the same process as above, we would generate a new virtualenv based on the string argument. Differences in node configuration would only come up if the Python installations used to generate the virtualenv are different (i.e. if withPythonEnv('/usr/bin/python3') pointed to two different versions of python3 on two different nodes).

Keeping it in the workspace was best practice based on the research that I performed. However, that research was limited to simpler use cases (as well as my personal experience, which didn't require managed virtualenvs, like this issue proposes). This obviously doesn't work for your use case, which is why I'm going to implement these changes.

kkarolk commented 5 years ago

@cstarner I'm also interested in this feature and I'm really happy that you're about to implement it. Do you know (roughly) when it's going to be available?

cstarner commented 5 years ago

This will be provided within the next release, which should be coming in the next day or so.

jenkinsci / pyenv-pipeline-plugin

Support for unmanaged, pre-existing virtualenvs? #16