allenai / deep_qa

A deep NLP library, based on Keras / tf, focused on question answering (but useful for other NLP too)

Apache License 2.0

Use virtualenv instead of conda #381

Open schmmd opened 7 years ago

schmmd commented 7 years ago

We use conda to create a consistent environment for Python. Is there a reason we use conda over virtualenv?

I was reading the AI2 Python Guide and it recommends using virtualenv. Are there any concerns with using virtualenv rather than conda (other than the fact that we presently use conda and it works)?

matt-gardner commented 7 years ago

Where is this actually used? Just in the build scripts? I don't actually use conda's virtual environments or virtualenv, so it doesn't much matter to me. I don't think this is a decision that should be enforced anywhere in the repo, and the build scripts should just do something that works.

Also note that the Python guide was written by people who mostly write Scala code, or by S2 production folks, not by Aristo or S2 researchers who had already been writing lots of Python code. Not saying that the guide is bad, just that I don't particularly want to be held to top-down decisions that were made after I already had a large Python codebase.

schmmd commented 7 years ago

@matt-gardner the Python guide was largely written by Joanna, who has been programming in Python for many years and considers herself a Python expert.

I found conda in the Dockerfile. I don't want any top-down decisions on this repository, but I do want consolidation where it makes sense (either there's no preference or "the standard" turns out to be a better option).

matt-gardner commented 7 years ago

I don't have a strong opinion on conda vs. virtualenv, except that changing it requires work, and I want to avoid work that isn't high-priority. If it's high priority to you, feel free to change it. It just doesn't seem like a big deal to me. The thing I was particularly squeamish about in the Python guide was yapf, which makes different formatting decisions than our current pylint settings, would require a ton of work to reformat the whole codebase to conform to it, and was added by someone other than Joanna.

schmmd commented 7 years ago

Thanks for clearly sharing your perspective. Learning things like the fact that you don't use conda is useful to me. In going through the code base, I assumed that it was standard technology that you all used to get a consistent Python environment.

My main goal is modifying the README so people can get started with deep_qa more easily. I had a rough time, and it was compounded by not knowing Python or Deep Learning very well.

I'll leave this issue open for now, but re-assign me. I think it may be possible that we use neither conda nor virtualenv, since we are already using Docker.

matt-gardner commented 7 years ago

I wholeheartedly agree with making the README easier to use. Thanks for doing this.

cristipp commented 7 years ago

FYI, we've used vanilla virtualenv for a while in Euclid. It doesn't actually build self-contained Python envs; instead, base packages are inherited by the virtualenvs. This leads to arbitrary breakage when installing base packages.
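For anyone who wants to check whether a given environment inherits base packages, here is a rough sketch. It assumes the environment has a `pyvenv.cfg` file with an `include-system-site-packages` key, which is what the stdlib `venv` module writes; older virtualenv layouts may not have that file, in which case the script just reports that it cannot tell. The function names are made up for illustration.

```python
import sys
from pathlib import Path


def in_virtual_env() -> bool:
    """Return True when the running interpreter lives inside a virtual env."""
    # Inside a venv, sys.prefix points at the env and sys.base_prefix at the
    # base installation; outside a venv the two are equal.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def inherits_system_packages(cfg_text: str) -> bool:
    """Parse pyvenv.cfg text and report whether base packages are visible."""
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "include-system-site-packages":
            return value.strip().lower() == "true"
    # Assumption: a missing key is treated as "not inherited".
    return False


def describe_environment() -> str:
    """Summarize the isolation of the environment this script runs in."""
    if not in_virtual_env():
        return "not in a virtual environment"
    cfg = Path(sys.prefix) / "pyvenv.cfg"
    if not cfg.is_file():
        return "virtual environment without pyvenv.cfg (cannot tell)"
    if inherits_system_packages(cfg.read_text()):
        return "inherits base packages"
    return "isolated from base packages"


if __name__ == "__main__":
    print(describe_environment())
```

Running it with the environment's own interpreter prints one of the four summaries above. It only inspects configuration, so it won't catch packages leaking in through `PYTHONPATH`.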

We ended up using https://github.com/pyenv/pyenv-virtualenv, which builds self-contained Python versions + self-contained virtual envs on top of that. Seems similar to what conda provides.

Frankly, using Docker provides better isolation, without the risk of re-discovering the 1001 pitfalls of managing multiple Python versions on the same box. Additionally, it's the first step toward shipping the code to a kube cluster.