realpython / python-guide

Python best practices guidebook, written for humans.
https://docs.python-guide.org

'Conda' or 'pip' #776

Open baif666 opened 7 years ago

baif666 commented 7 years ago

Which one do you think is better for beginners? Also, you don't talk about Conda in this book, so does Conda have some problems? (P.S. This book is really useful.)

kennethreitz commented 7 years ago

pip is best

kennethreitz commented 7 years ago

conda is for scientific use, and is covered in this guide, i believe. perhaps it needs more mention for those users

baif666 commented 7 years ago

Thanks for your answer. I'll look for it again.

rsvp commented 7 years ago

Trying to manage an environment for machine learning with pip alone is difficult because of the delicate dependencies among the many necessary packages. Conda does an excellent job, especially where binary libraries are involved -- and the support for the IPython console and Jupyter notebook is superb.

For scientific replication purposes, consider using pip or conda within a Docker container to truly maintain an environment which can include your own particular data files. Here is an example for financial economics: https://hub.docker.com/r/rsvp/fecon235 where Anaconda and Jupyter are ready to run on any machine (Mac, Windows, Linux) instantly without going through dependency hell. Utilities like git are also pre-installed so you can experiment, then simply discard the container.

cc: #714

baif666 commented 7 years ago

yeah, many people recommend it to me

baif666 commented 7 years ago

but many developers use pip to install their projects instead of conda, so I use pip again.

arunavkonwar commented 7 years ago

pip is straightforward and is easy to understand. I have seen a lot of people get confused by conda. My vote is for pip. :)

miso-belica commented 7 years ago

Blog about Conda myths may be interesting reading for you :)

rsvp commented 7 years ago

@miso-belica: +1 on your previous link.

Roughly speaking, I think of pip as a configuration manager whose text source may not adequately capture the necessary dependencies. For example, the blog notes:

For scientific users, conda also allows things like linking builds to optimized linear algebra libraries, as Continuum does with its freely-provided MKL-enabled NumPy/SciPy. Conda can even distribute non-Python build requirements, such as gcc, which greatly streamlines the process of building other packages on top of the pre-compiled binaries it distributes. If you try to do this using pip's wheels, you better hope that your system has compilers and settings compatible with those used to originally build the wheel in question.

Having the correct binaries is crucial. One example is MKL, the Math Kernel Library produced by Intel.

The blog concludes:

If you want to install Python packages within an isolated environment, pip+virtualenv and conda+conda-env are mostly interchangeable.

But an important aspect of "environment" may be its full reproducibility (say for research or for debugging purposes). This may encompass specific versions of compilers and system utilities. This is the primary reason why Docker containers are relevant to our discussion -- they can serve to freeze a specific environment, while its components can still be updated by pip, conda, or even apt-get and git pull.
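
As a rough illustration of the reproducibility point, here is a minimal sketch that records which interpreter and which installed package versions make up the current environment, whichever tool created it. It uses only the standard library (`importlib.metadata`, Python 3.8+). The function name `snapshot_environment` is made up for this example, and the snapshot does not capture compilers, system utilities, or non-Python binaries, which is exactly the gap a container is meant to cover.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment():
    """Return a dict describing the interpreter and installed distributions.

    Hypothetical helper for illustration; it only sees Python-level
    packages, not system libraries or compilers.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "executable": sys.executable,
        # Sort case-insensitively so two snapshots are easy to diff.
        "packages": dict(sorted(packages.items(), key=lambda kv: kv[0].lower())),
    }


if __name__ == "__main__":
    print(json.dumps(snapshot_environment(), indent=2))
```

Saving this output next to a piece of research or a bug report gives something to compare against later, whether the environment was built with pip, conda, or inside a container.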

lyndsysimon commented 7 years ago

I've had good experiences so far with new users who have used conda over pip. My opinion is that new users should absolutely understand pip and be comfortable with it, but that conda is easier to get up and running as a first step.

As new users gain competency, it's extremely common for them to install multiple Python interpreters and end up both very confused and with an environment that's difficult to understand. On macOS this progression usually goes "system Python" > "brewed Python" > "virtualenvs based on the brewed Python". Conda seems to help avoid the mess by keeping everything in one directory during this stage of the learning process, letting them put off learning how to deal with the system path, multiple installed Python binaries, and broken virtualenvs until they need to deploy their application in a self-hosted environment.
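
For anyone already in the "which Python am I actually running?" stage, a small sketch like the following can help. It reports the running interpreter and makes a best-effort guess at whether it lives in a virtualenv or a conda environment. The function name `describe_interpreter` is invented for this example, and the checks are heuristics: `sys.prefix != sys.base_prefix` covers `venv` and recent virtualenv releases, and the `conda-meta` directory and `CONDA_DEFAULT_ENV` variable are the usual markers of a conda environment.

```python
import os
import sys


def describe_interpreter():
    """Summarize where the running Python lives (hypothetical helper)."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return {
        # Full path of the binary that is executing this script.
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "prefix": sys.prefix,
        # Inside a venv/virtualenv, prefix differs from base_prefix.
        "in_virtualenv": sys.prefix != base_prefix,
        # Conda-managed prefixes contain a conda-meta directory.
        "in_conda_env": os.path.isdir(os.path.join(sys.prefix, "conda-meta")),
        # Set by `conda activate`; None if no conda env is active.
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running it with each `python` found on the system path (system, brewed, virtualenv, conda) shows at a glance which installation each command resolves to.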