Closed hholst80 closed 2 years ago
This was proposed in https://github.com/pypa/pip/issues/6409, was implemented in https://github.com/pypa/pip/pull/9394 and has been discussed in https://github.com/pypa/pip/issues/10028. There's likely a lot more discussions, but I ain't spending more of my time digging those up.
Quoting myself from https://github.com/pypa/pip/issues/10028#issuecomment-885868343:
I don't think there is a way to make it possible for experienced users to not see this warning while also making sure that it serves the purpose of getting inexperienced users to understand that they should not do this in general.
Do read the linked comment above, before responding.
If you have a response that's different from "I don't like warnings." -> "Gimme an escape hatch." (which will become the top-voted Stackoverflow answer, likely without sufficient context to help inexperienced users) -- color me very interested.
Until something like PEP 668 is implemented and generally available to the point that I am comfortable with dropping this warning, I don't think it is a good idea to provide an escape hatch.
I do not want my users to think there is anything wrong with my system just because the pip tool spews out indiscriminate warning messages.
There is a risk though. If you run sudo pip install that modifies $package and your OS depends on $package, you've quite possibly broken your OS.
I'll note that I'm speaking for myself, and not the other pip maintainers.
FWIW, I agree. I think that the warning helps more people than it inconveniences. And anyway, the people who know enough to be sure that they are safe are also capable of suppressing the warning (pip install 2>&1 | grep -v "pip as the 'root' user", for example, although IMO anyone who couldn't have constructed that themselves probably doesn't fully understand the risks of running pip as root...).
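For readers who drive pip from a script, the grep pipeline above can be sketched in Python. This is only an illustration: the helper names are hypothetical, and the marker substring is copied from the grep example, so it may not match the wording in every pip release.

```python
import subprocess
import sys

# Substring copied from the grep example above. pip's wording has changed
# between releases, so treat this value as an assumption.
ROOT_WARNING_MARKER = "pip as the 'root' user"


def drop_root_warning(output: str, marker: str = ROOT_WARNING_MARKER) -> str:
    """Return `output` with every line that contains `marker` removed."""
    kept = [
        line
        for line in output.splitlines(keepends=True)
        if marker not in line
    ]
    return "".join(kept)


def run_pip_filtered(args: list[str]) -> int:
    """Run `python -m pip <args>` and echo its output minus the root warning."""
    proc = subprocess.run(
        [sys.executable, "-m", "pip", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
        text=True,
    )
    sys.stdout.write(drop_root_warning(proc.stdout))
    return proc.returncode
```

Like the grep version, this only hides the message; it does nothing about the underlying risk, and it silently stops working if the message text changes.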
Suggestion: If the log warning is made into a warning via the "warnings" module, I can solve this within the existing framework of that. Or, if that is not feasible, check for /.dockerenv and, if that exists, do not spew out a warning, because it is a hosted container environment and most likely the user is running a Python environment based on pip, knows what they are doing, or a combination of both.
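The /.dockerenv check suggested above could look roughly like the sketch below. The function name is hypothetical and this is not something pip does; the marker file is a Docker convention, so other container runtimes may not create it.

```python
from pathlib import Path


def looks_like_docker(marker: Path = Path("/.dockerenv")) -> bool:
    """Heuristic: report True when the Docker marker file exists.

    The default path is the one named in the suggestion above. Its absence
    does not prove that the process is running outside a container.
    """
    return marker.exists()
```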
It's done through the logging module, although, we might change how we output things in the future. :)
Yeah. Checking if you are in container environment would solve vast majority of problems of people who want this warning removed. I think "running root in container" is equally good reason to skip the warning as "running root in cygwin" or "running on windows". Happy to make PR if I know there is a consensus for that one.
Or maybe even just changing the message to say "If you are in a container, it's usually OK to run pip as root". That would be more "factual".
Please, can we not have this debate again? Others have already proposed "disable the warning if you're in a container" and we've responded to that (no, the warning is still valid there - I'm quoting others as my personal experience with containers is limited, so don't bother trying to engage me in debate over this). Repeating arguments that you could have found by searching the tracker for previous discussion on this topic isn't likely to change anyone's mind here...
Well. I do follow that discussion and I have not noticed that. Really sorry I should have checked more carefully.
But I have not seen anyone propose a better error description, one that might help people who wonder if the warning is valid for them or not. I think you should be empathetic towards people who have their own users, and have to continue explaining to them "Yeah the warning is there, but this is a container so this is right". How about explicitly adding an explanation that in a container it is likely ok? Still a warning, but with a more reasonable message actually reflecting the reality. What's wrong with that?
It is still possible to modify system-package-manager installed packages, using pip inside a container. That can still break things in weird ways.
PEP 668 will bring in the protections necessary, so if someone really wants to get rid of the warnings, it'd be more impactful to help that effort move forward. You can still try to convince us that the wording should be tweaked or the message should have additional conditionals, but don't be surprised if I'm responding on the PEP 668 discussion and not here. :)
As it stands, there's a risk to running pip as sudo regardless of whether you run it on your local terminal, in a container, or on a remote machine. Outside of mitigating that risk (part of which is done by PEP 668), all that we can do is warn users about it; and that's what this message is doing.
I think you should be empathetic towards people who have their own users
That is a valid point, and I'm sorry for not considering it. Do you supply a copy of pip with your application? If so, then you can wrap it to suppress the warning. If you don't, then I'm not sure how you can be so sure your users are using sudo pip safely? I guess if they are following instructions you provide on how to set up the container, you can know they are not doing anything unsafe, but then why not just add to your docs a note that pip issues a warning that doesn't apply for people who are following this particular set of instructions? After all, if they are not reading your docs to see that note, they probably aren't setting up their container the way you advise them to either!
I hope this helps.
That is a valid point, and I'm sorry for not considering it. Do you supply a copy of pip with your application? If so, then you can wrap it to suppress the warning. If you don't, then I'm not sure how you can be so sure your users are using sudo pip safely.
Just to explain my case.
Yep I am sure it is ok. This is because we have our own Dockerfile https://github.com/apache/airflow/blob/main/Dockerfile which is very versatile, and you can build your image using custom docker build . commands providing multiple arguments:
https://airflow.apache.org/docs/docker-stack/build.html#examples-of-image-customizing
For example, you can build a custom Airflow image like this:
docker build . \
--build-arg PYTHON_BASE_IMAGE="python:3.8-slim-buster" \
--build-arg AIRFLOW_VERSION="2.0.2" \
--build-arg ADDITIONAL_AIRFLOW_EXTRAS="mssql,hdfs" \
--build-arg ADDITIONAL_PYTHON_DEPS="oauth2client" \
--tag "my-pypi-extras-and-deps:0.0.1"
But currently, while doing it, you get quite a few root warnings. This is not a deal-breaker though. If you have PEP 668 underway and a clear way to solve it in the future, I am quite ok to wait and explain to the users that this is fine. But I find it difficult to accept "turning a blind eye" to such use cases.
If the warning is going to stay there forever and there is no way to solve it in the future, then I'd really appreciate a bit of empathy and understanding, and at the very least acknowledging and mentioning that there are cases that are valid, so that your users do not have to explain to their users "yeah ignore that - those guys are just over-protective and the warning really makes no sense, however I have no way to disable it in a reasonable way".
And surely I could "grep-out" that message in my image.
But the message has already changed several times, so I would never be sure that the grep continues working. We are continuously updating to the latest version of PIP as soon as it is released, but there are (already were!) cases where the PIP version resolver broke dependency resolution (hey, this is Airflow with 500 dependencies). So one of the options of the image customisation is also choosing the pip version: https://airflow.apache.org/docs/docker-stack/build-arg-ref.html#basic-arguments - just in case users have to use a previous version of PIP - and there, the message could be different.
Again - this is not a HUGE problem - it's an annoyance, and we can definitely live with it for a while, but I just wanted to explain that we are not just "moaning" - this is a real use case, real problem, real user annoyance, and the workarounds suggested (like grepping out the message) are a band-aid at most.
Until PEP 668 goes somewhere I don't think we have a choice. You think this is an annoyance, but distributions also hate people writing to their package store (yes, even in containers, because the distro doesn't really have knowledge it's in a container), and pip is stuck in the middle of that tension.
Sure. No worries. I understand the "pressures". As long as there is a long-term plan how to tackle it, this is perfectly fine to continue this route.
I hit the same warning today (also building a docker image for system use). I see two related problems here. First, it is RED and not yellow or any other less dominant color. Second, it is not a good practice for anyone to get used to errors. One day you'll miss some important error because your brain just ignores it.
For what it's worth, we want users to not run pip as root, not get used to the error.
You have to run it as root if your target is to install the same packages for all users in that particular image or server. Is there any (good) alternative for that? Let every user install the same packages and versions under their own venv and update all these venvs together every time? Or make one global venv for all users, which is not much different from installing as root in the first place.
To be clear, the error only appears when you run pip as root, directly on a Python installation. If the goal is to provision the installation across users, it'd be best to use a virtual environment instead. That is enough to suppress the message. And before we go there, yes, we do think it is still best practice to use virtual environments in a container.
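As a rough illustration of the condition described above (root, and not inside a virtual environment), a sketch could look like this. It is an approximation with hypothetical function names, not pip's actual check, which takes more into account (the platform, for instance).

```python
import os
import sys


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def running_as_root() -> bool:
    """True when the effective user ID is 0 (POSIX only)."""
    geteuid = getattr(os, "geteuid", None)  # not available on Windows
    return geteuid is not None and geteuid() == 0


def would_expect_root_warning() -> bool:
    """Approximation: root user and no virtual environment."""
    return running_as_root() and not in_virtual_environment()
```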
Do you mean a virtual environment like python -m venv .venv and source .venv/bin/activate?
An Alpine-based Python 3.10 Docker image is 45.4 MB. If you set up a virtual environment, you just duplicate 1/3 of the data (+15 MB).
What is the rationale behind that? I'm a very strong proponent of virtual environments and use them as much as possible, but I don't see the point here. It sounds like "if all you have is a hammer, everything looks like a nail" - like virtual environments, a hammer is also a very useful tool.
What is the rationale behind that?
Ultimately the rationale here is that we have had very strong representations from Linux distribution vendors saying that they do not want people using pip to install Python packages into the area owned by the system package manager. Pip is very much "piggy in the middle" here - we cannot win, as we have conflicting demands from two key parts of our user base.
PEP 668 is the long term solution here. In the short term, it seems pointless to change something in pip just so that the other half of our user base will be yelling at us 🙁
I actually sympathise with all 3 parties there:
 * Similarly to @Cougar, I do not want to use virtualenv for Docker image building. Not to mention the "alpine" image growth, it makes little sense and complicates things like copying an installed --user Python installation between segments of a multi-segmented image to make the image even smaller (this is what we do in Apache Airflow, for example: https://github.com/apache/airflow/blob/e5422f0233b993acfe7c881dfa72178e662f8e46/Dockerfile#L444). Unlike using the --user flag (or other flags to install stuff elsewhere), doing that with venv is brittle and not guaranteed to contain all required libraries.
 * I understand the problem of the distro people.
 * And I also understand the PIP people in the middle of that, and I perfectly understand PEP 668 is the right solution - we discussed it above and I am perfectly fine with it.
On the other hand, we do not know when PEP 668 is going to land - and I also agree with @Cougar that false positives which cannot be easily disabled are a bad thing.
However - looking at the discussion above, I think there is one thing that CAN be done that will satisfy everyone here. A sort of win-win-win situation.
@pfmoore - you mention that the rationale is that Linux distro vendors are saying they do not want people using PIP to install Python packages into the area owned by the system package manager. Similarly, @uranusjr mentions that introducing venv disables the warning. However, for the image-building case, introducing venv is NOT a good solution. On the other hand, using the --user flag or (similarly) using --target is a much better and more straightforward solution for Docker image building.
Also, coincidentally, it happens that both the --user flag (and the --target flag, if the target is not using the system directories) also should not make the distro people angry, because they do not touch the files the distro people are worried about.
So what I really think is a GREAT solution for everyone is that this warning (in RED) is NOT printed if the --user flag is used or the --target flag does not point to any of the distro "sensitive" directories. I think that using "DO NOT USE ROOT" as a message in this context is simply wrong. The message should be "DO NOT OVERRIDE SYSTEM PACKAGES". And both the --user and --target flags should be treated as "perfectly OK" when run as root.
Is there any drawback to this proposal? Maybe I have not thought about something, but it seems we have a very easy solution that satisfies everyone in the discussion, and we do not have to wait until PEP 668 materializes.
I thought the requirement was "to install the same packages for all users" (see above). --user won't do that. Also I don't know what the root user's home directory is, so I can't say it's OK. --target sucks, because it doesn't support upgrading, and it has a load of weird edge cases. It's not designed for this situation, and we'd probably get the problems reported as bugs if we started recommending it.
And in any case, without PEP 668, we don't know what are "distro-sensitive" locations, so how would we confirm that?
The drawback is the same as always - it doesn't fix the issue, it just changes the group of people who complain at us.
I thought the requirement was "to install the same packages for all users" (see above). --user won't do that.
Well. Actually this is precisely what the --user flag allows when it comes to container images (and we have been successfully doing that for more than one and a half years in Apache Airflow). The --user case is simply very close to the venv (recommended by PIP maintainers), but better.
It creates a separate, isolated environment where we have not only all packages installed but also all the '.so' and other dependencies installed in one 'folder' that is easy to copy and use. And we can easily make it "local" for any user running the image, which effectively allows "to install the same packages for all users" (https://github.com/pypa/pip/issues/10556#issuecomment-945960306). This is precisely what we do in the Airflow image - and it's not our isolated case - we are simply following OpenShift recommendations for images (https://docs.openshift.com/container-platform/3.11/creating_images/guidelines.html - look for "Support Arbitrary User IDs"). Our image sets the same "HOME" directory for EVERY user. This means that the ".local" directory is THE SAME for every user. And it means that literally "we install the same packages for all users".
--target sucks, because it doesn't support upgrading,
I quite agree with that - that's why we use --user for that (and are super-happy with how it works). I am perfectly OK to drop --target from my solution, leaving only the --user flag (i.e. do NOT print the warning when the pip --user flag is used. Full stop).
I believe (please correct me if I am wrong @pfmoore) that it will not "write to system packages" - so the properties of that solution are :
 * --user flag excludes writing to system packages
 * --user as an exclusion should be ok as well
 * --user flag to install airflow in non-system place, at the same time making it available for re-use for all users, following the best guidelines out there
Did I miss something @pfmoore?
Running --user on most systems will install into ~/.local, which for the root user is /root/.local/. I'm pretty sure that's not what folks want when they say "shared across users". They want to put it in a global environment.
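To see where a --user install would land for the current user, the standard library's site module reports the per-user base and site-packages directories. A minimal sketch (the wrapper function name is hypothetical; the site calls are standard library):

```python
import site


def user_install_locations() -> dict[str, str]:
    """Return the per-user base directory and its site-packages directory."""
    return {
        # Typically ~/.local on Linux, so /root/.local for the root user.
        "user_base": site.getuserbase(),
        # The site-packages directory underneath that base.
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for name, path in user_install_locations().items():
        print(f"{name}: {path}")
```

Both values derive from the user's home directory unless PYTHONUSERBASE overrides them, which is why a home directory shared between users also shares the location.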
--user is NOT a virtual environment, and doesn't have one of the more important properties of a python -m venv .venv style environment -- isolation from the system. If you try to pip uninstall six from a virtual environment, pip won't try to uninstall six from the global environment (it'll say "Not uninstalling six at {path}, outside environment {venv_path}"). Same for pip install --upgrade --user.
I think simply that using "DO NOT USE ROOT" as a message in this context is simply wrong. The message should be "DO NOT OVERRIDE SYSTEM PACKAGES".
The exact warning is:
"Running pip as the 'root' user can result in broken permissions and "
"conflicting behaviour with the system package manager. "
"It is recommended to use a virtual environment instead: "
"https://pip.pypa.io/warnings/venv"
It's neither all-caps, nor a blanket message to do anything. It communicates that there's a risk, and recommends a way to mitigate that risk.
The problem at hand is things like sudo pip usage, as well as any usage that could modify system packages and interfere with the OS packages. This is at odds with users in Docker being root-by-default and some users not wanting to do anything to avoid modifying the system packages. Or with users who want to install into a global environment that's shared across users getting this warning. Using USER with Docker is explicitly listed as a best practice for Docker environments.
Note that Docker/container environments are NOT the only use case here. There's a significant portion of users on other Linux-based platforms who face the same issue, where messing up the system packages with pip can mean that they're unable to use their PC after a reboot. This message is currently nudging both user personas towards a best practice that can help reduce risks.
Apologies for the "all caps". It was more to emphasise the meaning the message carries (making root the "root of all evil", with virtualenv as the only possible and recommended way of handling it) - but I understand it could be read as me shouting. Lesson taken. I will avoid all caps.
Seeing how strongly the PIP maintainers oppose any proposal to improve their message and possibly even educate their users "better", I have kind of lost hope that it has any chance of improvement. It's a bit sad on one hand, and I would have understood it if it was a huge investment and a big diversion from current policies and work, but I am not sure this is the case.
Therefore I treat this more as an educational discussion - where I (and others looking at it) might learn what are the deep "root" reasons for the message (I think it is not really clear from the message that it has been driven by distros) and how it relates to the in-docker experience (which I think most of the proposals to improve the message come from). Also I see that as a chance for PIP maintainers to learn some way their software is used in legitimate (and useful) ways.
I always try to listen to my users at Apache Airflow, and even if I see that they are using it in a different way than I originally anticipated, I try to be open and at least make life easier for our users if it costs us very little. Improving error messages, making them clearer, and responding to the needs of our users who have some legitimate doubts is something I have been doing for many months now (which resulted in many improvements to our docs and the messages we print). But there are of course cases where I hear, listen, and acknowledge that there are some good reasons why our users want something that they will not get, so it's ok for me if it remains how it is for now (though I still think the educational part of the discussion is not exhausted yet, so I will add some more context and explanations). Maybe eventually it will lead to at least a better understanding of the problem at hand by all parties (and possibly it can even lead to a better PEP 668 implementation - who knows if PEP 668 will be equally good for the kind of containerised environments that are now prevalent in K8S-driven deployments).
Running --user on most systems will install into ~/.local, which for the root user is /root/.local/. I'm pretty sure that's not what folks want when they say "shared across users". They want to put it in a global environment.
1) Running --user on its own does not make the software available to many users. This is true. However, with containers and Kubernetes (and especially in the case of OpenShift, which pioneered that approach) there is a case where a single home directory can be shared by many users. This is where https://docs.openshift.com/container-platform/3.11/creating_images/guidelines.html (look for "Support Arbitrary User IDs") comes into play. In a K8S environment, one of the best patterns is to allow arbitrary users (belonging to the 0 group) to run inside the container. This has multiple advantages and it is far superior to the plain USER directive in a Dockerfile (although it can happily co-exist with the USER directive). The USER directive is from pre-K8S times and it is very limiting, because by default in a "linux" environment it requires the user to be available on both host and container if you want to share data between the host and containers. This kind of breaks isolation between the two, and the approach promoted by OpenShift fixes that. After working for many years with Docker/containers/K8S, I found the OpenShift approach both simple and powerful.
Using USER with Docker is explicitly listed as a best practice for Docker environments.
2) True, the USER directive is recommended for running the container. And to be honest - this is precisely what we use in Airflow: https://github.com/apache/airflow/blob/main/Dockerfile#L461
However, this is a pretty old recommendation that has already been (partially) invalidated in a specific case - namely multi-stage builds (Docker recommendation here: https://docs.docker.com/develop/develop-images/multistage-build/), which were implemented well after the USER directive was introduced. Multi-stage builds are the practice we use at Apache Airflow as well, in order to significantly decrease the size of the image. We simply install all our PIP dependencies and libraries (using the --user flag) to a "/root/.local" directory in the "build" stage and then copy the entire directory (with all the resulting libraries and Python packages) to the "final" stage. This allows us to save at least 25% of the size of the final image (we do not need build-essentials and a lot of libraries in the "final" stage). It follows all the best practices of Docker image building. Those practices are also such that no "USER" directive is needed in the "build" stage. It's much better if everything is run as the root user here. There is no need to use sudo, you are installing everything as the root user, and you do not need to add extra steps to create a separate user, simply because this stage is only used to build the artifacts that will be copied to the final stage. No danger involved, very simple and straightforward if we use the "root" user for that.
The exact warning is:
"Running pip as the 'root' user can result in broken permissions and " "conflicting behaviour with the system package manager. " "It is recommended to use a virtual environment instead: " "https://pip.pypa.io/warnings/venv"
I see a serious problem with that message. It's misleading and confusing.
It informs the users that "root" is the "root of all evil". On the other hand, the remediation does not even mention "run PIP as a different user". Instead it mentions "use a virtual environment", which kind of contradicts the problem statement. Is "using root" the problem? Or "not using virtualenv"? Or both? It's also not clear from the message that this breaks the policies of various distros.
If we agree (I have not seen any argument against it so far) that --user handles the problem with "broken permissions and conflicting behaviour", how about a slightly improved, more precise and more "factual" message:
"Running pip as the 'root' user can result in broken permissions and "
"conflicting behaviour with the system package manager which is "
"against the policies of many distributions of Linux. "
"There are several ways it can be handled:"
" * use a different user to run PIP than root"
" * use virtual environment https://pip.pypa.io/warnings/venv"
" * use `--user` flag"
I think that - or similar - kind of message would be much more precise, propose several different solutions to the problem and explain also a bit more context on why the message is there in the first place.
While I appreciate the extensive reply, you're addressing the wrong audience IMO. You need to persuade the people (most notably the Linux distro maintainers) who support the current behaviour that it should be modified as you suggest, not the pip maintainers (who are looking for consensus, not a competing proposal). Those people were involved in developing PEP 668, so they understand the background well.
I was under the impression (please correct me if I am wrong) that the distro people have not decided on the exact message and the conditions under which it is printed.
I believe the ask was (I would really like to understand that) 'warn when there is a risk of modifying system locations' and not 'print a message when you use root'.
Is that message something explicitly requested by the distro people? Or is it something that the PIP maintainers decided on (i.e. the condition and the warning content)?
What your answer suggests is that the PIP maintainers have no power to control their messages to their users and no power to correct them if they are misleading, which I find pretty confusing and hard to believe.
But if that's the case and we need permission or opinion from the distro people - whom can we mark here so that they can have a say here (if the opinion of PIP maintainers is not enough)? I am happy to drag them into this discussion. I personally am in favour of bringing in other voices to the discussion - especially if it seems that those 'others' have a final say here.
Sigh. This is the last time I comment on this issue. That's not what I said or meant. I said we're implementing something that one part of our user base supports. Another part of the user base (so far, two people on this thread) have said they disagree. Unless the people who support the current wording weigh in to say they support a new wording, we're not going to change and risk just annoying a different part of our user base.
We're not experts in this matter (I'm definitely not, as my main platform is Windows) so we rely on the expertise of others. When people claiming to be experts disagree, what should we do? I say, stick with what we have (which includes a longer-term "proper fix"). We don't need more churn for our users.
But as I say, I'm done with this. It no longer feels like a discussion, and I'm either not making my point or I'm being deliberately misinterpreted, and in either case I'm not adding any value here.
I can't speak for others but I was a bit confused by your response @pfmoore and had to re-read the message a couple of times and go back to the original issue to understand things, I definitely had some misunderstandings about the motives of the Pip maintainers for putting this message there and that's what caused the confusion.
My interpretation now is as follows:
The message that pip is giving comes from the guidance of Linux distro maintainers. To change it the pip maintainers would want consensus from that group that the new message works for them, as this is outside the expertise and knowledge of pip to dictate.
Looking at @potiuk 's messages I think the proposal is to add additional information on what might be possible solutions if you run under root.
I would note that in the original issue https://github.com/pypa/pip/issues/6409 the --user flag (one of the possibilities that is being proposed for inclusion in the error message) is indeed mentioned as a possible suggestion. So it may not actually be too hard to achieve consensus on some additional suggestions to this message, but I do not know how one would go about contacting and getting consensus from Linux distro maintainers.
I also think @pfmoore you might have taken it a bit too personally. My intention was not to twist your words, but to make sure I understand them - so I paraphrased them in my own words and understanding. This is a very typical approach in any kind of discussion - where you want to make sure you understand the intention and interpretation of the other party.
My intention is simply to make sure I understand how to proceed without further confusing my users with a misleading message (because I think the current message is misleading). Even if we do not agree on adding '--user' as an explicit exception, I think the current message stating that 'using root is wrong' and 'using venv is the solution to that' is confusing like crazy, because the cause has (without going into details and reading the whole PEP) nothing to do with the solution.
I think we either should explicitly state all solutions or clearly state that the problem is not using venv in the first place. Confusing 'using root' and 'using venv' is i think the root cause of the whole discussion we are having here.
I just re-read all PEP 668 and the discussion that led to it. And I understand all context better. I understand (again - this is my paraphrasing and understanding of it) that the intention of PEP writers was to 'gently' guide users into using venv while acknowledging that there are cases where it is actually not always the 'best' solution. There are some exceptions explicitly stated in PEP - like Sphinx extensions that should be system-wide for example and special treatment of containers.
But by re-reading that I also realized that there is one problem with PEP 668 that actually undermines a basic assumption in the discussion above - namely that PEP 668 will solve the Docker container issue problem when implemented by distros. Which I believe is a false assumption. From the PEP:
Distros that produce official images for single-application containers (e.g., Docker container images) should remove the EXTERNALLY-MANAGED file, preferably in a way that makes it not come back if a user of that image installs package updates inside their image (think RUN apt-get dist-upgrade). On dpkg-based systems, using dpkg-divert --local to persistently rename the file would work. On other systems, there may need to be some configuration flag available to a post-install script to re-remove the EXTERNALLY-MANAGED file.
How I interpret that: Even if PEP 668 is implemented by both pip and distros (Debian in our case), the distro which is the base for the container build will not have the EXTERNALLY-MANAGED marker. Which means that in all cases where I am building an image based on a base distro image, I will still get the warning when running as root and not using venv. This is how I read that paragraph (also because it is explicitly stated in the same PEP that if the EXTERNALLY-MANAGED marker is missing, pip will behave as before the PEP implementation).
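For anyone who wants to check whether a given base image carries the marker, here is a minimal sketch. It assumes the location PEP 668 describes (a file named EXTERNALLY-MANAGED in the interpreter's stdlib directory as reported by sysconfig); the function names are hypothetical and this is not pip's implementation.

```python
import sysconfig
from pathlib import Path


def externally_managed_marker() -> Path:
    """Path where PEP 668 places the marker for this interpreter."""
    # Uses the default install scheme's "stdlib" directory; that this matches
    # what an installer checks is an assumption based on the PEP text.
    return Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"


def is_externally_managed() -> bool:
    """True when the marker file is present for this interpreter."""
    return externally_managed_marker().is_file()


if __name__ == "__main__":
    print(externally_managed_marker(), is_externally_managed())
```

Per the PEP, installers only consult the marker when running outside a virtual environment, so running this inside a venv says nothing about how pip would behave there.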
Few questions:
Is that correct interpretation?
What will then be suggested approach to solve it long term (i.e. How to use PIP without venv in a docker container so that no warning is generated)?
Does it mean that Pip forces us to use venv if we want to get rid of the warning?
Or maybe using pip --user with a non-root user (and no venv) will also work?
Should the warning suggest both solutions as 'correct' if that's the case ?
@pfmoore - please don't get me wrong. I am not trying to stir the waters, and I am not trying to undermine anyone's position or 'twist someone's words'. I really want to understand what the limitations are, make sure that my users get error messages that are factual and consistent (and actionable), and have a long-term solution which I can apply - understanding the reasoning and context. I think PEP 668 is very well written; it takes a lot of things into consideration, including the '--user' flag and containers. But - as usual - there might be other findings discovered after the PEP was written, and some particular cases that need to be treated differently (or even some things that are ultimately not specified in the PEP).
While I understand your gravitation towards venv, I think even the PEP itself is all about guiding people in its direction, not forcing it. And while I perfectly understand why a warning is fine for the guiding, even the PEP itself acknowledges there are exceptions. So having a solution for such exceptions, especially since it seems they won't be solved by the PEP, is I think a rather reasonable ask.
All I am asking for is to be treated seriously.
I am looking for a rational discussion on how to solve a real problem. I think I tried to be nice and polite, and tried to put forth some context and facts that might not have been known or realized before. I am trying to get to the point where different parties are heard, and possibly even their opinions and problems are considered and maybe even addressed if that is reasonable and easy to do. I put forward some proposals which I think are worth commenting on, rather than abruptly stopping the discussion for - apparently - no reason.
That's all I am asking for.
As a general note around GitHub etiquette, if someone explicitly states that they're stepping away from a discussion (which implies that they're unsubscribing), I think it is a bit rude to @-mention them in the same thread (which both subscribes them and triggers a notification) in a somewhat immediate follow-up.
You wouldn't drag a person back into an in-person discussion, if they say they need to step away for a bit -- at least, I hope so. Don't do that in a digital space either. :)
Sorry - my bad, won't happen again. I also apologise for a few mistakes and typos - I wrote it on my phone while on holidays (I corrected them as I am back at my PC).
I think my proposal is not "against" the spirit of PEP 668 - I believe it improves the message to be consistent and (possibly, if others agree this is in line with PEP 668) adds a non-root --user option that can be used inside containers without raising a warning (and without using venv unnecessarily).
Following the advice, however, I would really want to hear what other creators of PEP 668 think about it, and I would love to get an answer to the question of whether PEP 668 actually solves the problem of having a warning in a container. @pradyunsg - I see you are one of the creators, and I also see @uranusjr (I do not want to call on others who did not take part in the discussion unnecessarily - I hope they see it and will respond if they have an opinion).
Just to reiterate my understanding of the current state of the discussion:
I think the error message is misleading - it mentions "running as root" while the solution is "using venv", which (without reading the whole PEP) makes little sense, as the solution seemingly has nothing to do with the problem. My proposal is to improve the message to be consistent and to mention other ways of getting rid of the warning (depending on which ones are valid): for example, if running pip install with --user as non-root (especially in a container) removes the warning, it seems like a good idea to mention it, not only the venv one.
It seems that (I would love to hear comments), contrary to earlier assumptions in the thread, implementing PEP 668 is not a solution for in-container builds. The recommendation for distros is to remove the EXTERNALLY-MANAGED marker in base container images, which (as I understand it) will keep the warning (unless I use venv). In some cases using venv might lead to a huge increase in the size of the image (+30% in the case of an alpine image - see https://github.com/pypa/pip/issues/10556#issuecomment-945973598), so possibly venv should not be the only solution to get rid of the warning - even if it is the preferred one.
I do not think I am violating the spirit of PEP 668 by proposing to improve the error message and to remove the warning when pip --user is used by a non-root user. I do not see those as "against" each other, but rather as "fulfilling the spirit" in the case of container builds.
Could others comment on that - do I understand it correctly? Is there another "jumping to conclusion" which I did without being involved in earlier discussions?
- so possibly venv should not be the only solution to get rid of the warning - even if it is the preferred one.
Watch out for bias. It is definitely not the preferred solution in a container environment, and I would go so far as to say it is a direct anti-pattern to do so in a container environment.
I personally quite agree. After reading PEP 668 and the linked discussion, and getting comments here, I believe the problem is that the 'consensus that venv is the solution' did not take image building into account (not even running Python in a container, but image building, which is quite a different thing and a very important general use case for PIP).
I would really appreciate it if the PIP maintainers here gave us a hint on how to solve the conundrum (as I understand it now) of a false warning when PEP 668 explicitly recommends removing the marker - which results in the warning, forcing people who prepare images to use the anti-pattern you mentioned. Or maybe explain if I'm wrong and PEP 668 will actually give us the opportunity to remove the warning without having to use venv.
This has been mentioned before and it was explained that we have to wait for PEP 668 (and I was quite OK with that), but it seems that now we have no clear solution for image building even after PEP 668 is implemented.
I'd love to hear a comment here (maybe I am just wrong in my assessment, which I would also love to know about).
I would go so far to say not using a virtual environment in a Docker container is an anti-pattern. The difference between a Python in and out of a virtual environment during image building is also minimal; the only difference is the command you use to invoke Python (and executables installed under that Python). PEP 668 does not get into this too much because this is a topic people have strong opinions on, and discussing the topic presents an unnecessary risk of derailing the PEP discussion, but if you read between the lines, installing things to the system Python (in a container or when building an image) is more like a thing tolerated by the mechanism since way too many people are doing it, rather than something the PEP authors (well, at least I) think is too brilliant for the PEP to break.
If that's the position of the PIP maintainers that venv is the ONLY way, and forcing venv is mandatory, then i think we have no other way than to follow it, because seems that decision is already set in stone and no matter how many arguments we throw they are not strong enough to change the mind of PIP maintainers.
But I think if such decision is made by PIP maintainers this should be very clearly and straightforwardly stated in the error message.
I think the message should be in this case 'you are not using venv but this is wrong. Please use venv: link to the docs'. I understand (again, I would like to make sure that I understand it right) that it has nothing to do with using or not using the root user? Or am I wrong and the only proper way is both 'not using root' and 'using venv'? And all the other ways of using PIP are simply 'wrong'?
Could you please clarify whether I understand it correctly?
Also, I think PIP maintainers should understand the consequence of that - I understand then that venv is the only way I can get rid of the warning. This also means that people using alpine will have to pay the price of having a much bigger image, because they have no choice. In the case of Debian/Ubuntu it is far less of a problem (and probably negligible). Also, I know for a fact (there are many articles about it and I had the same problems) that the alpine image is quite bad for any serious Python installation (because of its musl library and some remaining incompatibilities between libc and musl). So I would understand if this is a deliberate decision: 'yes, we know that venv will increase alpine's size significantly, but since its support for Python is problematic anyway, this is a conscious choice and we realize the consequence when making this decision'. I think it would be great to make it clear in the PEP or in an accompanying discussion (maybe we can simply link this discussion to the PEP as a follow-up discussion, so that others can find it and understand that this decision was conscious and taken deliberately and was not a mistake).
However I need your help as well to understand how to do what I did with --user (which I understand is also 'wrong' way of doing it). I would be happy to follow the venv route in Airflow image but I am not sure if I have the same guarantees.
When I use '--user' flag, i am 100% sure (this is clearly described in the docs) that all the dependencies (including the .so runtime libraries, any local files etc) are placed in '.local' directory and copying it to another image/user is going to work. I do it for more than two years now, and yeah - it works as expected. I am happy to change it to use venv but now I need some answers:
Can I do the same - copy the .venv dir to another image - and will it work (assuming that I make sure to activate the venv in the .rc file for all users)? Will it work for all the shared .so libraries that are built along the way for packages that need compilation, and will I only get the runtime version of those libraries (I do not want the dev libraries as they unnecessarily - and a lot - increase the size of the image)?
Can I use the 'root' user to create the venv or should I create a new non-root user for it (I want to make sure that when I change it now, it will continue to work also after PEP 668 is implemented)?
In the case of OpenShift-compatible images we need arbitrary users to run in the image - all of them should use the same venv. I will create the venv with the right umask so the group has write access for all users, but my question is: is it enough to use the same home dir for all the users? Will the .venv folder and running 'activate' in .rc files work for an arbitrary user? Any side effects?
The most important problem (I currently have no good solution for it): how can I make sure that the venv is activated when my user 'extends the image'? In Docker, every RUN command is run in a separate shell. So even if you run 'activate' in one RUN, the next RUN in the Dockerfile does not have the environment variables that are usually set when you run 'activate' (basically the source command does not work in a Docker RUN command the way you are used to in terminal sessions). Usually the users extend the image in this way (and this is what we have as examples in the Airflow documentation: https://airflow.apache.org/docs/docker-stack/build.html#adding-a-new-pypi-package):
Dockerfile:
FROM apache/airflow:2.2.0.dev0
RUN pip install --no-cache-dir lxml
Our users will simply do that because this is how everyone is used to install packages. In this case the RUN command will not have the venv activated and i see no easy, future-safe way of doing it.
We cannot just instruct people to prepend activation for every RUN pip command they run - experience shows that a big part of the users will not read the documentation and will use the most obvious way. This is also the reason why we have the PIP_USER variable set, so that the 'plain' PIP command will automatically install packages with the '--user' flag. Before that we had plenty of issues where people did not add the '--user' flag which we had in the docs and ended up with broken installations where they had the same packages installed in multiple locations.
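As a small illustration of the '--user' setup described above, the following sketch reports where a user-scheme install would land for the current interpreter and whether PIP_USER is set in the environment. The function name is hypothetical; `site.getuserbase()` and `site.getusersitepackages()` are standard library calls, and the exact paths depend on the platform and on PYTHONUSERBASE.

```python
import os
import site


def user_install_locations() -> dict:
    """Report the user-scheme locations for this interpreter."""
    return {
        # Base directory of the user scheme (commonly ~/.local on Linux).
        "user_base": site.getuserbase(),
        # The site-packages directory underneath that base.
        "user_site": site.getusersitepackages(),
        # None unless the image (or the shell) sets the variable.
        "PIP_USER": os.environ.get("PIP_USER"),
    }


if __name__ == "__main__":
    for key, value in user_install_locations().items():
        print(f"{key}: {value}")
```

Everything under the reported base directory is what gets copied between images in the approach described above.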
I would love some guidance on those questions before i follow the venv route.
And just to add - the above comment is not 'ill meant' or angry.
If passing the message between the lines is the way the PEP was deliberately written, because the PEP writers wanted to avoid those kinds of discussions and giving all the answers upfront (and to move forward with the PEP) - I can only sympathise with that. However, you will not run away from those discussions as long as there are some good solutions for people and projects like ours - where we have to face some real consequences of those decisions.
I am happy to be the guinea pig to help you to move it one step further and make a showcase that yeah - venv approach is also applicable to Docker image building case. But I need help and answers - i am capable and experienced enough to discuss it and implement it and solve (together with you) any problems - but I simply need help and clear understanding that I know what I am doing and some kind of guarantee that it will not break in the near future.
Then we could even progress to the next stage ( and maybe even next PEP) where we - together will not have to 'write the message between the lines' and 'implicitly' but maybe write it 'explicitly' following the Python zen. I am all for promoting venv as the only way as long as we know we can handle all cases this way.
Alright, grokking through this discussion now. Let's see what happened here, as I was actively avoiding engaging with this during my fairly heavy work week.
So...
Or maybe even just changing the message to say "If you are in a container, it's usually OK to run pip as root". That would be more "factual".
Lead to...
If that's the position of the PIP maintainers that venv is the ONLY way, and forcing venv is mandatory, then i think we have no other way than to follow it
I disagree with the first statement, and also the second. I think the latter is a gross exaggeration and taking a carefully measured position too far in one direction or the other.
Here's my personal thoughts on this thing, presented in the form of quotes from earlier in this thread, because I don't wanna rewrite these things:
There's significant portion of users on other Linux-based platforms, who face the same issue, where messing up the system packages with pip can mean that they're unable to use their PC after a reboot.
It is still possible to modify system-package-manager installed packages, using pip inside a container. That can still break things in weird ways.
As it stands, there's a risk to running pip as sudo regardless of whether you run it on your local terminal, in a container, or on a remote machine. Outside of mitigating that risk (part of which is done by PEP 668), all that we can do is warn users about it; and that's what this message is doing.
I don't think there is a way to make it possible for experienced users to not see this warning while also making sure that it serves the purpose of getting inexperienced users to understand that they should not do this in general.
Aside: I already frown a bit every time someone uses "Pip" rather than "pip" or pip, so... I definitely don't like "PIP". 😅
PEP 668 does not get into this too much because this is a topic people have strong opinions on, and discussing the topic presents an unnecessary risk of derailing the PEP discussion
Wait, I'm actively pushing to make explicit recommendations to include the PEP's protections in containers in the PEP. They won't be requirements, but they'll be a normative recommendation. See the discuss.python.org thread for what exactly I've said.
Or... are you talking about do-not-run-as-root recommendations in the PEP? If so, yea, that's unrelated to the PEP entirely. I don't think there's anything between the lines about that.
We cannot just instruct people to prepend activation for every RUN pip command they run
You don't have to. venv/bin/pip does the right things.
Beyond that, I'm finding it difficult to follow what @potiuk said in his recent posts, which seem more emotionally charged than earlier ones; so... I'm going to not respond to them. I would like to note that nothing super disruptive is changing anytime soon, so it's probably fine to come back to this discussion in like, a week or two / a month from now -- let's give people a bit of time to relax and tone down the conversation here.
I think (hopefully) it is not too emotional any more - I am really in the state of figuring out how I can apply the recommendation of always using venv. I am still not OK with the error message, but I would like to focus now on how I can actually put 'always use venv' into practice. I'd love to focus on the technicalities without diving into emotions.
Side comment - I use the correct spelling for pip now. Honestly, thank you for expressing your emotions connected with the different spelling. Only because you expressed them did I have a chance to understand that and empathize with it. I would not have known otherwise.
We are all humans, not robots - we do have emotions and I think it's OK to tell others how we feel, so being emotional (in terms of expressing your emotions) is often much more important than hiding it, because hiding might lead to misunderstanding. That's why I also do not shy away from expressing my emotions usually.
make explicit recommendations to include the PEP's protections in containers in the PEP. They won't be requirements, but they'll be a normative recommendation. See the discuss.python.org thread for what exactly I've said.
Do I understand correctly that PEP 668 is going to change when it comes to containers? Can you please copy the right link? The link you copied is just a generic link to the 'discuss' site.
Or... are you talking about do-not-run-as-root recommendations in the PEP? If so, yea, that's unrelated to the PEP entirely. I don't think there's anything between the lines about that.
Yes, I would also love to understand that. It seems that 'root' and venv are different things, but the error message mixes the two, and it was mentioned earlier in the thread that PEP 668 is the solution that will solve it. I would really love to get rid of the error. Right now I really need some guidance as I am a bit lost - should I continue using root? Or should I use venv? Or both? Can I copy .venv between users with the same guarantees for shared libraries that '--user' gives, as described above? How do I activate a venv in Docker? I think I need those answers to be able to act on them (they are nicely bullet-pointed above and I really need your help to understand the answers).
You don't have to. venv/bin/pip does the right things.
Ok. Let me rephrase what I understand from that sentence.
Does it mean that just prepending the right .venv/bin to the PATH, to make sure that the venv bin is first on the PATH, is the right solution for Docker? I have done that in the past: I simply used 'python' from the venv bin when I could not 'activate' the venv. This is what seems to be suggested by https://docs.python.org/3/library/venv.html#creating-virtual-environments, which mentions that you can run the venv Python interpreter separately, but still the only 'official' way of activating the venv is to source the 'activate' script, so it's not entirely certain that we can count on this for 'pip' commands (it does seem reasonable to assume that, but I just want to make sure I am not missing some edge cases when I make the decision to move the whole Airflow image to use the venv).
Now from what I understand you tell me that just by having 'pip' from the 'venv/bin' path (so if i put the .venv first on the PATH) it will work and behave in exactly the same way as if i activated the environment? Can you confirm that please ? Do i understand it correctly? Is this behaviour implicit or documented somewhere?
Beyond that, I'm finding it difficult to follow what @potiuk said in his recent posts, which seem more emotionally charged than earlier ones; so... I'm going to not respond to them. I would like to note that nothing super disruptive is changing anytime soon, so it's probably fine to come back to this discussion in like, a week or two / a month from now -- let's give people a bit of time to relax and tone down the conversation here.
I am really into 'how can I do it now' - no more emotions or need to relax (I am actually writing this from holidays because I know it will take some time to clarify some of my questions, and I just want to get good technical answers before I come back so that I can work on it as soon as I am back - hence some spellings and shortcuts, as I am writing from mobile).
I would really appreciate if someone from the PIP team looks at the - plain technical - questions I asked and guides me there.
J.
Purely technical answers here. I'm not going to discuss further as I think a "cooling off" period is good. But I will respond with (what I perceive as) facts, in the hope that they will be useful to you.
Do i understand correctly that the PEP 668 is going to change when it comes to containers
PEP 668 has not yet been accepted. It's not even been submitted for approval. The Discourse link is where you should go if you want to discuss what it says prior to approval. Once it's approved, it won't change further without a follow-up PEP.
i really need some guidance as i am a bit lost
That's advice, not technical questions, so I won't respond on that.
Does it mean that just prepending the right .venv/bin to the PATH to make sure that Venv bin is first on the PATH is the right solution for docker?
That's a choice you can make, there's no technical, yes/no answer.
Now from what I understand you tell me that just by having 'pip' from the 'venv/bin' path (so if i put the .venv first on the PATH) it will work and behave in exactly the same way as if i activated the environment?
The executables in the virtual environment's "bin" directory do not need the environment to be activated to work. You can run them directly using their absolute path, or add the bin directory to $PATH (that's all activating does anyway).
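A minimal sketch of the first option (running by absolute path): it creates a throwaway virtual environment with the standard `venv` module and asks the environment's own interpreter for `sys.prefix`, without any activation step. The function name is hypothetical, and pip is left out of the environment (`with_pip=False`) only to keep the sketch quick and offline.

```python
import os
import subprocess
import sys
import tempfile
import venv


def venv_prefix_without_activation(env_dir: str) -> str:
    """Create a venv and return the sys.prefix its interpreter reports."""
    # Mirror the command-line default: symlinks on POSIX, copies on Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    python = os.path.join(env_dir, bin_dir, exe)
    # Call the interpreter directly by its absolute path; nothing is activated.
    result = subprocess.run(
        [python, "-c", "import sys; print(sys.prefix)"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(venv_prefix_without_activation(os.path.join(tmp, "venv")))
```

The reported prefix is the environment directory, which is why executables in that directory install into and import from the environment regardless of the calling shell's state.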
Is this behaviour implicit or documented somewhere?
It's standard virtual environment behaviour, I've no idea if it's documented anywhere TBH, but it's how they have always worked, and isn't going to change.
I hope this is useful.
Also, I started searching for and reading some of the previous issues - mixing venv with '--user' and related issues. A fascinating read: lots of information and some issues closed in favour of others. I understand it's a complex subject, especially when mixing with --system and others.
And I sympathise with pip maintainers - I honestly understand why you are promoting venv, given the ways you can mix things and get confusing behaviours. I really would love to help in this direction - to make venv THE way. I am even happy to build some kind of recommendations for people who build images on how they can do it easily and handle even complex cases like Airflow, to promote it better, with it at some point becoming really the only supported way.
If you could help with getting the answers I need now, I think I can simply be helpful with that. Unlike you, I do not have the full context in my head, nor the knowledge from all the past (and numerous) discussions you had. Please do understand my ignorance (for now - I am building my knowledge), but I am now convinced venv is the only way and happy to help with that.
Thanks @pfmoore - we are definitely moving forward so thanks for taking time to answer some of my questions!
PEP 668 has not yet been accepted. It's not even been submitted for approval. The Discourse link is where you should go if you want to discuss what it says prior to approval. Once it's approved, it won't change further without a follow-up PEP.
That's good news. I had the impression from earlier comments (pardon my ignorance here) that it was already agreed and being implemented. I think it is then even more important to clarify the issues of applying the .venv for the case I am mostly interested in getting answers about now (i.e. building images with Python dependencies with venv as a first-class citizen). I am super happy to bring whatever I find back to the discussion there, and maybe I will even have a chance to influence the image/container part as I bring some experience from converting the Airflow image to it.
That's advice, not technical questions, so I won't respond on that.
Let me just clarify then, because maybe that was not clear. the warning message is:
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
When I follow https://pip.pypa.io/warnings/venv I get redirected to https://docs.python.org/3/tutorial/venv.html - which is a tutorial about virtualenv (which I know pretty well). I looked at it again and there is no mention of the root user. And I know for a fact that you can create a venv as root. So I just wanted to make sure what the recommendation is (this is where I am confused). Simply speaking, I would like to know whether the recommendation is:
1) I can run as the root user with venv (the first part of the message explicitly states root, but "using virtual environment" instead does not imply using a different user, so I am not sure)
or
2) I should use a different user than root and run it with venv
The reason why I want to know the answer is that I need to decide whether I should create a separate user for the venv or whether I can run venv as root. I want to make sure that current and future versions of pip will not print the warning if I use root and venv (or at least I would like to know what the current intention is - even if it might change in the future).
So does the recommendation recommend me 1) or 2) - because I am a bit lost?
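For reference, the two options can at least be told apart programmatically. The sketch below encodes one reading of the warning text, namely that it concerns uid 0 combined with not being inside a virtual environment. That reading is an assumption drawn from the wording of the message, not a statement of how pip decides; `would_warn` is a hypothetical helper, while the `sys.prefix` versus `sys.base_prefix` comparison is the standard way to detect a venv.

```python
import os
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def would_warn(uid: int, in_venv: bool) -> bool:
    """Assumed condition: root user AND outside a virtual environment."""
    return uid == 0 and not in_venv


if __name__ == "__main__":
    # os.getuid does not exist on Windows; -1 stands in for "not root" there.
    uid = os.getuid() if hasattr(os, "getuid") else -1
    print(would_warn(uid, in_virtualenv()))
```

Under this assumed reading, option 1) (root with a venv) would not trigger the message, but that should be checked against the pip version actually in use.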
Does it mean that just prepending the right .venv/bin to the PATH to make sure that Venv bin is first on the PATH is the right solution for docker?
That's a choice you can make, there's no technical, yes/no answer.
Fair point. I think the next questions will allow me to derive the answer myself.
Now from what I understand you tell me that just by having 'pip' from the 'venv/bin' path (so if i put the .venv first on the PATH) it will work and behave in exactly the same way as if i activated the environment?
The executables in the virtual environment's "bin" directory do not need the environment to be activated to work. You can run them directly using their absolute path, or add the bin directory to $PATH (that's all activating does anyway). It's standard virtual environment behaviour, I've no idea if it's documented anywhere TBH, but it's how they have always worked, and isn't going to change.
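The $PATH option mentioned in the quote can be sketched as follows: the environment's bin directory is put first on a search path and the command name is resolved the way a shell would resolve it. The helper names are hypothetical, and the environment is created without pip only to keep the sketch offline, so it resolves `python` rather than `pip`.

```python
import os
import shutil
import sys
import tempfile
import venv


def venv_bin_dir(env_dir: str) -> str:
    """Directory holding the environment's executables."""
    return os.path.join(env_dir, "Scripts" if sys.platform == "win32" else "bin")


def resolve_with_venv_first(env_dir: str, command: str = "python"):
    """Resolve `command` on a search path that starts with the venv's bin dir."""
    # Prepending the bin directory is the PATH change that activation makes.
    path = venv_bin_dir(env_dir) + os.pathsep + os.environ.get("PATH", "")
    return shutil.which(command, path=path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = os.path.join(tmp, "venv")
        venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
        print(resolve_with_venv_first(env_dir))
```

In an image, the equivalent is setting PATH once so that every later command picks up the environment's executables first.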
Yep. This part (about executables not needing activation) is I think quite explicit in the docs of venv (and as I wrote, it's reasonable to assume they work this way). But my questions are a bit deeper. pip is not an ordinary binary. It does so much more than just run an interpreter. It can download source code and build shared libraries, copy those libraries and accompanying artifact files and generally "modify" the "installation environment". This is far more than a usual binary or even the "python" interpreter itself does. My question is about more than just "running" the pip command. It is about the resulting artifacts and whether I can rely on the ".venv" directory being cloneable (the same as .local is when you use the --user flag). As I explained in my case: so far I was using --user in one image and copied the resulting .local directory to another image to get significant savings in size (600 MB instead of 850 MB). My question is - if I use pip from the venv path without activating the venv - will I be able to copy the resulting .venv directory to another image and get all the needed libraries and dependencies transferred smoothly? Do I need to worry about anything (should the .venv be in exactly the same location, or when I change the user, should I modify something there)? I was doing it with --user and the .local dir, and there I had a guarantee that it would work. I have > 500 dependencies in Airflow, and the last thing I want after I release the image to my users is that some obscure package which I had no chance to test will behave differently when I copy the .venv.
Can I count on that behaviour?
I hope this is useful.
Definitely moving in the right direction - but I think some of my questions need some clarification so that I am convinced that I can follow the .venv advice.
(and apologies for capital PIP @pradyunsg - I already corrected it. Unfortunately, auto-correct changed it to capitals :( )
So does the recommendation recommend me 1) or 2)
This is where we start to get frustrated with each other, so I'll say one thing and then stop. I suggest that you do too - @pradyunsg is right that this issue needs people to take a breather.
You seem to want very specific and precise advice. But the whole point of the warning is not to use pip as root if you aren't confident that you understand the implications and can judge for yourself. IMO, that means that if you need to ask for explicit rulings, you shouldn't run pip as root at all. To be clear - that's my opinion, not a recommendation, nor a "statement by the pip maintainers", or anything else. And it's my interpretation of the sense of the warning, not a claim that the warning says precisely that. You can have your own opinion, certainly. We'll simply have to agree to differ in that case.
(I didn't write the warning text, so I'm allowed to have my own views on what it might mean without them being "official" in any way).
apologies for capital PIP
FWIW, there's no official spelling requirement for pip. Personally, I treat it as a normal word and capitalise at the start of sentences. I know this annoys @pradyunsg (and I'm sorry for that) but "must always be in lowercase even at the start of sentences" names annoy me, so I guess we can't both be happy 🙂
Yes, I do expect precision indeed. This is why I chose the '--user' flag initially: from very precise and explicit docs I learned that I could do what I wanted, and I had no surprises.
So when I got a warning that I should change it (and I am glad to follow it now), I want to understand what the consequences are and ask about the meaning of what I do not understand.
I try to paraphrase it and explain in my own words how I understand it. Moreover, I need to know the answers to serve my users, as they will come to me with the same questions.
Is it really too much to ask to help with answering my questions coming from ambiguous messages?
I really do my best to precisely explain the use case, and what I ask for is help understanding what is not clear. I do this multiple times a day when my users in Airflow point out ambiguities, and my answer is usually 'great, can you please make a PR to clarify', or 'yeah, I will fix it, thanks for pointing it out', or 'please take a look at the docs - it is already explained here' (usually because either the user or one of the committers fixed it the last time someone got confused).
What else can I do? Do you really expect me (as a user of pip) to follow all the discussions, understand all the nuances and take the risk of making my own judgements, when I simply ask whether this and that will work this way or the other? How can I make sure that others who come after me with the same problem will not start the same discussion again? If you want to prevent similar discussions in the future, the best thing you can do is to clarify what the meaning is, in the docs. Otherwise other people will again come here, open issues and continue annoying you.
I was full of hope (and we speak about emotions again) that, with my experience with Docker and the complexity of Airflow, I could eventually help to clarify the recommendation, and maybe even help with clarifying the PEP and bring some working examples of how the recommendation can be put into practice.
I do not want to be begging for help.
But I encourage you and others to help me answer my questions so that I can eventually help you.
It's that simple. Empathy and understanding your users' needs are really important. And your users can help you make your product better if only they get a bit of help. This is what I have learned in my last three years in open source.
Just that.
For everyone's information here: I believe my worry and questions about .venv behaving differently than .local were justified. Unlike the .local dir (with the --user flag), you cannot just move it between images to a different user (and especially to different arbitrary users) as easily as we could with the .local dir.
I am working on migrating Airflow images to venv, and here is the first thing I stumbled upon: if you want to install via venv and make it available to all users, you need to make sure that the venv created in one image is copied to exactly the same path in the other.
If I copy a venv created under /root to the airflow user, I get:
/entrypoint: /home/airflow/.venv/bin/airflow: /root/.venv/bin/python3: bad interpreter: Permission denied
Apparently, the venv (unlike a --user installation) stores the location of the python interpreter (and likely other binaries) in the .venv itself. This is solvable of course (we can make sure we always use the same path for the venv rather than assume it is in the home directory), but I think it's one of the things to clarify when we recommend that image-building solutions move to venv. Using a multi-stage build is pretty standard practice when building Python images, so users might be surprised by this.
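A rough way to spot this after copying an environment, as a sketch only: scan the scripts in the venv's bin directory and report those whose "#!" line names an interpreter that the current user cannot execute. The function names shebang_interpreter and unusable_scripts are made up for this illustration, and the check assumes POSIX-style shebang scripts like the one in the error above.

```python
import os


def shebang_interpreter(path):
    """Return the interpreter named on a script's '#!' line, or None."""
    try:
        with open(path, "rb") as fh:
            first_line = fh.readline(4096)
    except OSError:
        return None
    if not first_line.startswith(b"#!"):
        # Not a script (for example the python binary itself).
        return None
    parts = first_line[2:].decode("utf-8", "replace").split()
    return parts[0] if parts else None


def unusable_scripts(bin_dir):
    """Map script name -> interpreter path for scripts that cannot start.

    A script is reported when its shebang interpreter is missing, is not
    a regular file, or is not executable by the current user (which also
    covers a path hidden behind an unreadable directory such as /root).
    """
    broken = {}
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if not os.path.isfile(path):
            continue
        interpreter = shebang_interpreter(path)
        if interpreter is None:
            continue
        usable = os.path.isfile(interpreter) and os.access(interpreter, os.X_OK)
        if not usable:
            broken[name] = interpreter
    return broken


if __name__ == "__main__":
    import sys

    # Pass the bin directory of the copied venv as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else os.path.join(sys.prefix, "bin")
    for script, interpreter in unusable_scripts(target).items():
        print(f"{script}: interpreter not usable: {interpreter}")
```

Run against the copied environment's bin directory as the target user, this would be expected to list entries like the airflow script above. It only inspects shebang lines; other absolute paths recorded inside the environment are not checked.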
Luckily we have a pretty comprehensive set of tests that we can run on our images to see if they are still working as expected - but if I find other things I will let you know. (I also plan to join the PEP 668 discussion - @pradyunsg, could you please post the link to the discussion you mentioned before about containers and PEP 668, where you advocated for the same kind of protections in containers? The link posted previously was a generic link to all discussions.)
One other thing to note - seems that venv keeps the same code/shared libraries as .local - the size of the generated .venv folder is comparable with .local one:
du -h --summarize ~/.local
648M /home/airflow/.local
du -h --summarize ~/.venv
684M /home/airflow/.venv
PR created: https://github.com/apache/airflow/pull/19189
In the absence of clear guidance, I decided to use root to create the venv and share it between users in a /.venv virtual env (not in the HOME directory of any of the users) to be able to copy the venv between images. I am running all CI tests in the CI image, which should give a better answer as to whether everything looks good after switching to .venv, but preliminary tests of the production image show that it passes all the tests (after I fixed a few problems).
I've learned a thing or two about how to approach "building optimal size images while using venv", so I am looking forward to sharing that in the PEP and maybe even providing very precise guidance to anyone who wants to follow "use venv in images", and arguing why this is good if someone wants to discuss it. I will chime in on the PEP discussions when I get the CI green and merge the changes, so that I can talk using a practical example and what I learned.
Another problem (again, as I anticipated), which might be good to document for anyone who hits similar problems:
Unfortunately, the venv solution is different from --user when it comes to running python scripts via sudo in the image.
The error here https://github.com/apache/airflow/runs/3985552602?check_suite_focus=true#step:7:52 is caused by this.
Reason: unfortunately, sudo does not preserve the PATH (even if it is run with -E), which means that the trick of setting the PATH does not work for sudo commands. Effectively, you run the "bare" Python environment when you run a sudo command (unlike --user, which works across sudo). I am looking for a reasonable way to always activate the venv even when using a sudo command.
Any suggestions appreciated.
OK. Looks like the right combination is:
- /etc/profile.d/setup_venv.sh that activates the venv (this will work nicely for sudo -i and basically for all interactive logins)
- secure_path in /etc/sudoers to add the venv path
Then the venv should work the same as --user in all cases (at least in my case).
Note also that the other test failures in Airflow (for example this https://github.com/apache/airflow/runs/3985552779?check_suite_focus=true#step:6:7149 ) are also caused by switching to venv. We are using the Python virtualenv operator, which creates a new virtualenv dynamically if you want to add new dependencies dynamically. The problem there is (likely) caused by losing the original venv packages when we dynamically create a new one. Those packages were preserved with the --user flag and we could rely on it. I am not yet sure how to solve that one but I will look at it next.
Again - any suggestions appreciated.
What's the problem this feature will solve?
I want to be able to manually remove the warning pip spews out during package installation in root environment:
Describe the solution you'd like
I want to be able to disable this warning through an environment variable like
Alternative Solutions
No in-tool workaround known to me.
Additional context
We are all adults here, I know what I am doing and I do not want to see a warning every time I run my build system. Let me disable the warning by setting an environment variable. I do not want my users to think there is anything wrong with my system just because the pip tool spews out indiscriminate warning messages.