pypa / pipenv

Python Development Workflow for Humans.
https://pipenv.pypa.io
MIT License
24.78k stars 1.86k forks source link

pipenv injects secrets into Pipfile and lock file #3751

Open TheFriendlyCoder opened 5 years ago

TheFriendlyCoder commented 5 years ago

Issue description

When running pipenv in an environment which has a PIP_INDEX_URL environment variable defined, and that env var contains credentials to use when authenticating to the package index server, those credentials are injected as plain-text into the Pipfile and Pipefile.lock files.

Expected result

pipenv should not inject index URLs containing credentials verbatim into the config file.

Actual result

pipenv does inject index URLs containing credentials into the config file.

Steps to replicate

We work behind a firewall with a mirror of the global PyPI repository in our lab. This mirror requires authentication to access. To make this work properly / easily with pip we generally expect users to define a custom PIP_INDEX_URL containing the users' credentials along with the URL for the mirror.

Further, when using pipenv to create a new environment or install a package (ie: pipenv install requests), we end up getting a Pipfile that looks something like this:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
requests = {index = "https://<credentials>@<server_url>/pypi/pypi-virtual/simple",version = "*"}

[requires]
python_version = "3.7"

I debated whether this should be reported as a "bug" or perhaps just a feature request to have this behavior changed, however given the severity of the impact of this default behavior I thought it was better suited as a bug. The tools should never inject secrets into files that may be commited to version control and this behavior could be very easily overlooked by users of the tool, which makes me very hesitant to roll out this tool for widespread use on my team.


$ pipenv --support Pipenv version: `'2018.11.26'` Pipenv location: `'/usr/local/lib/python3.7/site-packages/pipenv'` Python location: `'/usr/local/opt/python/bin/python3.7'` Python installations found: - `3.7.2`: `/usr/local/bin/python3` - `3.7.2`: `/usr/local/bin/python3.7m` - `3.6.1`: `/usr/local/bin/pypy3` - `3.5.2`: `/usr/local/bin/python3.5m` - `3.5.2`: `/usr/local/bin/python3.5` - `2.7.16`: `/usr/local/bin/python` - `2.7.16`: `/usr/local/bin/pythonw` - `2.7.13`: `/usr/local/bin/pypy` - `2.7.10`: `/usr/bin/python` - `2.7.10`: `/usr/bin/pythonw` - `2.7.10`: `/usr/bin/python2.7` PEP 508 Information: ``` {'implementation_name': 'cpython', 'implementation_version': '3.7.2', 'os_name': 'posix', 'platform_machine': 'x86_64', 'platform_python_implementation': 'CPython', 'platform_release': '18.5.0', 'platform_system': 'Darwin', 'platform_version': 'Darwin Kernel Version 18.5.0: Mon Mar 11 20:40:32 PDT ' '2019; root:xnu-4903.251.3~3/RELEASE_X86_64', 'python_full_version': '3.7.2', 'python_version': '3.7', 'sys_platform': 'darwin'} ``` System environment variables: - `JENKINS_KEY` - `TERM_PROGRAM` - `TERM` - `SHELL` - `TMPDIR` - `Apple_PubSub_Socket_Render` - `VAULT_ADDR` - `TERM_PROGRAM_VERSION` - `PIP_INDEX_URL` - `OLDPWD` - `TERM_SESSION_ID` - `SDKMAN_PLATFORM` - `OBJC_DISABLE_INITIALIZE_FORK_SAFETY` - `USER` - `SDKMAN_CANDIDATES_API` - `PIP_INDEX_URL2` - `GITLAB_API_TOKEN` - `SSH_AUTH_SOCK` - `ART_KEY` - `LSCOLORS` - `PATH` - `PWD` - `JAVA_HOME` - `LANG` - `SDKMAN_VERSION` - `XPC_FLAGS` - `PS1` - `XPC_SERVICE_NAME` - `HOME` - `SHLVL` - `LOGNAME` - `SDKMAN_DIR` - `GROOVY_HOME` - `SDKMAN_CANDIDATES_DIR` - `JENKINS_SANDBOX_KEY` - `DISPLAY` - `_` - `__CF_USER_TEXT_ENCODING` - `PIP_DISABLE_PIP_VERSION_CHECK` - `PYTHONDONTWRITEBYTECODE` - `PIP_SHIMS_BASE_MODULE` - `PIP_PYTHON_PATH` - `PYTHONFINDER_IGNORE_UNSUPPORTED` Pipenv–specific environment variables: Debug–specific environment variables: - `PATH`: `/usr/X11R6/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/go/bin:/opt/X11/bin:~/Documents/node_modules/.bin` - `SHELL`: `/bin/bash` - `LANG`: `en_CA.UTF-8` - `PWD`: `*******` --------------------------- Contents of `Pipfile` ('******/Pipfile'): ```toml [[source]] name = "pypi" url = "https://pypi.org/simple" verify_ssl = true [dev-packages] [packages] requests = {index = "https://@/pypi/pypi-virtual/simple",version = "*"} [requires] python_version = "3.7" ``` Contents of `Pipfile.lock` ('*****/Pipfile.lock'): ```json { "_meta": { "hash": { "sha256": "343de2d326ba044580f1fbb17c5ecdf44c9002aa99934966d44a6ef43585c754" }, "pipfile-spec": 6, "requires": { "python_version": "3.7" }, "sources": [ { "name": "pypi", "url": "https://pypi.org/simple", "verify_ssl": true } ] }, "default": { "certifi": { "hashes": [ "sha256:59b7658e26ca9c7339e00f8f4636cdfe59d34fa37b9b04f6f9e9926b3cece1a5", "sha256:b26104d6835d1f5e49452a26eb2ff87fe7090b89dfcaee5ea2212697e1e1d7ae" ], "version": "==2019.3.9" }, "chardet": { "hashes": [ "sha256:84ab92ed1c4d4f16916e05906b6b75a6c0fb5db821cc65e70cbd64a3e2a5eaae", "sha256:fc323ffcaeaed0e0a02bf4d117757b98aed530d9ed4531e3e15460124c106691" ], "version": "==3.0.4" }, "idna": { "hashes": [ "sha256:c357b3f628cf53ae2c4c05627ecc484553142ca23264e593d327bcde5e9c3407", "sha256:ea8b7f6188e6fa117537c3df7da9fc686d485087abf6ac197f9c46432f7e4a3c" ], "version": "==2.8" }, "requests": { "hashes": [ "sha256:11e007a8a2aa0323f5a921e9e6a2d7e4e67d9877e85773fba9ba6419025cbeb4", "sha256:9cf5292fcd0f598c671cfc1e0d7d1a7f13bb8085e9a590f48c010551dc6c4b31" ], "version": "==2.22.0" }, "urllib3": { "hashes": [ "sha256:a53063d8b9210a7bdec15e7b272776b9d42b2fd6816401a0d43006ad2f9902db", "sha256:d363e3607d8de0c220d31950a8f38b18d5ba7c0830facd71a1c6b1036b7ce06c" ], "version": "==1.25.2" } }, "develop": {} } ```
TheFriendlyCoder commented 5 years ago

NOTE: one technique I had tested for keeping the high-level dependencies for a project in sync between the setup.py and the Pipfile was to put the definitions for the dependencies in the setup.py and do an editable install using pipenv, something like pipenv -install '-e .'. That resulted in a more subtle "bug" where the index URL with the credentials I just mentioned were also injected into the Pipefile.lock file. I say this is more subtle because most of the literature tells us we shouldn't be concerned really with the contents of that file because it is intended to be generated and managed totally by the pipenv tool ... and yet the file is intended to be committed to version control. The fact that the credentials for the index url can end up in this file would be even more succeptible to being missed and accidentally committed to version control.

TheFriendlyCoder commented 5 years ago

NOTE: I tried to manually work around this problem by editing the Pipfile by tring to force it to use the PIP_INDEX_URL environment variable using the env var expansion logic as follows:

[[source]]
name = "pypi"
url = "${PIP_INDEX_URL}"
verify_ssl = true

[dev-packages]

[packages]
requests = {index = "pypi",version = "*"}

[requires]
python_version = "3.7"

However, after making this change and then running pipenv install requests again, the contents of the Pipfile is actually overwritten and the full index URL is re-injected into the file, as in:

name = "pypi"
url = "${PIP_INDEX_URL}"
verify_ssl = true

[dev-packages]

[packages]
requests = {index = "https://<credentials>@<server_url>/pypi/pypi-virtual/simple",version = "*"}

[requires]
python_version = "3.7"
TheFriendlyCoder commented 5 years ago

Building upon my earlier comments, perhaps one easy solution for this problem could be to simply respect the use of the enviroment variable expansion in this case, and to preserve the Pipfile contents even when the PIP_INDEX_URL env var is found.

To put this another way, if I rename the PIP_INDEX_URL env var to PIP_INDEX_URL2, update the Pipfile to reflect the change, and re-run my test command, I get the expected behavior. Here is a sample Pipfile to illustrate what I mean:

name = "pypi"
url = "${PIP_INDEX_URL2}"
verify_ssl = true

[dev-packages]

[packages]
requests = {index = "pypi",version = "*"}

[requires]
python_version = "3.7"

With this Pipfile, I can run PIP_INDEX_URL2=$PIP_INDEX_URL && unset PIP_INDEX_URL && pipenv install requests and I get the desired results. The package is installed from the PyPI index url provided by PIP_INDEX_URL2, and the Pipfile remains unaffected by the install operation.

NOTE: This is not a scalable solution for general use, however. Our team manages dozens of Python libraries and applications, not all of which would be using pipenv. For those that do not we would still need the PIP_INDEX_URL env var defined so the traditional package tools continue to work. Further, it would not be practical or safe for us to continually define and undefine this env var depending on the project we are working on because accidentally forgetting to undefine the variable could have the adverse effect of having our private credentials being injected into the config files and risk being committed to version control afterwards. This would be too risky and fragile to be considered a viable workaround.

AlJohri commented 5 years ago

just to clarify, does this issue exist if you use a ~/.pypirc file for the index servers?

Screen Shot 2019-05-21 at 11 00 21 PM

uranusjr commented 5 years ago

Yeah, you’re hitting an implementation detail imposed by how pip’s set up. The only solution is to choose another environment variable other than PIP_INDEX_URL. More specifically, pip (and by extension Pipenv) reads a number of environment variables as configuration, and you would encounter some funky behaviour if you use them for other purposes. As a rule of thumb, avoid variable names with prefixes PIP_* and PIPENV_* unless you have a very specific reasons.

TheFriendlyCoder commented 5 years ago

@AlJohri I'm not sure I quite understand. The .pypirc file is only used when publishing packages, so far as I understand. It is not used when pulling/installing packages. Further, the .pypirc file is typically committed to version control, so it would be a bad convention to store credentials in that file.

Again, if I am mistaken with any of this just let me know.

TheFriendlyCoder commented 5 years ago

@uranusjr I'm not sure I understand your suggestion. In order to make pip work correctly with a pypi mirror that requires authentication you need to either define an environment variable named PIP_INDEX_URL which includes the credentials embedded in it (ie: https://user:pass@servername), or add the index URLs with credentials to the pip.conf file. There are pros and cons with each of these approaches, but we typically use the environment variable approach.

Further, if we want to be able to build projects which use pip directly then we'd need to keep the PIP_INDEX_URL defined globally on our dev machines. However, doing so effectively "breaks" pipenv because the credentials found in the environment variable automatically get injected into the Pipfile.

So, again, unless I'm misunderstanding something, I don't see how "avoiding variables with PIP_ in their names" achieves anything. Naming the environment variable anything else will effectively break pip, and naming it PIP_INDEX_URL effectively breaks pipenv.

uranusjr commented 5 years ago

Sorry, I elided some information in my previous post. Pipenv uses pip internally, and in order to implement [[source]], it also (needs to) set PIP_INDEX_URL to pass the value to pip. This means that you setting it yourself can break the internal communication, resulting in the awkward behaviour you observed.

There are multiple solutions to this problem, all on the same premise that you must avoid that environment variable to avoid stepping Pipenv’s toes. One is suggested above by @AlJohri—set up pip with pip.conf instead of environment variables. This is the best approach if you have non-Pipenv projects that also rely on the same system-wide index configuration. If you don’t, just pick a different environment variable name, and your [[source]] approach would work.

darrenmeehan commented 5 years ago

Hi @TheFriendlyCoder

Instead of putting the whole url for your internal mirror using PIP_INDEX_URL break out the credentials into separate environment variables.

sources": [
            {
                "name": "internalmirror",
                "url": "https://${USERNAME}:${PASSWORD}@foo.bar/simple",
                "verify_ssl": true
            }
]

Edit: Above code is in the format from a Pipfile.lock I have using this setup.

Below is how your should setup your Pipfile

[[source]]
"url": "https://${USERNAME}:${PASSWORD}@foo.bar/simple",
verify_ssl = true
name = "internalmirror"
TheFriendlyCoder commented 5 years ago

@Darrenmeehan You are correct that your suggestion does work around the problem I've described here. However it introduces a second problem: the index URL now needs to be duplicated both in the pipfile and in the global environment. In order for certain native / shell tools to work we need to have the PIP_INDEX_URL env var defined globally. This would result in the URL being duplicated both in the environment AND in the pipfile. This is less than ideal (ie: what if you want to redirect your repo elsewhere depending on your geo / network connection? You'd need to remember to update both.)

TheFriendlyCoder commented 5 years ago

@uranusjr Thanks for the suggestion. There is at least one case we've hit where the pip.conf file is not suitable: docker builds. If you need to build a Dockerfile locally that needs to pull Python packages from a custom pypi repo you need to inject the URL - including credentials - into the Docker build context. There are ways to bind-mount the users' home folder into the docker container at build time, but this potentially has other unwanted side effects (ie: sharing other files from the home folder, etc.) plus the command line interaction is a bit more complex (ie: problems for users who aren't as versed in Docker). An easier approach is to inject the PIP_INDEX_URL into the Docker context via environment variables something like docker build --build-args PIP_INDEX_URL=$PIP_INDEX_URL. Also, in our case, we have some boilerplate tooling that does this env var inject auto-magically so the whole process is transparent for our users which is even better. They can set PIP_INDEX_URL in their global environment once and the tooling "just works".

Also, your proposal still results in multiple sources of truth for the URL and credentials. You need to have the info duplicated in both the pip.conf and the pipenv file, which is less than ideal for the same reasons I mentioned earlier.

If we could have it so the single source of truth for the connection parameters could be loaded from the environment variable that would be preferable.

matteius commented 2 years ago

I believe this is resolvable now -- but let me know if its still an issue on pipenv==2022.8.19

deronnax commented 11 months ago

we want to let you know that we reproduced with recents version (2023.9.8). In an empty directory, set PIP_INDEX_URL with credentials and then do a pipenv install requests and pipenv will copy the index url with the secrets in the Pipfile and Pipfile.lock.

BPDanek commented 8 months ago

@deronnax any chance you could post an example? Would also be wonderful to add to docs in the future

deronnax commented 8 months ago

Nop, we moved away from Pipenv months ago, sorry (for PDM).

Alexander-Serov commented 6 months ago

Hello, Just to confirm: I see this issue is marked as resolved, but this behavior is still present on pipenv, version 2023.10.3. So what's the solution?

For info: we use pip.conf to store the extra index with the private token (/Users/<user>/Library/Application Support/pip/pip.conf on Mac). Putting the token into the Pipfile is impossible because the token will get committed.

matteius commented 6 months ago

@Alexander-Serov if I recall correctly, with the current behavior the secrets needs to be specified in env vars referenced by PIP_INDEX_URL and not within PIP_INDEX_URL directly -- similar to what is recommended in https://pipenv.pypa.io/en/latest/credentials.html

Opening this back up for now though in case there is a better solution.

Alexander-Serov commented 6 months ago

Thanks, @matteius, I will try out the env variables. And thanks for reopening: after all, the unconditional injection may be dangerous for non-advanced users who will then commit the token.

frytoli commented 5 months ago

I can confirm that installation of a private package with an environment var works:

export GH_PAT=some-secret-token
pipenv install https://${GH_PAT}@github.com/user/repo.git#egg=foo

However, the un-masked variable is written both to stdout and Pipfile/Pipfile.lock. Subsequently I was able to simply replace the un-masked variable with the env var as described in the docs. I.e. in Pipfile/Pipfile.lock: https://some-secret-pat-here@github.com/user/repo.git -> https://${GH_PAT}@github.com/user/repo.git. Additionally, a clean installation with the edited Pipfile (as long as GH_PAT is set) works:

export GH_PAT=some-secret-token
pipenv install

I'm not sure if this process can be streamlined, as devs could easily forget to manually edit the files before committing and printing the un-masked token to stdout isn't desirable. But for now, this will be how I'm handling the situation. Thanks pipenv team!