devpi / devpi

Python PyPi staging server and packaging, testing, release tool
https://doc.devpi.net

Whitelisting packages that can be mirrored from PyPI #198

Closed devpi-bot closed 5 years ago

devpi-bot commented 7 years ago

Hi, I'm looking for an option that would allow me to say that only specified PyPI packages can be mirrored on a DevPI server instance. I initially thought that I could use pypi_whitelist for this, but it turns out that it does something slightly different...

Basically, it should work like this (note that I'm using the word "whitelist" in the sense of my proposal, not in its current meaning from "pypi_whitelist"):

My use case is the deployment of a DevPI instance for Fedora users; the broader concept is summarized at [1]. I basically need to mirror just the packages that went through a Fedora review (meaning they have only certain acceptable licenses, they're not malicious, etc.). On my end, I'll create a script that will go through Fedora packages and generate a list of PyPI names; on the DevPI end, I need the ability to accept this list and mirror only the given packages.

I'd be happy to work on a patch for this, but I first wanted to discuss this issue and see if you're ok with it and how you'd approach implementation.

Thanks a lot.

[1] https://fedoraproject.org/wiki/Env_and_Stacks/Projects/LanguageSpecificRepositories

devpi-bot commented 7 years ago

Original comment by @fschulze

We thought about something similar, which relates to this. You would be able to set a requirements.txt-like property on an index and it would only serve those packages and versions. If we allow package names without a version, then your use case would work as well. Not sure about the root/pypi part though, @hpk42 would have to chime in for that.

devpi-bot commented 7 years ago

Original comment by @bkabrda

Thanks for the response! Yeah, setting a requirements.txt-like property on an index would be even more awesome. I guess our combined use cases would be satisfied if we could specify something like this for an index (i.e. a non-versioned package, a package limited to multiple versions, a package limited to precisely one version):

six
django>=1.6
django-debug-toolbar==1.2.3
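
To make the intended semantics of such a list concrete, here is a minimal sketch of how a requirements-style filter could decide whether a given release is allowed. This is not devpi's implementation: the helper names (parse_filter, is_allowed) are hypothetical, only purely numeric dotted versions are handled, and a real implementation would more likely rely on a full requirement parser.

```python
import operator
import re

# Hypothetical helpers for illustration only; this is not devpi's API.
_LINE = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*"
    r"(?:(==|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*))?$"
)
_OPS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def normalize(name):
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def version_key(version):
    """Naive version key: assumes purely numeric dotted versions like 1.2.3."""
    return tuple(int(part) for part in version.split("."))


def parse_filter(lines):
    """Turn requirements-style lines into {normalized name: [(op, version)]}."""
    rules = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _LINE.match(line)
        if match is None:
            raise ValueError("unsupported filter line: %r" % raw)
        name, op, version = match.groups()
        rules.setdefault(normalize(name), []).append((op, version))
    return rules


def is_allowed(rules, name, version):
    """A release passes if its project is listed and every constraint holds."""
    specs = rules.get(normalize(name))
    if specs is None:
        return False  # project not listed at all
    candidate = version_key(version)
    return all(
        op is None or _OPS[op](candidate, version_key(wanted))
        for op, wanted in specs
    )


rules = parse_filter(["six", "django>=1.6", "django-debug-toolbar==1.2.3"])
```

With these rules, any version of six passes, django passes from 1.6 upwards, django-debug-toolbar passes only at exactly 1.2.3, and every project that is not listed is rejected.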

devpi-bot commented 7 years ago

Original comment by @hpk42

In principle I am fine with the idea of this addition. Could you write some preliminary docs on how it would be usable from the "devpi index" command? I guess devpi-client could take a requirements.txt file and push its contents to the server, where it would end up as an index property as a list of lines or so. I guess we could allow putting this property on root/pypi, but maybe it's simpler and easier to implement to only allow it on the "normal" index type. root/pypi would thus always be a full mirror and deriving indexes can add a requirements-style "filter". If you agree, I'd kindly ask you to post a draft of how one would interact with the feature and its effects to the devpi-dev mailing list so that we have a UI we all agree on.

devpi-bot commented 7 years ago

Original comment by @sYnfo

Thanks for taking the time to comment. :) I'll try to write up a draft by next week and post it to the mailing list.

devpi-bot commented 7 years ago

Original comment by @sYnfo

FYI I've posted the draft to the mailing list [0], hope it's fine. :)

[0] https://groups.google.com/forum/#!topic/devpi-dev/py-B9kwaK5Y

devpi-bot commented 7 years ago

Original comment by @hpk42

As this issue went up through votes, I'd like to add that I like the idea of having a requirements.txt-style file define what is shown/mirrored on an index, similarly to how Matej has described it in his post. FWIW, if people want to accelerate the arrival of this feature, some funding for Florian's or my work on this would help -- we do take funding for supporting company deployments and fixing bugs/features -- so far four companies have done that. Contact holger at merlinux eu if you are interested. Otherwise PRs welcome :)

Thorbijoern commented 7 years ago

I've tried to use the pypi_whitelist or mirror_whitelist feature without success. It could be really useful to automatically add all the requirements or dependencies of whitelisted or uploaded packages. For example, you have an index for the packages you are developing, and when you upload a package, devpi reads the setup.py and adds all dependencies to the whitelist. Or you know which main packages you need and whitelist them, and devpi automatically adds all the dependencies of the already whitelisted packages to the whitelist and mirrors them.

I think the documentation on mirror_whitelist and pypi_whitelist is a little unclear and should maybe be extended. I don't know the differences between them, and it is unclear how to use them properly. If the feature discussed here is really called filter, like in the draft on the mailing list, the documentation is lagging and there is no information about it in the CLI.

notEvil commented 6 years ago

Sorry for bumping, but I have a similar need, only for a blacklist. That is, to "resolve" name conflicts of dependency links with PyPI when using pip(env) install. Any chance this finds its way into one of the next releases?

fschulze commented 6 years ago

@notEvil not sure what you mean. Could you provide a concrete example?

notEvil commented 6 years ago

As for the use case: when pip installs packages with dependencies in private VCS repositories by means of dependency links, pip (or any subprocess) looks up the dependencies on PyPI and, if found, installs them from there. This means it's impossible to use package names that are already used on PyPI, and even names that are not used yet may be taken in the future. It seems to be a common issue. The solution that I would prefer is to use devpi as a filter, forcing pip to install the conflicting packages from dependency links.

A first try, adding if project in ['{normalized name of package}']: return to op_sro_check_mirror_whitelist in model.py was successful. Of course, I would prefer a professional and tested solution :)
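
The idea behind that quick patch can be sketched independently of devpi's internals. The snippet below is a minimal illustration, assuming a hypothetical BLOCKED set and a hypothetical may_use_mirror helper; neither is part of devpi. The only part taken from a standard is the name normalization, which follows PEP 503.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical blocklist: private names that must never be resolved
# from the PyPI mirror, stored in normalized form.
BLOCKED = frozenset(normalize(name) for name in ["B", "My_Private.Pkg"])


def may_use_mirror(project):
    """Return False for blocklisted projects, so the mirror offers nothing."""
    return normalize(project) not in BLOCKED
```

Comparing normalized names matters here, because pip may ask for "my-private-pkg" even when the package is declared as "My_Private.Pkg".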

fschulze commented 6 years ago

@notEvil I'm still not sure I understand. What you describe sounds to me like the default devpi behaviour. If you have an index with root/pypi as the base and mirror_whitelist is empty, then if you upload a package that exists on PyPI, like requests, only your own upload is visible on your index for requests. Only if you add requests to mirror_whitelist will the other releases for requests from PyPI be listed together with your own upload.
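
The behaviour described above can be modelled with a small sketch. This is only an illustration of the rule, not devpi's code: the function name visible_releases and the dict-based data model are assumptions, and the sketch also assumes that projects without any own upload are passed through from the mirror unchanged.

```python
def visible_releases(project, own, mirror, mirror_whitelist):
    """Model which releases an index with a mirror base would list.

    own and mirror map project names to lists of version strings;
    mirror_whitelist is a set of project names. All names are illustrative.
    """
    own_versions = list(own.get(project, []))
    mirror_versions = list(mirror.get(project, []))
    if not own_versions:
        # Assumption: nothing uploaded locally, so the mirror is passed through.
        return mirror_versions
    if project in mirror_whitelist:
        # Whitelisted: own uploads and mirror releases are listed together.
        extra = [v for v in mirror_versions if v not in own_versions]
        return own_versions + extra
    # Not whitelisted: the own upload hides everything from the mirror.
    return own_versions


own = {"requests": ["2.0+local"]}
mirror = {"requests": ["2.0", "2.1"], "six": ["1.16.0"]}
```

Here visible_releases("requests", own, mirror, set()) lists only the local upload, while adding "requests" to the whitelist also exposes the mirrored versions.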

notEvil commented 6 years ago

True, but with a subtle difference: I would need to upload the packages to the index. When using devpi as a central service, this would work. However, I would like to temporarily set up devpi as a proxy + filter, for instance in a docker instance on Bitbucket Pipelines. Uploading packages to this temporary instance changes the package hashes and pipenv fails to install from Pipfile.lock.

fschulze commented 6 years ago

@notEvil how do you upload the packages? If you upload the same file the hash shouldn't change. You can also push releases from one index to another, even from root/pypi into your own: devpi push --index root/pypi package==1.0 my/index
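
The point about hashes can be checked locally. The following minimal sketch only assumes that the lock file pins a SHA-256 digest of the release file; the byte strings stand in for real archives, and sha256_of is an illustrative helper name.

```python
import hashlib


def sha256_of(data):
    """Return the hex SHA-256 digest of a release file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


# The digest depends only on the file contents, not on where it is hosted.
original = b"contents of B-0.tar.gz"
reuploaded = bytes(original)  # the very same bytes, uploaded elsewhere
rebuilt = b"contents of B-0.tar.gz, rebuilt with a new timestamp"

same_hash = sha256_of(original) == sha256_of(reuploaded)
changed_hash = sha256_of(original) != sha256_of(rebuilt)
```

Uploading the identical file keeps the digest, whereas rebuilding the archive from source usually produces different bytes and therefore a different digest.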

I still don't get why a blocklist helps here. If you have a Pipfile.lock then only the pinned packages are used anyway.

Try to provide a concrete, step by step example. With package names, what comes from where, what and where it's pinned and at what exact place the blocklist would help.

notEvil commented 6 years ago

Let's say we have two packages A and B. In setup.py of A there is install_requires=['B'] and dependency_links=['git+ssh://git@provider/repo@ref#egg=B-0&subdirectory=relative/path'] and there is already a package called B on PyPI. Then, when running PIP_PROCESS_DEPENDENCY_LINKS=1 pipenv install 'git+ssh://git@provider/repo@ref#egg=A-0&subdirectory=relative/path' it will install B from PyPI instead of from the repository.

Also, when I set up devpi from scratch, start the server, upload B as user to index, run pipenv --python 3.6, adjust the url in Pipfile to http://localhost:3141/user/index, and run PIP_PROCESS_DEPENDENCY_LINKS=1 pipenv install 'git+ssh://git@provider/repo@ref#egg=A-0&subdirectory=relative/path', it works. But when I tear everything down and redo all the steps, running PIP_PROCESS_DEPENDENCY_LINKS=1 pipenv sync instead of ... pipenv install ..., the hash check fails for dependency B.

So a blacklist would help because then I don't need to upload B but just blacklist it, so that pip (called from within pipenv) will be forced to process the dependency links.
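
For reference, the metadata of package A described above might look like the following sketch. The keyword arguments are collected in a plain dict so the snippet can be inspected without building anything; in a real setup.py they would be passed to setuptools.setup(). The provider/repo@ref placeholders are copied verbatim from the description, and the version "0" is an assumption derived from the egg=A-0 fragment.

```python
# Sketch of the keyword arguments for package A's setup.py.
# In a real project: from setuptools import setup; setup(**SETUP_KWARGS)
SETUP_KWARGS = {
    "name": "A",
    "version": "0",  # assumption, taken from the "egg=A-0" fragment
    # B is requested by name only, so an index lookup can satisfy it.
    "install_requires": ["B"],
    # Deprecated pip feature: tells pip where B may additionally be found.
    "dependency_links": [
        "git+ssh://git@provider/repo@ref#egg=B-0&subdirectory=relative/path",
    ],
}
```

Because install_requires names B without any location, a package called B on the configured index is a valid candidate, which is the name conflict described here.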

fschulze commented 6 years ago

As far as I can tell, dependency links are a deprecated feature of pip, see https://github.com/pypa/pip/issues/2023 for a discussion on that and links to tickets for a replacement. Since you already use Pipfile.lock, I think you should add the dependency link in there.

notEvil commented 6 years ago

I see. I didn't want to discuss the issues with dependencies on private repositories more than necessary, because even after a lot of research there doesn't seem to be a satisfying solution. I don't know about editing Pipfile.lock as it is managed by pipenv. Can you point me in the right direction?

Anyway, imo a blacklist would be a great solution for the name conflict issue, and something that isn't too much trouble when implemented correctly.

edit: dependency links are not deprecated as of pip v10 because there are actual use cases for it

fschulze commented 6 years ago

Do you have a link to info on the undeprecation in pip 10?

If you add the dependency to Pipfile, or install the package with pipenv and the git URL, then it should be properly added.

notEvil commented 6 years ago

I remembered incorrectly, sorry. It's not officially undeprecated, but it has been re-added and is still available in pip 10, see https://github.com/pypa/pip/issues/4187.

Regarding your proposal: if I understand correctly then I should install dependencies explicitly. That's obviously unacceptable.

fschulze commented 5 years ago

See https://pypi.org/project/devpi-constrained/. If that doesn't solve your use case, then either open a new ticket at https://github.com/fschulze/devpi-constrained/ if the plugin can be fixed/enhanced, or reopen this ticket/create a new ticket here.