Closed bazzargh closed 3 months ago
Just some more notes about what I'm up to (since you asked in the README):
We currently use a private repo which proxies to pypi as well as holding our internal packages. That repo got noticably slower after an upgrade and we traced it back to https://github.com/pypa/pip/blob/038dc755061d6918288eae97fbec363df3607366/src/pip/_internal/index/collector.py#L141-L154 - pip will only ever use etags for caching index files for simple repositories, if you don't serve them the indexes are re-downloaded every time. With multiplatform builds and dozens of dependencies, this adds hundreds of seconds to a build; apparently our server stopped sending the etags. simple-repository-server doesn't send them either, but it looks like an easy fix.
Investigating this led us to other problems with our repo: the lack of PEP-658 metadata means that tools have to use fallback solutions to do dependency resolution. uv for example uses http range headers to grab the relevant metadata if it's not present as a separate path. Our server... supports neither PEP-658 or Http Range headers.
The final straw is dependabot: it uses the warehouse json api https://github.com/dependabot/dependabot-core/blob/main/python/lib/dependabot/python/update_checker.rb#L267-L276 to fetch metadata instead of PEP-658, no matter what, so fails on every private registry except warehouse (AFAIK). If I can't get them to fix that code, I need to fix my server.
The code here is nice and simple and it looks easy to add these features; it's unclear to me how best to set simple-repository-server up with a local repo that supports authenticated uploads, but it's trivial to set it up with multiple remote backends, and to serve a PEP-503 only simple repository straight out of s3 which this proxy can upgrade on the fly to support metadata and the other protocols.
The dockerfile is just a hack. Our instance would run in EKS, but I've not tried setting it up yet.
Is there another fix coming/possible other than downgrading simple-repository
to below 0.5.0?
You could try to implement whatever the change was that didn't get uploaded.
I've a good guess at what the missing code dump included; one possible reason for switching to httpx from aiohttp is this bug https://github.com/simple-repository/simple-repository-server/issues/4#issuecomment-2131213240 - that if you talk to an upstream with gzip enabled the response size won't match the content-length header. httpx contains iter_raw, so you could stream the original response. https://www.python-httpx.org/quickstart/#streaming-responses
I didn't pursue that as I don't think it's correct either - if the client didn't say it would accept gzip, we shouldn't stream that. Stripping out unwanted headers seemed easier. In the setup I use I have nginx caching wrapped around simple-repository-server configured to serve stale responses while fetching and to fetch in the background ignoring client disconnects, this gives me fast responses without streaming (at the expense of the cache not being up to date all the time; not a big concern as tagged wheels should be immutable)
But to put it more simply: the current version of the simple-repository-server
is just broken and cannot be used at all since simple-repository
was updated 0.5.0?
It is not like I am doing something wrong?
you're correct.
There is some really interesting content in this issue/thread, thanks for the background @bazzargh. I've just released v0.5.0 which should address the actual issue at hand - please let me know if not though.
So I can at least say that the server is now properly starting up ;)
Should be fully resolved by v0.6.0.
The feedback on the usage is highly valuable, so I've made a new label to track it.
The last update to simple-repository changed the arguments for HttpRepository, and this library isn't synched up:
I'm running this with my own Dockerfile I threw together to test it:
And then just
docker build -t simple .
. I'm building this way because I wanted to play with patching the server a bit, like to add etag support on PEP-503 index files (this has been a problem for us with artifactory), and the warehouse json api (which dependabot uses instead of PEP-658, PEP-691 even when talking to private registries), and maybe http range header support (which uv uses)Anyway, the solution if this bites you is in setup.py change
"simple-repository",
to"simple-repository<0.5.0",