Closed dstufft closed 5 years ago
Maybe we should go with the generic https://pypi.org/stats/ and add more over time, starting with this one.
Let's start with:
👍
I would be highly appreciative of a blacklist that has the biggest 100 projects. This would probably save a ton of space.
This is live - Just tweaking some cache config: https://pypi.org/stats/
Is there anything left to do for this issue or shall we announce it on distutils-sig and close the issue?
@brainwane I think this is done!
Yeah the initial stuff is all done here. I may add more stats one day.
hmmm I guess there is an open question. do we want to commit to keeping this around by documenting it? both where to find it and the alternate JSON representation available when sending Accept: application/json?
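For anyone who wants to try the JSON representation mentioned above, here is a minimal sketch of a client that sends the Accept: application/json header. The helper names build_stats_request and fetch_stats are made up for illustration, and the shape of the returned JSON is not described in this thread, so the sketch just returns whatever the server sends.

```python
import json
import urllib.request

# URL taken from the comments above.
STATS_URL = "https://pypi.org/stats/"


def build_stats_request(url=STATS_URL):
    """Build a GET request that asks for JSON instead of the HTML page.

    Hypothetical helper name; only the Accept header comes from the thread.
    """
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_stats(url=STATS_URL, timeout=10):
    """Fetch and decode the stats document.

    The structure of the decoded object is an assumption left to the caller;
    inspect it before relying on any particular key.
    """
    request = build_stats_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_stats() performs a network request; build_stats_request() alone is enough to check the headers locally.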
The question basically boils down to if we want this to be an interim/internal solution for bandersnatch users... or "own it" until we create a better replacement endpoint.
I'm happy to document it.
What does "own it" mean? I don't have preference where the API endpoint is. This was @dstufft's suggestion as to where to put it. What are the alternatives you're thinking?
The endpoint is useful primarily for bandersnatch users and other mirror clients. If we document it and "publicize it" we'll want to ensure that it continues working until we begin and complete the process of deprecating it. Additionally changes to this endpoint will have to remain backward compatible.
@cooperlees adding a page similar to https://github.com/pypa/warehouse/blob/master/docs/api-reference/json.rst is probably good!
Resolved by #5072.
What's the problem this feature will solve?
Mirroring PyPI currently takes > 2TB of storage, and that is continuing to grow. Some mirroring tools have the ability to blacklist projects from being mirrored, but it's difficult to know which projects should be targeted for blacklisting without insight into which packages take up the most space.
Additionally, as operators it can be useful to see if particular packages are consuming more or less of the total space used by PyPI.
Describe the solution you'd like
Add metrics that indicate the top N packages by total space used.
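As a rough illustration of the "top N by total space" idea, the sketch below ranks projects from a plain mapping of project name to total bytes. The function name top_n_by_size and the sample figures are invented for the example; how the per-project totals are actually computed is not covered here.

```python
import heapq


def top_n_by_size(sizes, n):
    """Return the n (project, total_bytes) pairs with the largest totals.

    `sizes` is assumed to be a dict mapping project name to total bytes used.
    Results are ordered from largest to smallest.
    """
    return heapq.nlargest(n, sizes.items(), key=lambda item: item[1])


# Invented sample data, not real PyPI numbers.
example_sizes = {
    "project-a": 500,
    "project-b": 1200,
    "project-c": 50,
    "project-d": 800,
}

for name, total in top_n_by_size(example_sizes, 2):
    print(name, total)
```

A mirror operator could feed the resulting names into whatever blacklist mechanism their mirroring tool provides.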