Closed chris48s closed 12 months ago
Hey @chris48s,
I'm open to collaborating on the integration and am here to help ☺️
For context, we currently handle 5k to 7k requests per hour. I noticed from your issue that you're redirecting 8k requests hourly to pypistats, which is over double our current volume.
Before proceeding, I need to assess our server's capacity. Though it's feasible to expand pepy's capacity, I'd prefer not to due to potential cost increases. I'll assess this once I'm back from vacation.
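Could you please clarify a few things:
- Do you experience any peak traffic times we should be aware of?
- Is it possible for me to introduce an API key for shields.io?
- Are you primarily interested in summary stats (total, monthly, weekly)? If so, I could set up a dedicated endpoint to reduce the database load.
- Keep in mind, Pepy is provided as a best effort service. Would any downtime be a significant issue for you?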
Thanks for your cooperation ☺️
Hi. Just acknowledging I've seen your post but I haven't had a chance to reply yet. I'm aiming to reply with answers in the next couple of days. Cheers
> For context, we currently handle 5k to 7k requests per hour. I noticed from your issue that you're redirecting 8k requests hourly to pypistats, which is over double our current volume. Before proceeding, I need to assess our server's capacity. Though it's feasible to expand pepy's capacity, I'd prefer not to due to potential cost increases. I'll assess this once I'm back from vacation.
I wouldn't expect us to immediately send that kind of traffic your way. We've reached that level of usage with pypistats gradually over many years of carrying the day/week/monthly badges. I wouldn't expect to add a total downloads badge and immediately have that level of users. On day one, the traffic will be close to zero. PyPI badges are some of the most popular services on shields.io though.
In terms of keeping usage down, the main thing we can do is cache the badges downstream at the CDN. This means that badges embedded in the README of a popular project are only requested periodically; they mostly get served from cache. Our default max-age for a downloads badge is 20 mins. Given you are only updating the data once per day, I'd suggest we set a much longer max-age for pepy. That should keep the traffic lower.
Side note: thinking this through and writing it up has made me realise that pypistats are also only updating daily and we haven't customised the default there either :grimacing: They've never complained about it, but I am going to submit a PR which will also reduce the amount of traffic we're sending their way.
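To make the caching argument concrete, here is a minimal sketch of how a longer max-age could be derived from how often the upstream data changes. Only the 20 minute default and the once-per-day update come from the discussion above; the function names and the "quarter of the update interval" rule are assumptions for illustration, not how Shields actually chooses cache lengths.

```python
DEFAULT_MAX_AGE_SECONDS = 20 * 60  # the 20 minute default mentioned above
SECONDS_PER_DAY = 24 * 60 * 60


def suggested_max_age(update_interval_seconds, fraction=0.25):
    """Hypothetical rule: cache for a fraction of the upstream update
    interval, but never for less than the default."""
    return max(DEFAULT_MAX_AGE_SECONDS, int(update_interval_seconds * fraction))


def cache_control_header(max_age_seconds):
    """Build a Cache-Control value for browsers (max-age) and shared caches (s-maxage)."""
    return f"max-age={max_age_seconds}, s-maxage={max_age_seconds}"


def max_refetches_per_day(max_age_seconds):
    """Upper bound on refetches per cached badge per day, assuming every
    expiry triggers exactly one new upstream request (ceiling division)."""
    return -(-SECONDS_PER_DAY // max_age_seconds)


if __name__ == "__main__":
    daily = suggested_max_age(SECONDS_PER_DAY)
    print(cache_control_header(daily))
    print(max_refetches_per_day(DEFAULT_MAX_AGE_SECONDS), "vs", max_refetches_per_day(daily))
```

Under these assumptions, a badge cached for 20 minutes can be refetched up to 72 times a day, while one cached for six hours is refetched at most 4 times, which is the kind of reduction being suggested for a source that only changes daily.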
> Do you experience any peak traffic times we should be aware of?
Our demand curve is pretty predictable. We serve most traffic during working hours for Europe and North America and least when it is daylight over the Pacific Ocean. We also see a dip on the weekends. We scale our own infra based on scheduled events rather than in response to traffic.
> Is it possible for me to introduce an API key for shields.io?
Short answer: Yes. Slightly longer answer:
> Are you primarily interested in summary stats (total, monthly, weekly)? If so, I could set up a dedicated endpoint to reduce the database load.
I think the only number we would want from pepy is `total_downloads`. If you wanted to set up a more efficient endpoint that only returns that, that would be cool.
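As a rough illustration of how little a badge needs from the response, the sketch below pulls a single `total_downloads` number out of a JSON body. The payload shape (a top-level object with an integer `total_downloads` field) and the function name are assumptions made for this example; the real pepy response may carry more fields or a different structure.

```python
import json


def extract_total_downloads(body):
    """Return total_downloads from a JSON body, validating the assumed shape."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    value = data.get("total_downloads")
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("total_downloads is missing or not a non-negative integer")
    return value


if __name__ == "__main__":
    # Hypothetical payload, for illustration only
    sample = '{"id": "example-package", "total_downloads": 123456}'
    print(extract_total_downloads(sample))
```

A dedicated endpoint that returns only this field would let the server skip whatever work the other fields require, which is the efficiency gain being discussed.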
> Keep in mind, Pepy is provided as a best effort service. Would any downtime be a significant issue for you?
In general we try to avoid adding badges for services which we know to be unreliable. It provides a poor experience for users and generates support requests for us. That said, there isn't like a minimum uptime threshold or anything. If you're regularly experiencing a lot of downtime, I'd be hesitant to add this. If you just do your best but don't provide an SLA, that's fine. Shields is also a volunteer run service.
One important thing to note -- is it still the case that PePy includes downloads from all sources? That is, from PyPI and from all mirrors (such as bandersnatch, z3c.pypimirror, Artifactory, and devpi)?
For example, see https://github.com/psincraian/pepy/issues/164 where people have noticed the PePy numbers are much inflated compared with pypistats, for which most endpoints are without mirrors (and one endpoint includes both with and without). See their FAQ.
If mirrors are included by PePy, can an endpoint be provided for Shields.io that only gives PyPI numbers?
If not, can Shields.io be careful not to misleadingly label the badge as a PyPI one, and name it some other way?
PS Thank you both for all your work on PePy and Shields.io, they're both excellent tools! :clap:
This point about including/excluding mirrors is noted in https://github.com/badges/shields/issues/4319#issuecomment-1682919057
I've added an additional note on it to https://github.com/badges/shields/issues/4319#issuecomment-1697996155
Let me try to answer from my phone:
Ok, so then this will apply to only new badges. I think it will be much easier for me to predict the traffic and see if the service is struggling.
Perfect 👍 similar to what I observed then.
Mainly, my idea is to do rate limiting. I would rather not have unknown traffic overloading the service. I can set a higher limit for shields, like 10x the current traffic.
Likewise, I can still keep the endpoint public but with a much lower threshold, like 1 request per second. Will this make it easier for your CI?
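The two-tier idea described here (a generous limit for a known API key, a low limit for anonymous traffic) can be sketched with a token bucket. This is an illustration under stated assumptions, not pepy's actual implementation: the class names, the key `"shields-key"`, and the rate of 20 requests per second are all hypothetical, and anonymous callers share one bucket here for brevity where a real service would more likely keep one per client address.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and refills at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Per-key buckets for known API keys, one shared bucket for everyone else."""

    def __init__(self, public_rate=1.0, key_rates=None, clock=time.monotonic):
        self.public_rate = public_rate
        self.key_rates = dict(key_rates or {})
        self.clock = clock
        self.buckets = {}

    def allow(self, api_key=None):
        known = api_key in self.key_rates
        bucket_id = api_key if known else None
        rate = self.key_rates[api_key] if known else self.public_rate
        if bucket_id not in self.buckets:
            capacity = max(1, int(rate))
            self.buckets[bucket_id] = TokenBucket(rate, capacity, self.clock)
        return self.buckets[bucket_id].allow()


if __name__ == "__main__":
    limiter = RateLimiter(public_rate=1.0, key_rates={"shields-key": 20.0})
    print(limiter.allow(), limiter.allow())  # second anonymous call in the same instant is refused
    print(limiter.allow("shields-key"))
```

The clock is injectable so the behaviour can be checked deterministically without sleeping.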
Perfect. Given what you said, I think you can rely on the current endpoint, and if I add a new one I can raise a pull request on your project ☺️
Understood 👍 I think we are aligned in terms of SLA. Our SLA for the last year has been >99%,
@hugovk I know that people are interested in downloads by installer, but according to the survey I ran in June it's not the top priority for the people who answered it.
I will focus on what most people are interested in, which is more historical data, and then probably implement this ☺️
Hello.
We implemented these badges a few months back.
At the moment our usage is still very low.
We recently started getting `401 Unauthorized` responses `{"message":"Invalid API Key"}` (reported in https://github.com/badges/shields/issues/9730 ) so I guess you've implemented API keys. How could we get one?
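For readers hitting the same error, a minimal sketch of sending an API key with a request and turning a failure into a readable message might look like the following. The header name `X-API-Key`, the URL, and both function names are assumptions for illustration; check the pepy documentation for the header it actually expects. The 401 body is the one quoted above.

```python
import json
import urllib.request

API_KEY_HEADER = "X-API-Key"  # assumed header name, not confirmed in this thread


def build_request(url, api_key):
    """Build (but do not send) a GET request carrying the API key."""
    headers = {API_KEY_HEADER: api_key, "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def describe_failure(status, body):
    """Turn an error status and body into a short message for logs."""
    if status == 401:
        try:
            detail = json.loads(body).get("message", "")
        except (ValueError, TypeError, AttributeError):
            detail = ""  # body was not a JSON object
        return f"unauthorized: {detail}" if detail else "unauthorized"
    if status == 429:
        return "rate limited"
    return f"upstream error {status}"


if __name__ == "__main__":
    # Placeholder URL and key; nothing is sent over the network here
    request = build_request("https://example.invalid/projects/example-package", "my-key")
    print(request.get_method(), request.full_url)
    print(describe_failure(401, '{"message":"Invalid API Key"}'))
```

Keeping request construction separate from error interpretation makes both easy to test without network access.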
Hey @chris48s
Sorry, I thought this wasn't done yet. You only need to
Let me know if you have any questions or problems ☺️
If not I will close this issue
Thanks. Sorry. I should have closed this after we merged https://github.com/badges/shields/pull/9564
I've created an account, made some keys, and checked they work. I won't have a chance to make the code updates until the weekend but it should be straightforward.
I'll close this now. Cheers.
Hello. There's a thread over on the shields.io repo about adding a PyPI Total Downloads badge using pepy as the source
https://github.com/badges/shields/issues/4319
Before taking that conversation further, I wanted to open an issue to discuss because:
Are you able to give us an initial indication of whether you'd be happy with us adding this? Cheers