Open shadiakiki1986 opened 4 years ago
I don't plan to add this feature, but I may add a similar one that lists the source of downloads. If you look at your downloads in BigQuery, you can see the following results:
| row | details_installer_name | downloads |
|---|---|---|
| 1 | Browser | 122 |
| 2 | pip | 71 |
| 3 | requests | 96 |
| 4 | null | 48 |
| 5 | bandersnatch | 2886 |
As you can see, the bandersnatch mirror accounts for 2886 downloads. That seems like a lot, but a mirror can be installed locally (link). So if I have a local mirror and I install your package from it, that download will not be taken into account.
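To put the table above in perspective, here is a minimal Python sketch that works out how much of the total comes from bandersnatch. The counts are copied from the table; the variable and function names are illustrative only, and treating bandersnatch as the only mirror client in this result set is an assumption.

```python
# Download counts per installer, copied from the BigQuery results above.
# None stands in for the row where details_installer_name is null.
downloads_by_installer = {
    "Browser": 122,
    "pip": 71,
    "requests": 96,
    None: 48,
    "bandersnatch": 2886,
}

# Assumption: bandersnatch is the only mirror client in this result set.
MIRROR_INSTALLERS = {"bandersnatch"}


def split_mirror_downloads(counts, mirrors=MIRROR_INSTALLERS):
    """Return (mirror_total, non_mirror_total) for a dict of installer counts."""
    mirror_total = sum(n for name, n in counts.items() if name in mirrors)
    non_mirror_total = sum(n for name, n in counts.items() if name not in mirrors)
    return mirror_total, non_mirror_total


mirror_total, non_mirror_total = split_mirror_downloads(downloads_by_installer)
total = mirror_total + non_mirror_total
mirror_share = mirror_total / total

print(f"total={total}, mirrors={mirror_total}, non-mirrors={non_mirror_total}")
print(f"mirror share: {mirror_share:.1%}")
```

With these numbers, the mirror row makes up roughly nine tenths of all recorded downloads, which is why the choice of whether to count mirrors changes the headline figure so much.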
What I normally do is filter for only pypi in my package query
Filter for pip* (typo)
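A pip-only filter could look something like the sketch below. It only builds the SQL text and does not run it. The table name `bigquery-public-data.pypi.file_downloads` and the nested columns `file.project`, `details.installer.name`, and `timestamp` are assumptions based on the public PyPI dataset (the results earlier in this thread show a flattened `details_installer_name` column, so the query used there may have differed). The function name and the time window are purely illustrative.

```python
def build_pip_only_query(days=30):
    """Build a BigQuery SQL string counting pip downloads for one project.

    The project name is left as the named parameter @project so that it
    can be supplied separately by whatever client runs the query.
    """
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return f"""
SELECT COUNT(*) AS downloads
FROM `bigquery-public-data.pypi.file_downloads`
WHERE file.project = @project
  AND details.installer.name = 'pip'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)
""".strip()


print(build_pip_only_query(days=7))
```

Restricting the time window matters in practice because the table is large and BigQuery bills by the amount of data scanned.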
For what it's worth, I can also chime in to confirm that my download stats on pepy.tech are much, much higher than they are (or were) on pypistats.org.
That's all I wanted to say. No need to reply to this post.
The remainder of this message is a discussion which is not directly relevant to this thread, but I wasn't sure if it was appropriate to start a new issue. Feel free to ignore.
Before I used pypi, when my software was hosted on my own web page, the majority of downloads came from the same few IP addresses. (For example, I remember that one IP address downloaded my software over 10000 times. This was back when it was legal to keep track of visitor IP addresses.) Is it possible to use BigQuery to estimate the number of unique users (by discarding downloads from the same IP)? (Forgive me. I know nothing about BigQuery.)
When I used pypistats.org, it was able to show which version of Python the users who downloaded my project were using (e.g. 2.7, 3.5, 3.7, etc.). This was interesting, but it's not essential. I only mention it here because it seemed that (even after excluding downloads from mirrors) the majority of downloads for my small project were from users whose Python version is "unknown" and whose OS is also "unknown". Are these downloads legitimate? Should we exclude them?
Thanks for creating this service.
I also think it would be useful to have the option to choose the type of stats! Have there been any updates on this or are there any plans to add this in the future?
Hi @laurahanu, currently we are saving download stats without mirrors. Now we need to make changes to the API and to the frontend app :-)
Hi @psincraian, thanks for the reply and good to hear! Looking forward!
> currently we are saving download stats without mirrors. Now we need to make changes to the API and to the frontend app :-)
@psincraian I'm also looking forward to that, and thanks PePy as a whole!
I just wanted to add some thoughts on this, hopefully not too off-topic. I know these are not trivial issues, and I'm aware of the discussion on why PyPI doesn't include stats itself. I also imagine these issues don't matter much for packages with a large number of downloads.
> As you can see here the bandersnatch mirror has 2866 downloads. It seems quite a lot, but the mirror can be installed locally link. So if I have a mirror locally and I install your packaging from the mirror, this download will not be took into account.
Since the mirrors seem to download all files, they can greatly inflate the numbers for packages that have few users but ship binary wheels for many Python versions and platforms. I believe a total without mirrors will help a lot in those cases.
For instance, using BigQuery directly* a few weeks ago, one of my packages had:
pip
installer and details
(* I was using the old `downloads` table for this, not `file_downloads`.)
@jewettaij mentioned:
> the majority of downloads for my small project were from users whose python version is "unknown" and whose OS is also "unknown". Are these downloads legitimate? Should we exclude them?
Besides those, which usually reflect that the fields are null in the BigQuery table, I noticed some other weird things. For example, I'm not sure how the "country_code" is filled in the BigQuery data, even when restricted to "pip" as the installer. For my niche package, I noticed from the data that country_code=US is disproportionally larger than everything else, so I wonder:
Hi @psincraian, have there been any updates on the API or on the frontend side? If not, is there a timeline for when this will be included?
Hey there. Awesome project. Is it possible to get a badge from pepy without the mirrors? For my project, the mirror stats are much larger than the non-mirror ones because it's still a young project. I wouldn't want the badge on my README to be misleading.
References
https://pepy.tech/project/isitfit
https://pypistats.org/packages/isitfit