ocaml-opam / opam2web

A tool to generate a website from an opam repository
https://opam.ocaml.org
Other
56 stars 28 forks

download statistics inflated by travis #79

Open ygrek opened 10 years ago

ygrek commented 10 years ago

Is it possible to account for downloads from Travis CI and subtract them from the download stats?

avsm commented 10 years ago

Is it really worth the trouble? That number will also be inflated by web crawlers. If the downloads are that low, the number is well within the error margins...

On 27 Nov 2013, at 02:14, ygrek notifications@github.com wrote:

Is it possible to account for downloads from Travis CI and subtract them from the download stats?

ygrek commented 10 years ago

Crawlers do not usually download random archives, and the majority of them can be diverted with robots.txt. Every build of every reverse dependency on Travis, on the other hand, downloads the package several times, so I guess that can substantially inflate the number.

AltGr commented 10 years ago

If we have a list of Travis IPs, it should be quite straightforward to filter them out during log parsing.
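
A minimal sketch of that kind of filtering, in Python for illustration only (this is not opam2web's actual log parser). It assumes the server writes common/combined-format access logs, and the `CI_NETWORKS` ranges are placeholder documentation blocks (RFC 5737), not real Travis addresses; `is_ci` and `count_downloads` are hypothetical names.

```python
import ipaddress
import re
from collections import Counter

# ASSUMPTION: placeholder ranges (RFC 5737 documentation blocks),
# standing in for whatever list of CI addresses is actually available.
CI_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# ASSUMPTION: common/combined log format, i.e.
# ip ident user [timestamp] "METHOD path protocol" status ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def is_ci(ip_text):
    """Return True if the client address falls inside a known CI range."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        # Not a literal IP address (e.g. a resolved hostname): keep the hit.
        return False
    return any(ip in net for net in CI_NETWORKS)


def count_downloads(lines):
    """Count successful GET requests per path, skipping CI clients."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # unparseable line
        if m["method"] != "GET" or m["status"] != "200":
            continue  # only count successful downloads
        if is_ci(m["ip"]):
            continue  # drop requests coming from CI ranges
        counts[m["path"]] += 1
    return counts
```

Counting the same log once with and once without the `is_ci` check would also give a rough estimate of how much of the total comes from CI.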

ygrek commented 10 years ago

maybe they use some specific user-agent?

avsm commented 10 years ago

This is a total waste of time. Travis makes about 6 requests per pull request and doesn't do so regularly at all. Is there any evidence of this inflation other than that?

On 29 Nov 2013, at 15:56, ygrek notifications@github.com wrote:

maybe they use some specific user-agent?

ygrek commented 10 years ago

I don't really know. I could try to investigate with the logs. I was just looking at the extlib download numbers (for 1.6.0) and saw them grow rapidly in the first day after release, which was quite unexpected, so I am trying to find the explanation.