Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
In the past 7 days, the top user agents that have an "empty string" category but are clearly not NuGet client implementations are:
| UserAgent | DownloadCount |
| --- | --- |
| (unknown) | 700495 |
| Veracode | 168417 |
| NuGetTestModeEnabled | 100279 |
| NuGetMirror/4.4.0 | 86027 |
| Knapcode.ExplorePackages.Bot/4.7.0 | 69128 |
| Go-http-client/2.0 | 63458 |
| okhttp/3.9.0 | 62224 |
| EdgeAccel/2.0 | 28524 |
| Mozilla/5.0 NuGet | 14852 |
| Python-urllib/2.7 | 9197 |
Clearly, some of these are scripts that shouldn't be included in the download counts. We should implement a way for user agents matching a certain pattern (perhaps those containing the substring `(bot)`) to be excluded from the download counts. We should also document this approach in the API docs and recommend that users specify a user agent (RFC should 😄?).
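To make the recommendation concrete, here is a minimal sketch of how a script author might identify their tool and opt out of download counts. It assumes the `(bot)` substring ends up being the chosen marker, which is only a suggestion at this point. The tool name, version, and `build_request` helper are hypothetical, and the URL is used purely for illustration; no request is actually sent.

```python
from urllib.request import Request


def build_request(url, tool_name, version):
    """Build a request whose user agent names the tool and carries the marker.

    The "(bot)" suffix is an assumption: it is the substring floated in this
    issue, not an agreed-upon convention.
    """
    user_agent = f"{tool_name}/{version} (bot)"
    return Request(url, headers={"User-Agent": user_agent})


# Hypothetical tool name and version, for illustration only.
request = build_request(
    "https://api.nuget.org/v3/index.json", "ExampleMirror", "1.0.0"
)

# urllib normalizes header names, so the key is looked up as "User-agent".
print(request.get_header("User-agent"))  # ExampleMirror/1.0.0 (bot)
```

Sending a descriptive user agent like this would also keep such traffic out of the `(unknown)` bucket, even before any exclusion logic exists.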
We'll need to update the user agent parser to look for this pattern. Today, unexpected or custom user agents are given the client name `(unknown)` and the empty string client category.
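The parser change could look roughly like the sketch below. This is not the actual parser: the `ParsedUserAgent` type, the `counted` flag, the `"Bot"` and `"NuGet"` category names, and the prefix list are all hypothetical stand-ins, and the `(bot)` marker is just the substring suggested above. The only behavior taken from the current system is the fallback to the `(unknown)` client name with an empty string category.

```python
import re
from dataclasses import dataclass

# Assumed opt-out marker; the issue only proposes "(bot)" as one possibility.
BOT_MARKER = re.compile(r"\(bot\)", re.IGNORECASE)

# Illustrative stand-in for the real list of recognized NuGet clients.
KNOWN_CLIENT_PREFIXES = ("NuGet Command Line/",)


@dataclass(frozen=True)
class ParsedUserAgent:
    client_name: str
    client_category: str
    counted: bool  # hypothetical flag: include in download counts?


def parse_user_agent(user_agent):
    """Classify a raw user agent string."""
    ua = (user_agent or "").strip()

    # New step: anything carrying the marker is excluded from the counts.
    if BOT_MARKER.search(ua):
        return ParsedUserAgent(ua, "Bot", False)

    # Recognized clients keep their category and are counted.
    for prefix in KNOWN_CLIENT_PREFIXES:
        if ua.startswith(prefix):
            return ParsedUserAgent(prefix.rstrip("/"), "NuGet", True)

    # Existing behavior: unexpected/custom user agents fall through to
    # "(unknown)" with an empty category, and are still counted.
    return ParsedUserAgent("(unknown)", "", True)


print(parse_user_agent("ExampleMirror/1.0.0 (bot)").counted)  # False
print(parse_user_agent("Python-urllib/2.7").client_name)      # (unknown)
```

Checking the marker first means a script can opt out regardless of whatever else its user agent contains. User agents without the marker, such as `Python-urllib/2.7`, would still be counted under this sketch.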
Today, certain downloads are not counted in the download count reports, namely certain bots and crawlers.
The top unknown/crawler user agents in the past 7 days are:
Note that there is a bug in the roll-up that causes these to end up being counted after 42 days anyway (https://github.com/NuGet/NuGetGallery/issues/6552), but that's beside the point.