Open da2x opened 10 years ago
Daniel, if I understand correctly, a feed provider will retrieve your feed only once using the provider IP (i.e., NewsBlur) and it will report the total subscriber count as part of the user-agent? Are there any exceptions to this? Can a feed provider fetch the feed twice or more?
Not quite. From what I see in my own logs, the user-agents are the same (they include the same subscriber number), but they fetch from a different IP address pretty much every time. The way these services work is that they fetch popular feeds more often (like every five minutes) and less popular feeds less often (every 9 hours).
I think this logic would work: Look in all user-agents for " subscribers" or " readers". Match the int in front of the matched string. Strip that int from the User-Agent. Treat every request whose int-free user-agent is identical as the same client and only count it once. Use the matched int as the unique count for that client instead.
Pitfalls: The number of subscribers can grow through a day, so the same client may report several different ints.
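The logic above could be sketched roughly like this in Python. This is only an illustration, not the project's implementation: the names `SUBSCRIBER_RE` and `subscriber_counts` are made up for the example, and keeping the highest int seen per client is just one assumed way of dealing with the pitfall that the number changes during the day.

```python
import re

# Matches e.g. "220 subscribers" or "15 readers" and captures the int.
SUBSCRIBER_RE = re.compile(r"(\d+) (?:subscribers|readers)\b")


def subscriber_counts(user_agents):
    """Map each int-free User-Agent to the highest subscriber count seen."""
    counts = {}
    for ua in user_agents:
        match = SUBSCRIBER_RE.search(ua)
        if match is None:
            continue  # does not report a subscriber count
        # Cut the int out so user-agents differing only by count collapse.
        key = ua[:match.start(1)] + ua[match.end(1):]
        # Assumption: keep the largest value reported for this client.
        counts[key] = max(counts.get(key, 0), int(match.group(1)))
    return counts
```

Summing the values of the returned dict would give the feed-subscriber total to use in place of the per-request unique count. Keeping the last value seen instead of the maximum would be an equally reasonable choice.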
Damn. Looks like it has to be a hard-coded list. Found out that at least NewsBlur uses three User-Agents for different purposes, all of which report the subscriber number: “NewsBlur Page Fetcher”, “NewsBlur Feed Fetcher”, and “NewsBlur Favicon Fetcher”. Only the one called “Feed Fetcher” should be used for the subscriber number (it is the one that requests the RSS feed most often).
NewsBlur Page Fetcher - 220 subscribers - http://www.newsblur.com/site/5241507/aeyoun (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534.48.3 (KHTML, like Gecko) Version/5.1 Safari/534.48.3)
NewsBlur Feed Fetcher - 220 subscribers - http://www.newsblur.com/site/5241507/aeyoun (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534.48.3 (KHTML, like Gecko) Version/5.1 Safari/534.48.3)
NewsBlur Favicon Fetcher - 219 subscribers - http://www.newsblur.com/site/5241507/aeyoun (Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534.48.3 (KHTML, like Gecko) Version/5.1 Safari/534.48.3)
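A hard-coded list along those lines might look like the following Python sketch. The names `FEED_FETCHER_PREFIXES` and `reported_subscribers` are hypothetical, and the list only contains the one NewsBlur fetcher named above; other services would have to be added after checking their user-agents in real logs.

```python
import re

# Hypothetical hard-coded list: only user-agents starting with one of
# these prefixes are trusted to report the subscriber number.
FEED_FETCHER_PREFIXES = ("NewsBlur Feed Fetcher",)

COUNT_RE = re.compile(r"(\d+) subscribers?\b")


def reported_subscribers(user_agent):
    """Return the subscriber count for listed feed fetchers, else None."""
    # str.startswith accepts a tuple of prefixes.
    if not user_agent.startswith(FEED_FETCHER_PREFIXES):
        return None  # Page Fetcher, Favicon Fetcher, browsers, ...
    match = COUNT_RE.search(user_agent)
    return int(match.group(1)) if match else None
```

With the three samples above, only the “Feed Fetcher” line yields a number; the Page and Favicon fetchers are ignored even though they also report one.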
The User-Agent samples below are each currently counted as one unique visitor. However, each unique User-Agent should be counted as one multiplied by its number of subscribers. (Most visit from different IP addresses while reporting the same number of subscribers, so there is a risk of over-counting.) Sample implementation. It would possibly make sense to do something more interesting with feed subscriptions as well.