pod4lib / aggregator

POD Aggregator, f.k.a. the POD Data Lake
https://pod.stanford.edu
Apache License 2.0

Collect and report download statistics / analytics #196

Open anarchivist opened 3 years ago

anarchivist commented 3 years ago

Given the following assumptions:

We need a way to collect basic download statistics and analytics:

Breaking out downloads by token is probably only useful if we include some additional descriptive info for tokens (#197).

cbeer commented 3 years ago

We're currently collecting this information using ahoy:

001 > e = Ahoy::Event.last
#<Ahoy::Event id: 11, visit_id: 8363, user_id: 11, name: "Download", properties: {"attachment_id"=>3175, "attachment_name"=>"files", "byte_size"=>3073505837, "content_type"=>"application/zip", "filename"=>"penn_02DEC2020.zip", "organization_id"=>"penn"}, time: "2020-12-02 14:46:11"> 

002 > e.visit
#<Ahoy::Visit id: 8363, visit_token: "753b6b43-3ba1-48d2-b5f8-e35ef568ca5d", visitor_token: "391dd96f-d339-4721-b0f2-b2365371240f", user_id: 11, ip: "100.11.141.0", user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Ap...", referrer: "https://pod.stanford.edu/site_users", referring_domain: "pod.stanford.edu", landing_page: "https://pod.stanford.edu/organizations", browser: "Chrome", os: "Mac", device_type: "Desktop", country: nil, region: nil, city: nil, latitude: nil, longitude: nil, utm_source: nil, utm_medium: nil, utm_term: nil, utm_content: nil, utm_campaign: nil, app_version: nil, os_version: nil, platform: nil, started_at: "2020-12-02 14:45:17", token_id: nil, organization_id: "penn"> 

The Ahoy::Event records which file was downloaded, the organization the file belongs to, and some stats about the file (byte size, content type). The associated Ahoy::Visit records the user, the token (if any), and the organization of whoever downloaded the data.
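As a rough sketch of how those event properties could feed a per-organization report, the snippet below totals downloads and bytes by `organization_id`. It works on plain hashes shaped like the `properties` above, so it runs without Rails. The helper name `summarize_downloads` and every sample row other than the first are made up for illustration and are not part of the aggregator codebase.

```ruby
# Hypothetical helper (not part of the aggregator): summarize download
# events per organization. Each element of `events` is a Hash shaped like
# the `properties` column of the Ahoy::Event shown above.
def summarize_downloads(events)
  # Each organization starts with zero downloads and zero bytes.
  summary = Hash.new { |hash, org| hash[org] = { downloads: 0, bytes: 0 } }

  events.each do |props|
    org = props.fetch("organization_id")
    summary[org][:downloads] += 1
    # Treat a missing byte_size as 0 rather than failing.
    summary[org][:bytes] += props.fetch("byte_size", 0)
  end

  summary
end

# Sample data: the first row mirrors the event above; the other rows are
# invented purely to exercise the helper.
events = [
  { "organization_id" => "penn", "filename" => "penn_02DEC2020.zip", "byte_size" => 3_073_505_837 },
  { "organization_id" => "penn", "filename" => "penn_02DEC2020.zip", "byte_size" => 3_073_505_837 },
  { "organization_id" => "example", "filename" => "example.zip", "byte_size" => 1_024 }
]

report = summarize_downloads(events)
report.each do |org, stats|
  puts "#{org}: #{stats[:downloads]} downloads, #{stats[:bytes]} bytes"
end
```

In the application itself, the input would presumably come from something like `Ahoy::Event.where(name: "Download").pluck(:properties)`, though that query is an assumption based on the console output above rather than on the actual code. Note that this groups by the organization that owns the file; grouping by the downloading organization or token would use the visit's `organization_id` and `token_id` instead.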