psu-libraries / scholarsphere-3

A web application for ingest, curation, search, and display of digital assets. Powered by Hydra technologies (Rails, Hydra-head, Blacklight, Solr, Fedora Commons, etc.)
Apache License 2.0

Fileset stats are failing to display #1020

Closed awead closed 7 years ago

awead commented 7 years ago

When navigating to a fileset's analytics page, we're getting this error:

2017-08-10 10:39:49 -0400 (27354) Rendering 500 page due to exception: #<OAuth2::Error: {"errors"=>[{"domain"=>"global", "reason"=>"dailyLimitExceeded", "message"=>"Quota Error: profileId ga:61852754 has exceeded the daily request limit."}], "code"=>403, "message"=>"Quota Error: profileId ga:61852754 has exceeded the daily request limit."}:
{"error":{"errors":[{"domain":"global","reason":"dailyLimitExceeded","message":"Quota Error: profileId ga:61852754 has exceeded the daily request limit."}],"code":403,"message":"Quota Error: profileId ga:61852754 has exceeded the daily request limit."}}>

While this has been documented in #929, we expect that the page should still render any available download and view stats.

awead commented 7 years ago

Sufia is checking for the date of our last cached stats:

https://github.com/samvera/sufia/blob/7.2-migration/app/models/sufia/statistic.rb#L50

This is coming up as Jun 1, which is probably the last time the stats were successfully run before we migrated. Because this date is earlier than the current date, Sufia will go out to GA to get updated stats. For that, it's invoking #ga_statistics which includes a call to Sufia::Analytics.profile. As soon as we ask for the profile, OAuth returns the error.

A better solution would be to wrap that in a rescue instead of an unless here:

https://github.com/samvera/sufia/blob/7.2-migration/app/models/sufia/statistic.rb#L32

and here:

https://github.com/samvera/sufia/blob/7.2-migration/app/models/file_download_stat.rb#L10
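
A minimal sketch of the rescue idea in plain Ruby, not Sufia's actual code. Everything here is a hypothetical stand-in: `QuotaError` plays the role of the `OAuth2::Error` that GA raises, `stats_with_fallback` is an illustrative name, and the sample stats are made up.

```ruby
# Hypothetical stand-in for the OAuth2::Error raised on a quota failure.
class QuotaError < StandardError; end

# Returns the cached stats plus whatever the block fetches remotely.
# If the remote fetch fails, falls back to the cached stats alone so the
# page can still render. Names are illustrative, not Sufia's API.
def stats_with_fallback(cached_stats)
  cached_stats + yield
rescue QuotaError => e
  warn "Analytics unavailable, using cached stats only: #{e.message}"
  cached_stats
end

# Made-up sample data.
cached = [{ date: "2017-06-01", views: 3 }]

ok = stats_with_fallback(cached) { [{ date: "2017-08-09", views: 5 }] }
degraded = stats_with_fallback(cached) { raise QuotaError, "daily request limit exceeded" }

puts ok.length       # cached + fresh entries
puts degraded.length # cached entries only
```

With a rescue like this, the analytics page would show the older cached numbers instead of a 500 page when GA refuses the request.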

awead commented 7 years ago

FYI: @DanCoughlin @cam156 @olendorf

olendorf commented 7 years ago

@awead Why isn't the cache being updated? or is that a stupid question?

awead commented 7 years ago

@olendorf No, not a stupid question! See #929. For some reason, GA is telling us that we've exceeded our quota limit, but when we look at our account, we haven't. @cam156 has been looking into this and as yet, hasn't found a solution, or a reason, for the error. We had thought that even with the error, we could get by for a little while. There was still the potential for some stats to get updated, and you can still see view stats for works. However, this new issue with trying to view downloads of files is potentially more pressing since the file set analytics page isn't rendering at all now.

carolyncole commented 7 years ago

I put in a bug report for the quota error (https://issuetracker.google.com/issues/65003406). Let's hope Google will respond.

carolyncole commented 7 years ago

I got a response: we have only 10,000 requests to the reporting API per day. This means that we cannot send one query per FileSet and one per work each day. I asked again through the ticket for a quota increase, but I am not sure whether we will get it.

Browsing some of the documentation they sent, we may want to change how we request the data: issue one more general query and parse it for the individual data. We can get a report for the whole site with a query like https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3A61852754&start-date=2017-08-24&end-date=yesterday&metrics=ga%3Apageviews&dimensions=ga%3ApagePath&sort=-ga%3ApagePath&segment=gaid%3A%3A-1 (created with https://ga-dev-tools.appspot.com/query-explorer/). This gives us information on the site usage for the last day. We could then parse the site usage to fill in downloads and views for works and FileSets.
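
As a rough illustration of parsing one site-wide report into per-object counts: this assumes the rows arrive as `[pagePath, pageviews]` pairs and that object pages live under `/concern/generic_works/<id>` and `/concern/file_sets/<id>`. The helper name `views_by_object`, the sample rows, and the ids are all made up.

```ruby
# Made-up sample rows in [pagePath, pageviews] form; real data would come
# from the single site-wide GA query.
rows = [
  ["/concern/generic_works/abc123", "4"],
  ["/concern/file_sets/def456", "2"],
  ["/concern/generic_works/abc123?locale=en", "1"],
  ["/catalog", "9"]
]

# Sums pageviews per object id, split by kind (:works or :file_sets).
# Rows that are not work or file set pages are skipped.
def views_by_object(rows)
  totals = { works: Hash.new(0), file_sets: Hash.new(0) }
  rows.each do |path, count|
    match = path.match(%r{/concern/(generic_works|file_sets)/([^/?#]+)})
    next unless match
    kind = match[1] == "generic_works" ? :works : :file_sets
    totals[kind][match[2]] += count.to_i
  end
  totals
end

totals = views_by_object(rows)
puts totals[:works]["abc123"]
puts totals[:file_sets]["def456"]
```

One request like this per day would cover every work and FileSet at once, instead of spending one request per object against the 10,000 limit.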

carolyncole commented 7 years ago

Code to get all the page views and downloads from the system over the last 10 days

# `profile` is assumed to be the GA profile, e.g. from Sufia::Analytics.profile
query = Sufia::Pageview.results(profile, start_date: 10.days.ago, end_date: 1.day.ago)
query.dimensions << :pagePath
view_results = query.each { |item| puts item }

download_query = Sufia::Download.results(profile, start_date: 10.days.ago, end_date: 1.day.ago)
download_query.dimensions << :pagePath
download_query.each { |item| puts item }

Code to find out which views have not been marked in the database

concerns_results = view_results.select { |result| result.pagePath.include? 'concern' }
work_results = concerns_results.select { |result| result.pagePath.include? 'generic_works' }
# Drop results whose view count is already recorded for that work and date
missing_views = work_results.reject do |work_result|
  results = WorkViewStat.where work_id: work_result.pagePath.split('/').last, date: Date.parse(work_result.date)
  results.count == 1 && results[0].work_views == work_result.pageviews.to_i
end

Code to find out which file set views have not been marked in the database

file_set_results = concerns_results.select { |result| result.pagePath.include? 'file_sets' }
missing_file_views = file_set_results.reject do |file_result|
  results = FileViewStat.where file_id: file_result.pagePath.split('/').last, date: Date.parse(file_result.date)
  results.count == 1 && results[0].views == file_result.pageviews.to_i
end

Code to find out if the download was a work or a fileset (work downloads the first file set??)

download_query.each { |result| puts ActiveFedora::Base.find(result.pagePath.split('/').last).class }

carolyncole commented 7 years ago

We should capture the error in the application controller and output the appropriate error message.
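
One way to picture this, as a dependency-free sketch rather than the application's code: in Rails the usual hook would be `rescue_from` in `ApplicationController`. Below, a hypothetical `AnalyticsQuotaError` and `handle_request` helper stand in for that so the example runs on its own. The 503 status and the message wording are assumptions.

```ruby
# Hypothetical error standing in for the OAuth2::Error seen in the log.
class AnalyticsQuotaError < StandardError; end

# Runs the block and returns [status, body]. A quota failure becomes a
# specific message instead of an unexplained 500 page.
def handle_request
  [200, yield]
rescue AnalyticsQuotaError
  [503, "Usage statistics are temporarily unavailable."]
end

status, body = handle_request { raise AnalyticsQuotaError, "dailyLimitExceeded" }
puts status
puts body
```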

carolyncole commented 7 years ago

We are currently capturing the error and showing a generic error page (screenshot, 2017-09-11 1:15:55 pm).

Is this good enough until we deal with #950 in the 3.2 milestone?

awead commented 7 years ago

@cam156 that's probably fine. If we wanted to, we could add a custom message to

https://github.com/psu-stewardship/scholarsphere/blob/develop/config/locales/en.yml#L82

Unless we don't want to show that sort of information to the user, it could be helpful to us in the future.