uclibs / scholar_uc_legacy

Source code for Scholar@UC up to version 3.x. Replaced by ucrate
Other
5 stars 1 forks source link

Verify Google Analytics Cron job is working on Prod #1767

Closed asleepysheepy closed 6 years ago

asleepysheepy commented 6 years ago

For Google Analytics views and download stats to work we need to run a cron job on the server nightly. We need to make sure the cron job is set and then verify that is it working as expected

command for the cron job:

rake hyrax:stats:user_stats
hortongn commented 6 years ago

When running RAILS_ENV=production bundle exec rake hyrax:stats:user_stats on one of the production web servers, the following error is returned.

I, [2017-11-21T13:28:42.290122 #15720]  INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2017-11-21T13:28:44.705830 #15720]  INFO -- : ActiveFedora: loading fedora config from /srv/apps/curate_uc/config/fedora.yml
I, [2017-11-21T13:28:44.706920 #15720]  INFO -- : ActiveFedora: loading solr config from /srv/apps/curate_uc/config/solr.yml
I, [2017-11-21T13:28:44.763000 #15720]  INFO -- : Attempted to init base path `prod`, but it already exists
I, [2017-11-21T13:28:45.516142 #15720]  INFO -- : Hyrax::UserStatImporter: Retried FileViewStat on #<struct Struct::UserRecord id=2, user_key="abelnp@ucmail.uc.edu", last_stats_update=Tue, 31 Oct 2017 00:00:00 +0000> for file bc386m407 too many times.
I, [2017-11-21T13:28:45.520263 #15720]  INFO -- : Hyrax::UserStatImporter: Last exception undefined method `hyrax__pageview' for #<Legato::Management::Profile:0x0056512efcc120>
rake aborted!
NoMethodError: undefined method `each' for true:TrueClass
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:120:in `tally_results'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:53:in `block in process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:30:in `block in import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/lib/tasks/stats_tasks.rake:6:in `block (3 levels) in <top (required)>'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/rake-12.2.1/exe/rake:27:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `load'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `<main>'
Tasks: TOP => hyrax:stats:user_stats
(See full trace by running task with --trace)
hortongn commented 6 years ago

See https://github.com/samvera/sufia/issues/3090 and https://github.com/psu-stewardship/scholarsphere/issues/929

asleepysheepy commented 6 years ago

https://github.com/samvera/hyrax/wiki/Analytics-workaround-for-non-production-environments The error mentioned in this appears to be the same error we are seeing now undefined methodhyrax__pageview' for #`

hortongn commented 6 years ago

@flutterflies I think we do need the config/initializers/legato.rb file you found on that wiki page. Eager loading isn't pulling in what we need for some reason. Please go ahead and open a PR that creates the legato initializer. Only change to the file is that you should use if Hyrax.config.analytics? instead of if Hyrax.config.analytics

I ran the rake task again on production making sure the needed code was loaded. It ran for a while and eventually exited with the error below. However, I think it partially worked because I'm seeing stats on some user pages like http://localhost:8001/users/baldwitw@ucmail-dot-uc-dot-edu?locale=en (I'm assuming those numbers were zero before the task ran)

The error (after 20 minutes or so of running) seems to have happened while processing user cechlibsvc@ucmail.uc.edu:

I, [2017-11-28T15:08:29.096422 #1693]  INFO -- : Hyrax::UserStatImporter: Retried FileViewStat on #<struct Struct::UserRecord id=25, user_key="cechlibsvc@ucmail.uc.edu", last_stats_update=Tue, 31 Oct 2017 00:00:00 +0000> for file bc387599d too many times.
I, [2017-11-28T15:08:29.097351 #1693]  INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"internalError", "message"=>"There was an internal error."}], "code"=>500, "message"=>"There was an internal error."}:
{"error":{"errors":[{"domain":"global","reason":"internalError","message":"There was an internal error."}],"code":500,"message":"There was an internal error."}}
rake aborted!
NoMethodError: undefined method `each' for true:TrueClass
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:120:in `tally_results'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:53:in `block in process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:30:in `block in import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/lib/tasks/stats_tasks.rake:6:in `block (3 levels) in <top (required)>'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/rake-12.2.1/exe/rake:27:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `load'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `<main>'
Tasks: TOP => hyrax:stats:user_stats
(See full trace by running task with --trace)
jamesvanmil commented 6 years ago

Next steps is to test on QA by making a new tracker.

jamesvanmil commented 6 years ago

Have verified that this is failing because the rescue_and_retry method is returning a boolean:

https://github.com/samvera/hyrax/blob/1-0-stable/app/services/hyrax/user_stat_importer.rb#L74-L88

Then, the code using the stat is only checking for blankness: https://github.com/samvera/hyrax/blob/1-0-stable/app/services/hyrax/user_stat_importer.rb#L53

hortongn commented 6 years ago

The code in PR #1794 does get us past the error we were seeing. However, there are a few other issues with the statistics code:

  1. These lines are needed: https://github.com/uclibs/scholar_uc/pull/1794#issuecomment-353616434

  2. Google was starting to reject our connections because we were hitting the API too frequently. We should have been able to set a delay with Hyrax::UserStatImporter.new(verbose: true, logging: true, delay_secs: 1) but it wasn't working for some reason. We had to hard code the delay in the code and it then worked.

  3. The rake task eventually errored with Work ID: w9505044z rake aborted! Ldp::Gone:. This is happening because the stats code is trying to access a work that has been deleted from Fedora. We need to figure out why it's still trying to access a work that's no longer there.

  4. The statistics code in Hyrax has bugs. In app/services/hyrax/user_stat_importer.rb the create_or_update_user_stats always gets passed an empty hash for the stats variable. It looks like the following commit back in Sufia broke things when the process_files and process_works methods were introduced: https://github.com/samvera/sufia/commit/51884765704d946ec6ec0795aa836e79792e5e7c

scherztc commented 6 years ago

The next Hyrax 2 sprint involves Analytics. Since the architecture is changing, I think this needs to be punted to after the hydrax 2 update. https://github.com/samvera/hyrax/milestone/15

scherztc commented 6 years ago

See Notes on : https://github.com/uclibs/scholar_uc/pull/1794

scherztc commented 6 years ago

Scheduled incremental test successful on QA.

screen shot 2018-02-15 at 10 53 09 am
hortongn commented 6 years ago

@scherztc FYI, this was the output of the cron job on QA last night:

I, [2018-02-15T01:00:14.424872 #14440]  INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:14.544115 #14440]  INFO -- : ActiveFedora: loading fedora config from /srv/apps/curate_uc/config/fedora.yml
I, [2018-02-15T01:00:14.544658 #14440]  INFO -- : ActiveFedora: loading solr config from /srv/apps/curate_uc/config/solr.yml
I, [2018-02-15T01:00:14.707046 #14440]  INFO -- : Attempted to init base path `prod`, but it already exists
I, [2018-02-15T01:00:29.899956 #14440]  INFO -- : Hyrax::UserStatImporter: User stats import complete.
I, [2018-02-15T01:00:29.900047 #14440]  INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:29.900057 #14440]  INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:30.057574 #14440]  INFO -- : Hyrax::UserStatImporter: User stats import complete.
I, [2018-02-15T01:00:30.057591 #14440]  INFO -- : Hyrax::UserStatImporter: User stats import complete.

So no errors or warnings that I can see.

scherztc commented 6 years ago

The rake is running and populating statistics. We are hitting a quota limit with google api. We are going to keep an eye on the status.

I, [2018-03-02T12:04:03.525179 #15644] INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"dailyLimitExceeded", "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}], "code"=>403, "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}: {"error":{"errors":[{"domain":"global","reason":"dailyLimitExceeded","message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}],"code":403,"message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}}

scherztc commented 6 years ago

The Google Analytics API quota for a view is 10000 requests per day.

scherztc commented 6 years ago

Hyrax::UserStatImporter: Retried WorkViewStat on #<struct Struct::UserRecord id=129, user_key="wallroj@ucmail.uc.edu", last_stats_update=Sun, 18 Mar 2018 00:00:00 UTC +00:00> for work bc389032b too many times. I, [2018-03-20T10:28:58.262764 #26277] INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"dailyLimitExceeded", "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}], "code"=>403, "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}: {"error":{"errors":[{"domain":"global","reason":"dailyLimitExceeded","message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}],"code":403,"message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}}

scherztc commented 6 years ago

Here is Penn State's refactor of the User Stats Importer: https://github.com/psu-stewardship/scholarsphere/commits/develop/app/services/user_stats_importer.rb

Do we want to try this since Analytics is getting an overhaul in 2.x?

scherztc commented 6 years ago

We are gathering analytics on works and files. Users can see their view analytics from the work or file show page. The tally's are still incomplete. I think we should move on, since these tallys don't exist in the Dashboard in Hyrax 2.

scherztc commented 6 years ago

Stack Overflow article about GA : https://stackoverflow.com/questions/23919028/google-analytics-daily-limit-exceeded-quota-error-register-application-prope

scherztc commented 6 years ago

Further evidence that this Dashboard Tally feature is not supported in Hyrax 2: https://github.com/samvera/hyrax/wiki/Hyrax-Management-Guide#populating-the-analytics-db