Closed asleepysheepy closed 6 years ago
When running RAILS_ENV=production bundle exec rake hyrax:stats:user_stats
on one of the production web servers, the following error is returned.
I, [2017-11-21T13:28:42.290122 #15720] INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2017-11-21T13:28:44.705830 #15720] INFO -- : ActiveFedora: loading fedora config from /srv/apps/curate_uc/config/fedora.yml
I, [2017-11-21T13:28:44.706920 #15720] INFO -- : ActiveFedora: loading solr config from /srv/apps/curate_uc/config/solr.yml
I, [2017-11-21T13:28:44.763000 #15720] INFO -- : Attempted to init base path `prod`, but it already exists
I, [2017-11-21T13:28:45.516142 #15720] INFO -- : Hyrax::UserStatImporter: Retried FileViewStat on #<struct Struct::UserRecord id=2, user_key="abelnp@ucmail.uc.edu", last_stats_update=Tue, 31 Oct 2017 00:00:00 +0000> for file bc386m407 too many times.
I, [2017-11-21T13:28:45.520263 #15720] INFO -- : Hyrax::UserStatImporter: Last exception undefined method `hyrax__pageview' for #<Legato::Management::Profile:0x0056512efcc120>
rake aborted!
NoMethodError: undefined method `each' for true:TrueClass
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:120:in `tally_results'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:53:in `block in process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:30:in `block in import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/lib/tasks/stats_tasks.rake:6:in `block (3 levels) in <top (required)>'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/rake-12.2.1/exe/rake:27:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `load'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `<main>'
Tasks: TOP => hyrax:stats:user_stats
(See full trace by running task with --trace)
https://github.com/samvera/hyrax/wiki/Analytics-workaround-for-non-production-environments The error mentioned in this appears to be the same error we are seeing now undefined method
hyrax__pageview' for #
@flutterflies I think we do need the config/initializers/legato.rb file you found on that wiki page. Eager loading isn't pulling in what we need for some reason. Please go ahead and open a PR that creates the legato initializer. Only change to the file is that you should use if Hyrax.config.analytics?
instead of if Hyrax.config.analytics
I ran the rake task again on production making sure the needed code was loaded. It ran for a while and eventually exited with the error below. However, I think it partially worked because I'm seeing stats on some user pages like http://localhost:8001/users/baldwitw@ucmail-dot-uc-dot-edu?locale=en (I'm assuming those numbers were zero before the task ran)
The error (after 20 minutes or so of running) seems to have happened while processing user cechlibsvc@ucmail.uc.edu
:
I, [2017-11-28T15:08:29.096422 #1693] INFO -- : Hyrax::UserStatImporter: Retried FileViewStat on #<struct Struct::UserRecord id=25, user_key="cechlibsvc@ucmail.uc.edu", last_stats_update=Tue, 31 Oct 2017 00:00:00 +0000> for file bc387599d too many times.
I, [2017-11-28T15:08:29.097351 #1693] INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"internalError", "message"=>"There was an internal error."}], "code"=>500, "message"=>"There was an internal error."}:
{"error":{"errors":[{"domain":"global","reason":"internalError","message":"There was an internal error."}],"code":500,"message":"There was an internal error."}}
rake aborted!
NoMethodError: undefined method `each' for true:TrueClass
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:120:in `tally_results'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:53:in `block in process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:50:in `process_files'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:30:in `block in import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `each'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/app/services/hyrax/user_stat_importer.rb:23:in `import'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/hyrax-1.0.5/lib/tasks/stats_tasks.rake:6:in `block (3 levels) in <top (required)>'
/srv/apps/curate_uc/vendor/bundle/ruby/2.4.0/gems/rake-12.2.1/exe/rake:27:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:75:in `kernel_load'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli/exec.rb:28:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:424:in `exec'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:27:in `dispatch'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/cli.rb:18:in `start'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:30:in `block in <top (required)>'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/lib/bundler/friendly_errors.rb:122:in `with_friendly_errors'
/srv/apps/.gem/ruby/2.4.0/gems/bundler-1.16.0/exe/bundle:22:in `<top (required)>'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `load'
/srv/apps/.gem/ruby/2.4.0/bin/bundle:23:in `<main>'
Tasks: TOP => hyrax:stats:user_stats
(See full trace by running task with --trace)
Next steps is to test on QA by making a new tracker.
Have verified that this is failing because the rescue_and_retry
method is returning a boolean:
https://github.com/samvera/hyrax/blob/1-0-stable/app/services/hyrax/user_stat_importer.rb#L74-L88
Then, the code using the stat is only checking for blankness: https://github.com/samvera/hyrax/blob/1-0-stable/app/services/hyrax/user_stat_importer.rb#L53
The code in PR #1794 does get us past the error we were seeing. However, there are a few other issues with the statistics code:
These lines are needed: https://github.com/uclibs/scholar_uc/pull/1794#issuecomment-353616434
Google was starting to reject our connections because we were hitting the API too frequently.
We should have been able to set a delay with Hyrax::UserStatImporter.new(verbose: true, logging: true, delay_secs: 1)
but it wasn't working for some reason. We had to hard code the delay in the code and it then worked.
The rake task eventually errored with Work ID: w9505044z rake aborted! Ldp::Gone:
. This is happening because the stats code is trying to access a work that has been deleted from Fedora. We need to figure out why it's still trying to access a work that's no longer there.
The statistics code in Hyrax has bugs. In app/services/hyrax/user_stat_importer.rb
the create_or_update_user_stats
always gets passed an empty hash for the stats
variable. It looks like the following commit back in Sufia broke things when the process_files
and process_works
methods were introduced: https://github.com/samvera/sufia/commit/51884765704d946ec6ec0795aa836e79792e5e7c
The next Hyrax 2 sprint involves Analytics. Since the architecture is changing, I think this needs to be punted to after the hydrax 2 update. https://github.com/samvera/hyrax/milestone/15
See Notes on : https://github.com/uclibs/scholar_uc/pull/1794
Scheduled incremental test successful on QA.
@scherztc FYI, this was the output of the cron job on QA last night:
I, [2018-02-15T01:00:14.424872 #14440] INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:14.544115 #14440] INFO -- : ActiveFedora: loading fedora config from /srv/apps/curate_uc/config/fedora.yml
I, [2018-02-15T01:00:14.544658 #14440] INFO -- : ActiveFedora: loading solr config from /srv/apps/curate_uc/config/solr.yml
I, [2018-02-15T01:00:14.707046 #14440] INFO -- : Attempted to init base path `prod`, but it already exists
I, [2018-02-15T01:00:29.899956 #14440] INFO -- : Hyrax::UserStatImporter: User stats import complete.
I, [2018-02-15T01:00:29.900047 #14440] INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:29.900057 #14440] INFO -- : Hyrax::UserStatImporter: Begin import of User stats.
I, [2018-02-15T01:00:30.057574 #14440] INFO -- : Hyrax::UserStatImporter: User stats import complete.
I, [2018-02-15T01:00:30.057591 #14440] INFO -- : Hyrax::UserStatImporter: User stats import complete.
So no errors or warnings that I can see.
The rake is running and populating statistics. We are hitting a quota limit with google api. We are going to keep an eye on the status.
I, [2018-03-02T12:04:03.525179 #15644] INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"dailyLimitExceeded", "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}], "code"=>403, "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}: {"error":{"errors":[{"domain":"global","reason":"dailyLimitExceeded","message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}],"code":403,"message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}}
The Google Analytics API quota for a view is 10000 requests per day.
Hyrax::UserStatImporter: Retried WorkViewStat on #<struct Struct::UserRecord id=129, user_key="wallroj@ucmail.uc.edu", last_stats_update=Sun, 18 Mar 2018 00:00:00 UTC +00:00> for work bc389032b too many times. I, [2018-03-20T10:28:58.262764 #26277] INFO -- : Hyrax::UserStatImporter: Last exception {"errors"=>[{"domain"=>"global", "reason"=>"dailyLimitExceeded", "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}], "code"=>403, "message"=>"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}: {"error":{"errors":[{"domain":"global","reason":"dailyLimitExceeded","message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}],"code":403,"message":"Quota Error: profileId ga:92259381 has exceeded the daily request limit."}}
Here is Penn State's refactor of the User Stats Importer: https://github.com/psu-stewardship/scholarsphere/commits/develop/app/services/user_stats_importer.rb
Do we want to try this since Analytics is getting an overhaul in 2.x?
We are gathering analytics on works and files. Users can see their view analytics from the work or file show page. The tally's are still incomplete. I think we should move on, since these tallys don't exist in the Dashboard in Hyrax 2.
Stack Overflow article about GA : https://stackoverflow.com/questions/23919028/google-analytics-daily-limit-exceeded-quota-error-register-application-prope
Further evidence that this Dashboard Tally feature is not supported in Hyrax 2: https://github.com/samvera/hyrax/wiki/Hyrax-Management-Guide#populating-the-analytics-db
For Google Analytics views and download stats to work we need to run a cron job on the server nightly. We need to make sure the cron job is set and then verify that is it working as expected
command for the cron job: