ncbo / ncbo_cron

Jobs that run on a regular basis in the NCBO infrastructure
Other
2 stars 6 forks source link

Google Analytics generation fails randomly #44

Closed mdorf closed 12 months ago

mdorf commented 3 years ago

When running Google Analytics regeneration script, at random times and places the Google API throw these two errors. There doesn't seem to be any discernible pattern to the errors. The execution succeeds some of the times and fails the other times.

I've tried throttling the request to Google APIs, but that had no impact on the occurrence of these errors.

E, [2021-07-08T16:55:41.583970 #58704] ERROR -- : Failed rebuilding ontology analytics repository with exception: Google::Apis::TransmissionError: execution expired
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/2.6.0/openssl/buffering.rb:57:in `sysread'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/2.6.0/openssl/buffering.rb:57:in `fill_rbuff'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/2.6.0/openssl/buffering.rb:207:in `gets'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient/ssl_socket.rb:62:in `gets'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:805:in `block in parse_header'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/2.6.0/timeout.rb:103:in `timeout'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:801:in `parse_header'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:784:in `read_header'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient/session.rb:561:in `get_header'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:1299:in `do_get_header'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:1245:in `do_get_block'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:1019:in `block in do_request'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:1133:in `protect_keep_alive_disconnected'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:1014:in `do_request'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/httpclient-2.8.3/lib/httpclient.rb:856:in `request'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_client_adapter.rb:17:in `block in call'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/hurley-0.2/lib/hurley/client.rb:252:in `initialize'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_client_adapter.rb:16:in `new'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_client_adapter.rb:16:in `call'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/hurley-0.2/lib/hurley/client.rb:122:in `call_with_redirects'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/hurley-0.2/lib/hurley/client.rb:89:in `call'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/hurley-0.2/lib/hurley/client.rb:47:in `get'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:272:in `execute_once'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:107:in `block (2 levels) in execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:104:in `block in execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:96:in `execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/base_service.rb:353:in `execute_or_queue_command'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/generated/google/apis/analytics_v3/service.rb:133:in `get_ga_data'
/Users/mdorf/dev/ncbo/ncbo_cron/lib/ncbo_cron/ontology_analytics.rb:37:in `block (2 levels) in fetch_ontology_analytics'

AND

E, [2021-07-08T20:11:10.009721 #61367] ERROR -- : Failed rebuilding ontology analytics repository with exception: Google::Apis::ServerError: Server error
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:214:in `check_status'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/api_command.rb:110:in `check_status'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:179:in `process_response'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:286:in `execute_once'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:107:in `block (2 levels) in execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:104:in `block in execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:61:in `block in retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `times'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/retriable-3.1.2/lib/retriable.rb:56:in `retriable'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/http_command.rb:96:in `execute'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/lib/google/apis/core/base_service.rb:353:in `execute_or_queue_command'
/Users/mdorf/.rbenv/versions/2.6.6/lib/ruby/gems/2.6.0/gems/google-api-client-0.10.3/generated/google/apis/analytics_v3/service.rb:133:in `get_ga_data'
/Users/mdorf/dev/ncbo/ncbo_cron/lib/ncbo_cron/ontology_analytics.rb:52:in `block (2 levels) in fetch_ontology_analytics'
jvendetti commented 3 years ago

Adding some context here. When the analytics generation fails, one side effect is that the "ontology visits" chart on the BioPortal home page appears empty. The latest occurrence of this was noticed today:

Screen Shot 2021-09-02 at 3 00 58 PM

I checked the /srv/ncbo/ncbo_cron/logs/scheduler-ontology-analytics.log file on the production parsing box. There are no errors, but the Aug. 30th run shows very little output:

I, [2021-08-30T00:15:03.651349 #23572]  INFO -- : Acronym: ICO, Results: 37, Start Index: 10001
I, [2021-08-30T00:15:10.242278 #23572]  INFO -- : Acronym: DRPSNPTO, Results: 1, Start Index: 10001
I, [2021-08-30T00:15:15.820403 #23572]  INFO -- : Acronym: GEOSPECIES, Results: 40, Start Index: 10001
I, [2021-08-30T00:15:27.311785 #23572]  INFO -- : Acronym: TEO, Results: 64, Start Index: 10001
I, [2021-08-30T00:15:35.194896 #23572]  INFO -- : Acronym: OMV, Results: 28, Start Index: 10001
I, [2021-08-30T00:15:39.243969 #23572]  INFO -- : Acronym: TMO, Results: 59, Start Index: 10001
I, [2021-08-30T00:15:45.490393 #23572]  INFO -- : Acronym: OPMI, Results: 10, Start Index: 10001
I, [2021-08-30T00:15:52.090826 #23572]  INFO -- : Acronym: OFSMR, Results: 50, Start Index: 10001
I, [2021-08-30T00:16:06.683663 #23572]  INFO -- : Acronym: MOOCCUADO, Results: 6, Start Index: 10001
I, [2021-08-30T00:16:08.408231 #23572]  INFO -- : Acronym: SNOMED_ANATOMY, Results: 28, Start Index: 10001
I, [2021-08-30T00:16:22.629018 #23572]  INFO -- : Acronym: DISTEST, Results: 0, Start Index: 10001

... with data for only 11 ontologies retrieved. Analytics data isn't written to Redis until data for all ontologies is retrieved, so if there's a failure somewhere in the retrieval, no new data is stored.

graybeal commented 3 years ago

Just capturing as a note for reference if needed: https://developers.google.com/analytics/devguides/reporting/core/v4/limits-quotas#reporting_apis_request_errors

Not clear that it is related to our issues.

alexskr commented 2 years ago

perhaps switching from google-api-client to modern google ruby client https://github.com/googleapis/google-cloud-ruby could help with those issues?

"Client library gems with names that begin with google-apis- are based on an older code generation technology. They talk to a REST/JSON backend (whereas most modern clients talk to a gRPC backend) and they may not offer the same performance, features, and ease of use provided by more modern clients."

jvendetti commented 2 years ago

From @alexskr:

I think I found an easy fix for this issue. My guess is that even though google cloud analytics service doesn’t use 4store under the hood it can have random outages once in a while so application accessing it should handle that gracefully. It turns out that google-api-ruby-client has an option for retry and it is strongly encouraged. https://github.com/googleapis/google-api-ruby-client/blob/main/docs/usage-guide.md#errors--retries

google-api-ruby-client is pretty much a drop-in-replacement for the google-api-client but requires merging my remove-ncbo-resource-index pull requests I made last month.

I made changes to ncbo_cron by hand to verify that solution and it worked.

I merged all of the pull requests that removed the deprecated Resource Index code. This isn't the current blocker though. The newer google-api-ruby-client library requires a more recent version of the addressable gem:

Bundler could not find compatible versions for gem "addressable":
  In Gemfile:
    ncbo_cron was resolved to 0.0.1, which depends on
      goo was resolved to 0.0.2, which depends on
        addressable (= 2.3.5)

    ncbo_cron was resolved to 0.0.1, which depends on
      google-apis-analytics_v3 was resolved to 0.3.0, which depends on
        google-apis-core (~> 0.1) was resolved to 0.4.2, which depends on
          addressable (~> 2.5, >= 2.5.1)

See: https://github.com/ncbo/goo/issues/119

syphax-bouazzouni commented 1 year ago

Seems that this bug is still there (we have also the same issue in Agroportal) image

The solution made before https://github.com/ncbo/ncbo_cron/commit/e4535bd648d9e9cd10207bac4428a95615dc2fa7, of retrying 5 times, if failed, seems to not be enough.

I didn't really inspect this in detail (so I don't have a solution to propose), but I noticed that it always occurred at the beginning of each month (e.g now the start of May makes the visits in April null)

graybeal commented 1 year ago

I think this month is when Google Analytics old style dies.

jvendetti commented 1 year ago

This is the last communique from Google Analytics (posted to admin at biontology.org):

Universal Analytics standard properties will stop processing data on July 1, 2023. To maximize historical data and to ensure that Google Analytics 4 (GA4) meets your specific needs, we recommend you make the switch to Google Analytics 4 now. This will give you the opportunity to customize the setup including using the latest site tag.

For any customer who does not set up a GA4 property with basic settings, starting in March, we will configure one with a few basic settings consistent with the existing Universal Analytics property; this includes certain conversion events, Google Ads links, and existing website tags.

jvendetti commented 1 year ago

Echoing @syphax-bouazzouni's comment from yesterday - this issue still appears to be present.

The scheduler-ontology-analytics.log file on the production parsing box shows no errors, but analytics data isn't generated for all ontologies. For example, on May 1st the script generated results for just under 800 ontologies, then lacks any further output.

jvendetti commented 1 year ago

@mdorf - I ran the ncbo_ontology_analytics_rebuild script in production, which completed successfully:

...
I, [2023-05-04T15:34:44.257781 #10495]  INFO -- : Acronym: COVID19-IBO, Results: 7, Start Index: 10001
I, [2023-05-04T15:34:48.416028 #10495]  INFO -- : Acronym: MONDO, Results: 215, Start Index: 10001
I, [2023-05-04T15:34:52.708914 #10495]  INFO -- : Acronym: CEPH, Results: 3, Start Index: 10001
I, [2023-05-04T15:34:56.993491 #10495]  INFO -- : Acronym: HTN, Results: 8, Start Index: 10001
I, [2023-05-04T15:34:56.993641 #10495]  INFO -- : Completed ontology analytics refresh...
Completed rebuilding ontology analytics repository in 58.6 minutes.

... but the visits chart on the BP home page is still blank, even after clearing the caches.

I'm also seeing zeroes from the REST API from April onwards, e.g.:

https://data.bioontology.org/ontologies/MEDDRA/analytics

"2023":{
  "1":31245,
  "2":21387,
  "3":20252,
  "4":0,
  "5":0,
  "6":0,
  "7":0,
  "8":0,
  "9":0,
  "10":0,
  "11":0,
  "12":0
}
mdorf commented 1 year ago

The original issue was not the cause of the problem in our case. The analytics stats weren't showing up because we introduced a new Redis server in the config var:

config.ontology_analytics_redis_host

However, our CRON Analytics job was pointing to:

config.redis_host

Until recently, both were pointing to the same server, so we never encountered the bug until now.

I have NOT seen any of the earlier mentioned Google API errors in my latest tests.

syphax-bouazzouni commented 12 months ago

Is this fixed? If yes can you please tell me the commits/issue/PR to see?

alexskr commented 12 months ago

yes, this issue is ultimately fixed with the implementation of https://github.com/ncbo/ncbo_cron/issues/70