ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

AG: metrics calculations fail for most of the mid-size to large ontologies. #213

Closed alexskr closed 10 months ago

alexskr commented 3 years ago

When using the AllegroGraph backend, metrics calculations fail for a large number of ontologies. Processing hangs for about 3 hours and eventually fails.

I, [2021-06-14T07:35:05.580490 #61418]  INFO -- : ["metrics_for_submission start"]
E, [2021-06-14T10:42:24.393935 #61418] ERROR -- : ["too many connection resets (due to Net::ReadTimeout with #<TCPSocket:(closed)> - Net::ReadTimeout) after 956 requests on 48037320, last used 10000.101094032 seconds ago"]
E, [2021-06-14T10:42:24.394016 #61418] ERROR -- : [#<Net::HTTP::Persistent::Error: too many connection resets (due to Net::ReadTimeout with #<TCPSocket:(closed)> - Net::ReadTimeout) after 956 requests on 48037320, last used 10000.101094032 seconds ago>]
E, [2021-06-14T10:42:24.394778 #61418] ERROR -- : ["NoMethodError: undefined method `id=' for nil:NilClass\n/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/bundler/gems/ontologies_linked_data-de150ab9388c/lib/ontologies_linked_data/models/ontology_submission.rb:1126:in `process_metrics'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/bundler/gems/ontologies_linked_data-de150ab9388c/lib/ontologies_linked_data/models/ontology_submission.rb:1058:in `process_submission'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:177:in `process_submission'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:47:in `block in process_queue_submissions'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `each'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `process_queue_submissions'\n\t/srv/ontoportal/ncbo_cron/bin/ncbo_cron:246:in `block (3 levels) in <main>'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/scheduler.rb:65:in `block (3 levels) in scheduled_locking_job'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `fork'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `block (2 levels) in scheduled_locking_job'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:43:in `lock'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:234:in `lock'\n\t/srv/ontoportal/ncbo_cron/lib/ncbo_cron/scheduler.rb:50:in `block in scheduled_locking_job'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:230:in `trigger_block'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:204:in `block in trigger'\n\t/srv/ontoportal/ncbo_cron/vendor/bundle/ruby/2.6.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/scheduler.rb:430:in `block in trigger_job'"]
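
The final NoMethodError appears to be a secondary failure: after the read timeout, `process_metrics` seems to call `id=` on a nil metrics object, which hides the real cause in the trace. A minimal sketch of a guard that would surface the original problem is shown below. The names `Metrics`, `MetricsUnavailable`, and `attach_metrics` are hypothetical stand-ins, not the actual ontologies_linked_data code.

```ruby
# Hypothetical stand-in for the real metrics model.
Metrics = Struct.new(:id, :classes)

# Hypothetical error type used to report a missing metrics result.
class MetricsUnavailable < StandardError; end

# `calculator` is any callable that returns a Metrics object, or nil when
# the calculation failed (for example, after a backend timeout).
def attach_metrics(submission_id, calculator)
  metrics = calculator.call(submission_id)

  # Fail with a clear message instead of letting `metrics.id = ...`
  # raise NoMethodError on nil.
  raise MetricsUnavailable, "no metrics computed for #{submission_id}" if metrics.nil?

  metrics.id = "#{submission_id}/metrics"
  metrics
end
```

With a guard like this, the log would report the missing metrics directly rather than an `undefined method` error several frames away from the timeout.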

graybeal commented 3 years ago

OK, that's very interesting. The same issue we have been dealing with in production, right? Looks like we'll have to dig into this, but maybe using AG tooling it will be easier.

alexskr commented 3 years ago

Example of an ontology that has problems with metrics: AGRO

alexskr commented 3 years ago

AGRO metrics complete. However, when running metrics for larger ontologies like NCIT, two out of ten AllegroGraph backends use 100% CPU and stay in this state for hours, even after the metrics process errors out.

AllegroGraph Admin Interface shows the following under "Jobs":

Jobs
ontoportal (backend) - stop SELECT ?x0 ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 ?x8 ?x9 ?x10 ?x11 ?x12 ?x13 WHERE {
  GRAPH <http://data.bioontology.org/ontologies/NCIT/submissions/1> {
    ?x0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C7057>.
?x1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x0.
?x2 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x1.
?x3 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x2.
?x4 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x3.
?x5 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x4.
?x6 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x5.
?x7 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x6.
?x8 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x7.
?x9 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x8.
?x10 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x9.
?x11 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x10.
?x12 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x11.
?x13 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x12
  } } LIMIT 1
ontoportal (backend) - stop SELECT ?x0 ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 ?x8 ?x9 ?x10 ?x11 ?x12 WHERE {
  GRAPH <http://data.bioontology.org/ontologies/NCIT/submissions/1> {
    ?x0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C7057>.
?x1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x0.
?x2 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x1.
?x3 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x2.
?x4 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x3.
?x5 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x4.
?x6 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x5.
?x7 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x6.
?x8 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x7.
?x9 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x8.
?x10 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x9.
?x11 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x10.
?x12 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x11
  } } LIMIT 1
ontoportal (backend) - stop SELECT ?x0 ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 ?x8 ?x9 ?x10 ?x11 WHERE {
  GRAPH <http://data.bioontology.org/ontologies/NCIT/submissions/1> {
    ?x0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C7057>.
?x1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x0.
?x2 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x1.
?x3 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x2.
?x4 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x3.
?x5 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x4.
?x6 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x5.
?x7 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x6.
?x8 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x7.
?x9 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x8.
?x10 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x9.
?x11 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x10
  } } LIMIT 1
ontoportal (backend) - stop SELECT ?x0 ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 ?x8 ?x9 ?x10 WHERE {
  GRAPH <http://data.bioontology.org/ontologies/NCIT/submissions/1> {
    ?x0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C7057>.
?x1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x0.
?x2 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x1.
?x3 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x2.
?x4 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x3.
?x5 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x4.
?x6 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x5.
?x7 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x6.
?x8 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x7.
?x9 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x8.
?x10 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x9
  } } LIMIT 1
ontoportal (backend) - stop SELECT ?x0 ?x1 ?x2 ?x3 ?x4 ?x5 ?x6 ?x7 ?x8 ?x9 WHERE {
  GRAPH <http://data.bioontology.org/ontologies/NCIT/submissions/1> {
    ?x0 <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C7057>.
?x1 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x0.
?x2 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x1.
?x3 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x2.
?x4 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x3.
?x5 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x4.
?x6 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x5.
?x7 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x6.
?x8 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x7.
?x9 <http://www.w3.org/2000/01/rdf-schema#subClassOf> ?x8
  } } LIMIT 1
graybeal commented 3 years ago

Ah, it’s the path to root calculations. I think that’s worth a note to Franz to see what they think about the slowness.
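
The job list above suggests that depth is probed with one query per level, each adding one more subClassOf hop below the root class. Every extra hop is another join over the same graph, which may explain why the longer chains run for hours on a large ontology. A minimal sketch of how such probe queries could be generated follows; `depth_probe_query` is a hypothetical helper, not the code that actually produced the queries above.

```ruby
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical helper: builds a query asking whether a chain of `depth`
# subClassOf links exists below `root_iri` inside `graph_iri`.
def depth_probe_query(graph_iri, root_iri, depth)
  raise ArgumentError, "depth must be >= 1" if depth < 1

  vars = (0...depth).map { |i| "?x#{i}" }

  # ?x0 is a direct subclass of the root; each ?xN is a subclass of ?x(N-1).
  patterns = vars.each_with_index.map do |var, i|
    parent = i.zero? ? "<#{root_iri}>" : vars[i - 1]
    "    #{var} <#{RDFS_SUBCLASS_OF}> #{parent} ."
  end

  [
    "SELECT #{vars.join(' ')} WHERE {",
    "  GRAPH <#{graph_iri}> {",
    *patterns,
    "  }",
    "} LIMIT 1"
  ].join("\n")
end
```

Calling `depth_probe_query(graph, root, 14)` yields a query with variables ?x0 through ?x13, the same shape as the first job listed above.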

alexskr commented 10 months ago

Addressed by moving the metrics calculation to the owlapi wrapper: https://github.com/ncbo/ontologies_api/releases/tag/v5.24.0 and https://github.com/ncbo/owlapi_wrapper/releases/tag/v1.4.0
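
For illustration only, here is why computing depth outside the triple store can be cheaper: with the class hierarchy held in memory, the maximum depth below a root can be found in one memoized traversal instead of one multi-join query per level. This is a sketch under that assumption, not the owlapi_wrapper implementation; `max_depth` and the `children` map are hypothetical.

```ruby
# children: Hash mapping a class ID to an Array of its direct subclass IDs.
# Returns the number of subClassOf hops on the longest chain below `node`.
def max_depth(children, node, memo = {}, on_path = {})
  return memo[node] if memo.key?(node)
  # Cycle guard: a node already on the current path is a back edge, so it
  # contributes no further depth and the traversal terminates.
  return 0 if on_path[node]

  on_path[node] = true
  depths = children.fetch(node, []).map do |child|
    1 + max_depth(children, child, memo, on_path)
  end
  on_path.delete(node)

  # Leaves have no children, so their depth is 0.
  memo[node] = depths.max || 0
end
```

Each class is expanded once thanks to the memo table, so the cost grows with the number of subClassOf edges rather than with the depth of the hierarchy.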