geneontology / pipeline

Declarative pipeline for the Gene Ontology.
https://build.geneontology.org/job/geneontology/job/pipeline/
BSD 3-Clause "New" or "Revised" License

NEO having trouble building in most cases #312

Open kltm opened 1 year ago

kltm commented 1 year ago

Since the last successful build on 2022-12-13, NEO has been unable to complete. The failure is consistently a broken pipe during the Solr load, e.g.:

13:16:52  2023-01-11 21:16:52,175 WARN  (OWLGraphWrapperExtended:936) Unable to retrieve the value of oboInOw#id as the identifier for http://www.w3.org/2002/07/owl#Thing; we will use an original iri as the identifier.
13:16:52  2023-01-11 21:16:52,213 INFO  (FlexCollection:253) Loaded: 1788000 of 1978115, elapsed: 3:56:20.531, eta: 0:22:10.805
13:16:52  2023-01-11 21:16:52,214 INFO  (FlexSolrDocumentLoader:47) Processed 1000 flex ontology docs at 1788000 and committing...
13:39:13  Exception in thread "main" org.apache.solr.client.solrj.SolrServerException: java.net.SocketException: Broken pipe (Write failed)
13:39:13    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:475)
13:39:13    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:249)
13:39:13    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
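Going by the console timestamps in the excerpt above, about 22 minutes pass between the last "committing..." message (13:16:52) and the exception (13:39:13). A minimal sketch for spotting pauses like this in a saved console log, assuming every line of interest carries the `HH:MM:SS` prefix seen above. The `find_stalls` helper and the five-minute threshold are hypothetical and not part of the pipeline:

```python
import re
from datetime import datetime, timedelta

# Console prefix as seen in the excerpt, e.g. "13:16:52  " at the start of a line.
PREFIX = re.compile(r"^(\d{2}:\d{2}:\d{2})\s")


def find_stalls(lines, threshold=timedelta(minutes=5)):
    """Return (gap, previous_line, next_line) for each pause longer than threshold."""
    stalls = []
    prev_time, prev_line = None, None
    for line in lines:
        match = PREFIX.match(line)
        if not match:
            continue  # skip blank lines and anything without a timestamp prefix
        current = datetime.strptime(match.group(1), "%H:%M:%S")
        if prev_time is not None:
            gap = current - prev_time
            if gap < timedelta(0):
                gap += timedelta(days=1)  # assumption: the log rolled past midnight
            if gap > threshold:
                stalls.append((gap, prev_line, line))
        prev_time, prev_line = current, line
    return stalls


# Two lines copied from the excerpt above.
log = [
    "13:16:52  2023-01-11 21:16:52,214 INFO  (FlexSolrDocumentLoader:47) "
    "Processed 1000 flex ontology docs at 1788000 and committing...",
    '13:39:13  Exception in thread "main" '
    "org.apache.solr.client.solrj.SolrServerException: "
    "java.net.SocketException: Broken pipe (Write failed)",
]

for gap, before, after in find_stalls(log):
    print(f"{gap} pause after: {before[:70]}")
```

Run against the full console output, this would show whether long pauses appear only at the final failure or also earlier in the load.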

This has likely been going on longer, but went unnoticed due to #309.

This may be related to more entities being loaded over time, slowly pushing an already borderline setup past its limits.

kltm commented 1 year ago

As the number of entities here is rather smaller than in the main Solr loads, the problem is likely ontology-specific.

kltm commented 1 year ago

@vanaukenk @balhoff Okay, while experimenting is slow, an increase in memory limits did get a load through. I want to explore a little more before closing this out, though: this process (Solr + owltools) took north of 0.5 TB of RAM to get through, so there is something degenerate happening that I'd like to understand a little better. But apparently we can make loads again.