EBISPOT / OLS

Ontology Lookup Service from SPOT at EBI
http://www.ebi.ac.uk/ols
Apache License 2.0

How often does OLS update the ontologies? #362

Closed srobb1 closed 4 years ago

srobb1 commented 4 years ago

Hello,

I was wondering how often OLS updates the ontologies? We created a new release of our ontology (PLANA) last Monday (4/27/2020) and it has not been updated in OLS yet.

Thank you, Sofia

henrietteharmse commented 4 years ago

Plana is now updated on OLS. We had an issue wrt updates that we have addressed.

In general, how often the indexes get updated depends on the number of ontologies that need to be re-indexed. This can result in indexing running for several days. The indexer only takes into consideration updates that were available before indexing started. Thus, if an update becomes available after indexing has already started, it will only be incorporated the next time the indexer runs. This means updates can take up to 2 weeks to be reflected on OLS.
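
To make the timing rule above concrete, here is a minimal sketch of it in Python. It is only an illustration of the rule as described, not OLS code: the function name `visible_after` and the run dates are hypothetical, and only the 4/27/2020 release date comes from this thread.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# One indexer run as (start, end). The dates used below are hypothetical.
Run = Tuple[datetime, datetime]


def visible_after(release: datetime, runs: List[Run]) -> Optional[datetime]:
    """Return when a release becomes visible, or None if no listed run picks it up."""
    for start, end in sorted(runs):
        # A run only sees releases that were available before it started.
        if release < start:
            return end
    return None


runs = [
    (datetime(2020, 4, 25), datetime(2020, 4, 30)),  # already running on 4/27
    (datetime(2020, 5, 2), datetime(2020, 5, 6)),    # first run that sees the release
]
print(visible_after(datetime(2020, 4, 27), runs))  # 2020-05-06 00:00:00
```

Under these assumed dates, a release published while a multi-day run is in progress waits for that run to finish and for the next one to complete, which is how the delay can add up to about two weeks.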

srobb1 commented 4 years ago

Thank you @henrietteharmse.

giraygi commented 3 years ago

@henrietteharmse I have a loosely related question for our running instances. Do you need to reindex all ontologies when new ontologies are added or can you index them one by one?

henrietteharmse commented 3 years ago

@giraygi No. We only reindex new ontologies. In fact, we estimate that reindexing all ontologies in OLS would take 3-4 weeks.

giraygi commented 3 years ago

@henrietteharmse We want to add the capability of loading and indexing only the newly added ontologies to our TS instance. But we cannot do that with the regular docker based pipeline due to the changes in ontology labels in the indexing phase. I would really appreciate it if you could explain to us how you achieve that in your pipeline.

henrietteharmse commented 3 years ago

@giraygi We do not use docker in production due to constraints of the EBI infrastructure and OLS's tightly coupled architecture.

We have a complete preparation environment and a production environment as detailed in the architecture diagram.

A scheduled process on preparation brings down the Tomcat that serves the web app and API, then runs the configurator followed by the indexer. As-is, the OLS configurator and indexer will only run indexing for ontologies that changed. Once indexing on preparation is complete, the resulting Solr & Neo4j indexes are copied to prod.
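
For readers building their own pipeline, the "only index what changed" step can be sketched as below. This is an assumption-laden illustration, not how the OLS configurator decides: comparing SHA-256 fingerprints of the downloaded files is one possible approach, and the names `fingerprint` and `ontologies_to_index` and the sample data are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(content: bytes) -> str:
    """Content hash used to detect a changed ontology file (an assumed strategy)."""
    return hashlib.sha256(content).hexdigest()


def ontologies_to_index(previous: Dict[str, str], current: Dict[str, bytes]) -> List[str]:
    """Return ids whose content is new or differs from the last indexed fingerprint.

    previous: ontology id -> fingerprint recorded at the last indexing run
    current:  ontology id -> freshly downloaded file content
    """
    changed = []
    for ontology_id, content in sorted(current.items()):
        if previous.get(ontology_id) != fingerprint(content):
            changed.append(ontology_id)
    return changed


# Hypothetical state: "plana" has a new release, "go" is unchanged, "uberon" is new.
previous = {"plana": fingerprint(b"release 1"), "go": fingerprint(b"release 7")}
current = {"plana": b"release 2", "go": b"release 7", "uberon": b"release 1"}
print(ontologies_to_index(previous, current))  # ['plana', 'uberon']
```

Only the returned ids would then be passed to the indexing step, with the resulting indexes copied to production afterwards.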

jamesamcl commented 3 years ago

@giraygi You wrote: "But we cannot do that with the regular docker based pipeline due to the changes in ontology labels in the indexing phase." What exactly do you mean by this?

giraygi commented 3 years ago

@henrietteharmse Thank you very much for the information. I think I have a better understanding of it now.

@udp I think I confused the issue and my alternative solutions a bit. (I had assumed that the neo4j instance had to be created all over again, but when I removed its files only new ontologies were added.) Sorry for that.

The actual problem was that I was able to make a 2nd import successfully to a running system, but when I started the respective indexing task of this 2nd import, I got an error like this:

Error connecting to Neo4j embedded database, defaulting to /tmp/emptyOlsGraph. Note this will most likely be an empty Neo4j graph
java.lang.RuntimeException: Error starting org.neo4j.kernel.EmbeddedGraphDatabase, /mnt/neo4j
.... ....
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Component 'org.neo4j.kernel.StoreLockerLifecycleAdapter@143851bb' was successfully initialized, but failed to start. Please see attached cause exception.
    at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.start(LifeSupport.java:513)
    at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:115)
    at org.neo4j.kernel.InternalAbstractGraphDatabase.run(InternalAbstractGraphDatabase.java:330)
    ... 132 common frames omitted
Caused by: org.neo4j.kernel.StoreLockException: Unable to obtain lock on store lock file: /mnt/neo4j/store_lock. Please ensure no other process is using this database, and that the directory is writable (required even for read-only access)

This was probably due to the lock mechanism of the embedded neo4j instance. After I got this error, I was able to complete the installation, and in the resulting service I was able to run an indexed search, but I was not able to visualize the new ontologies in the frontend. Then I realized that I had to stop the ols-web container before starting the 2nd importing and 2nd indexing tasks. When I did this, I kept getting a connection error to the embedded graph database, but I was then able to successfully upload and index the ontologies, and the visualization turned out to be OK. So now I think I am able to carry out multiple importing and indexing tasks, and I don't see any flaws in the output anymore. But I keep getting the error. Is it the same with you?

BTW, I also tried removing the store_lock and lock files manually from the neo4j instance, but I kept getting the connection error.
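
The workaround described above (stop ols-web, run the 2nd import and indexing, then start it again) could be scripted roughly as follows. This is a sketch under assumptions: only the container name ols-web and the `docker stop` / `docker start` commands are taken as given; the helper names are hypothetical, and the import and indexing commands are deployment-specific, so they are passed in by the caller. The example call is a dry run that records commands instead of executing them, and the script does not address the connection error that is still being logged.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List, Sequence

# A runner takes one command (argv list) and executes it.
Runner = Callable[[List[str]], None]


def run_checked(command: List[str]) -> None:
    """Execute a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


@contextmanager
def container_stopped(name: str, run: Runner = run_checked) -> Iterator[None]:
    """Stop a container for the duration of the block, then start it again."""
    run(["docker", "stop", name])
    try:
        yield
    finally:
        # Restart even if a step failed, so the web app does not stay down.
        run(["docker", "start", name])


def reindex(name: str, steps: Sequence[List[str]], run: Runner = run_checked) -> None:
    """Run the given steps in order while the named container is stopped."""
    with container_stopped(name, run):
        for step in steps:
            run(step)


# Dry run with placeholder steps; real import/index commands depend on the setup.
recorded: List[List[str]] = []
reindex("ols-web", [["echo", "import step"], ["echo", "index step"]], run=recorded.append)
print(recorded)
```

Stopping the web container first means only one process opens the embedded Neo4j store while the indexer runs, which matches the observation that the new ontologies became visible once ols-web was stopped beforehand.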