agroportal / project-management

Repository used to consolidate documentation about the AgroPortal project and track content related issues.
http://agroportal.lirmm.fr

CO pull location update each day #167

Open syphax-bouazzouni opened 2 years ago

syphax-bouazzouni commented 2 years ago

The CO ontologies are updated every day at 18:00 from their pull URL.

Ontologies concerned:

image

This started in November 2021 (CO_325 is shown here as an example):

image

This may be the cause of these problems:

To do:

jonquet commented 2 years ago

Long-term resolution: avoid re-parsing an ontology when the file retrieved by the automatic pull is exactly the same as the existing source file. See #171

Solution for CO ontologies:

jonquet commented 2 years ago

List of ontologies to process (28):
CO_358, CO_350, CO_357, CO_345, CO_339, CO_335, CO_338, CO_348, CO_325, CO_331, CO_346, CO_341, CO_330, CO_327, CO_360, CO_322, CO_337, CO_366, CO_321, CO_340, CO_320, CO_324, CO_343, CO_365, CO_334, CO_336, CO_356, CO_323

All ontologies unplugged => "pullLocation" : ""

Notes:

Also unplugged POLAPGEN_BARLEY, CO_121 and CO_020, whose pull URLs were generating errors in the log (but no notification email). To be fixed when we resume all pull URLs.

syphax-bouazzouni commented 2 years ago

@jonquet I checked the code to understand how the ncbo_cron job figures out that a new version of an ontology has been released. I found that it does not look at the HTTP headers: it downloads each ontology every day and hashes it to compare it with the local copy (see the code below; source: https://github.com/ontoportal-lirmm/ncbo_cron/blob/master/lib/ncbo_cron/ontology_pull.rb#L54)

image
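For readers without access to the screenshot, the download-and-hash check described in this comment can be sketched roughly as follows. This is a minimal illustration in Python, not the actual ncbo_cron (Ruby) code: the function names `file_digest` and `is_new_version` are hypothetical, and SHA-256 is an assumption, since the thread does not say which hash algorithm is used.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()  # assumption: the real job may use another hash
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_new_version(downloaded: Path, local: Path) -> bool:
    """Report whether the freshly pulled file differs from the local copy.

    A missing local copy is treated as a new version.
    """
    if not local.exists():
        return True
    return file_digest(downloaded) != file_digest(local)
```

With a check like this, a pull that returns a byte-identical file would be reported as unchanged, so any content difference in the served file, however small, is enough to count as a new version.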

syphax-bouazzouni commented 2 years ago

Summary of what to do (after the latest updates):

jonquet commented 2 years ago

After discussion with @marieALaporte, we will either: