ncbo / ncbo_cron

Jobs that run on a regular basis in the NCBO infrastructure

reduce frequency of sending email notifications for ontology pull failures. #72

Open alexskr opened 10 months ago

alexskr commented 10 months ago

The ontology pull mechanism sends a notification to the ontology owner when a pull fails. This notification can be useful to alert ontology owners about problems; however, it can be too noisy.

I would like to propose the following changes:

  1. Send an email only after n failures. A pull can sometimes fail due to a transient issue, so notifying end users in those cases can be annoying.
  2. Stop sending pull failure notifications every day when the pull location is bad. Currently we send a notification every day for ontologies with a bad pull location, which can be annoying even for responsible users. We should send it once and perhaps send a follow-up after n more failures, but sending it every day forever is overkill.
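
The two rules above could be sketched roughly as follows. This is only an illustration, not code from ncbo_cron: the class name `PullFailureNotificationPolicy`, the `threshold` and `follow_up_every` parameters, and their values are all assumptions made for the example.

```ruby
# Hypothetical policy object; the name and parameters are illustrative
# and do not exist in ncbo_cron.
class PullFailureNotificationPolicy
  def initialize(threshold: 3, follow_up_every: 30)
    raise ArgumentError, "threshold must be >= 1" if threshold < 1
    raise ArgumentError, "follow_up_every must be >= 1" if follow_up_every < 1

    @threshold = threshold
    @follow_up_every = follow_up_every
  end

  # failure_count is the number of failed pulls so far, including the
  # current one. Returns true only on the first failure that reaches the
  # threshold and then on every follow_up_every-th failure after it.
  def notify?(failure_count)
    return false if failure_count < @threshold

    ((failure_count - @threshold) % @follow_up_every).zero?
  end
end

policy = PullFailureNotificationPolicy.new(threshold: 3, follow_up_every: 30)
# Failure counts (out of 40 daily failures) that would trigger an email.
notified_on = (1..40).select { |count| policy.notify?(count) }
```

With these assumed values, 40 days of failures would produce two emails (on the 3rd and 33rd failure) instead of 40.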

syphax-bouazzouni commented 10 months ago

The difficulty here is how and where to save the number of failed attempts.

alexskr commented 10 months ago

If we don't want to alter the linked data model, then perhaps it can be stored in a flat file or Redis. It's far from optimal but should be sufficient.
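
As a rough sketch of the flat-file option, the counter could live in a small JSON file keyed by ontology acronym. The `PullFailureStore` class, its method names, and the file name are hypothetical, and the sketch assumes a single cron process writes to the file.

```ruby
require "json"
require "tmpdir"

# Hypothetical flat-file store for pull failure counts, keyed by ontology
# acronym. Not part of ncbo_cron; it assumes a single writer process.
class PullFailureStore
  def initialize(path)
    @path = path
  end

  def count(acronym)
    load_counts.fetch(acronym, 0)
  end

  # Adds one failure for the acronym and returns the new count.
  def increment(acronym)
    counts = load_counts
    counts[acronym] = counts.fetch(acronym, 0) + 1
    save_counts(counts)
    counts[acronym]
  end

  # Clears the counter, for example after a successful pull.
  def reset(acronym)
    counts = load_counts
    counts.delete(acronym)
    save_counts(counts)
  end

  private

  def load_counts
    File.exist?(@path) ? JSON.parse(File.read(@path)) : {}
  end

  def save_counts(counts)
    # Write to a temporary file and rename it, so a reader never sees a
    # partially written file.
    tmp = "#{@path}.tmp"
    File.write(tmp, JSON.generate(counts))
    File.rename(tmp, @path)
  end
end

store_path = File.join(Dir.mktmpdir, "pull_failures.json")
store = PullFailureStore.new(store_path)
store.increment("EXAMPLE")
store.increment("EXAMPLE")
```

If Redis were used instead, the same three operations could map onto its `GET`, `INCR`, and `DEL` commands, with `INCR` making the increment atomic.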

Also, it would be beneficial if OntoPortal could flag the pull location of an ontology as broken after many failed attempts and disable the pull. Currently the cron pull process needlessly wastes resources trying to retrieve an ontology, which will never work until someone updates the pull location.
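
One way to picture the "flag and disable" idea is the sketch below. The `PullState` class, the `max_failures` limit, and the example URLs are assumptions for illustration, and the state is kept in memory only.

```ruby
# Hypothetical in-memory pull state for one ontology; the names are
# illustrative and not part of ncbo_cron.
class PullState
  attr_reader :pull_location, :failures

  def initialize(pull_location, max_failures: 10)
    @pull_location = pull_location
    @max_failures = max_failures
    @failures = 0
  end

  def record_failure
    @failures += 1
  end

  def record_success
    @failures = 0
  end

  # The pull location counts as broken once the limit is reached.
  def broken?
    @failures >= @max_failures
  end

  # Updating the pull location clears the failures so pulls resume.
  def pull_location=(new_location)
    @pull_location = new_location
    @failures = 0
  end
end

state = PullState.new("http://example.org/old.owl", max_failures: 3)
attempts = 0
5.times do
  next if state.broken? # skip ontologies flagged as broken

  attempts += 1
  state.record_failure # simulate a download that always fails
end
```

In this run only three of the five scheduled pulls are attempted; assigning a new `pull_location` would re-enable pulling.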

syphax-bouazzouni commented 10 months ago

I vote for adding it as an attribute in the ontologies linked data model; we just need to think about it more in depth.

I think that, as the OntologySubmission model is overloaded with metadata attributes (in AgroPortal's case at least), we should have a new object (model) to store this sort of thing: error states, log paths, and everything else not directly related to the metadata but rather to the submission process.

But if you plan to do it in a relatively short time, you can just add a new attribute to the OntologySubmission model, for example attribute :pullLocationRetries, enforce: [:integer], that will be incremented each time a pullLocation does not load successfully (here: https://github.com/ncbo/ncbo_cron/blob/master/lib/ncbo_cron/ontology_pull.rb#L47C11-L47C11)
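
To make the increment-on-failure idea concrete, here is a minimal sketch that uses a plain Ruby stand-in instead of the real linked data model. `FakeSubmission`, `pull_once`, and the `downloader` callable are hypothetical names; the actual change would live in the OntologySubmission model and in ontology_pull.rb.

```ruby
# Plain Ruby stand-in for a submission. The real OntologySubmission is a
# linked data model, so this class only illustrates the counter logic.
class FakeSubmission
  attr_accessor :pull_location, :pull_location_retries

  def initialize(pull_location)
    @pull_location = pull_location
    @pull_location_retries = 0
  end
end

# Hypothetical pull step. `downloader` is any callable that raises when
# the pull location cannot be loaded.
def pull_once(submission, downloader)
  downloader.call(submission.pull_location)
  submission.pull_location_retries = 0 # success clears the counter
  true
rescue StandardError
  submission.pull_location_retries += 1 # failure increments it
  false
end

failing = ->(_url) { raise IOError, "unreachable" }
working = ->(_url) { :ok }

submission = FakeSubmission.new("http://example.org/ontology.owl")
2.times { pull_once(submission, failing) }
retries_after_failures = submission.pull_location_retries
pull_once(submission, working)
retries_after_success = submission.pull_location_retries
```

Resetting the counter on success is a design choice of this sketch; the thread does not say whether the count should be cumulative or consecutive.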

jonquet commented 10 months ago

Eventually, yes, we would need to separate "ontology/submission admin" attributes from others related to the description/metadata of an ontology/submission. Such a parameter (number of retries) is definitely an admin thing, so why not kick off the creation of a new object in the model.

Note that this is not a high priority for me. We are more at risk of losing our users because of system failures or a lack of innovation in the service we provide than because of the number of emails they get when they are no longer working on ontologies anyway.