ncbo / ncbo_cron

Jobs that run on a regular basis in the NCBO infrastructure

Intermittent RemoteFileException during nightly pull for NCIT ontology #36

Closed by jvendetti 2 years ago

jvendetti commented 4 years ago

Received a report from Gilberto at NIH that BioPortal intermittently notifies them of failures to access the NCIT ontology source file from their pull location:

I've been receiving these notices frequently, but when I go to check I have no problems accessing/downloading the vocab. Could it be some conflict with the nightly time when our systems group does maintenance, backup, or some related activity?

Investigated the log files for our nightly pull process and found 14 instances of failures to access a file at their pull location since January of this year, e.g.:

I, [2019-06-16T18:08:12.403175 #22875]  INFO -- : RemoteFileException: No submission file at pull location ftp://ftp1.nci.nih.gov/pub/cacore/EVS/rdf/Thesaurus.owl for ontology NCIT.
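A rough sketch of how failures like the one above could be tallied from the pull logs. It assumes every failure line follows the same format as the quoted entry; the function name `count_failures` and the per-month grouping are illustrative and not part of ncbo_cron.

```python
import re
from collections import Counter

# Matches a log line shaped like the entry quoted above (assumed format).
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.\d+ #\d+\]\s+\w+ -- : "
    r"RemoteFileException: No submission file at pull location (?P<url>\S+) "
    r"for ontology (?P<acronym>[\w-]+)\."
)

def count_failures(lines, acronym="NCIT"):
    """Count RemoteFileException lines per month (YYYY-MM) for one ontology."""
    per_month = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("acronym") == acronym:
            per_month[m.group("ts")[:7]] += 1  # keep only the YYYY-MM prefix
    return per_month

sample = [
    "I, [2019-06-16T18:08:12.403175 #22875]  INFO -- : RemoteFileException: "
    "No submission file at pull location "
    "ftp://ftp1.nci.nih.gov/pub/cacore/EVS/rdf/Thesaurus.owl for ontology NCIT.",
]
print(count_failures(sample))
```

To run it against real logs, pass an open file handle (or any iterable of lines) in place of `sample`.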

Our nightly pull process begins at 6PM Pacific, and attempts to check for new versions of NCIT generally occur between 6:06 and 6:11PM Pacific. Gilberto mentioned that this overlaps with their backup process, which launches nightly at 9PM Eastern (6PM Pacific):

This server is part of a cluster and the nightly backups start at 9 PM eastern, it might be first or last in the queue, which explains the randomness as well as the time of the failures

jvendetti commented 3 years ago

I suggested to Gilberto that we try moving the start time of our nightly pull process to 8PM Pacific. He is in agreement with moving forward:

The engineer says:

"The sub-client for [backups of this ftp server] is finished by 11 most nights but not all"

It's probably good to give this time a shot (8 pm pacific). Likely there will be the occasional failure but maybe not as often as at 6 pm pacific.
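The timing argument can be checked with a small sketch. It assumes the backup window runs from 9 PM to 11 PM Eastern (the "finished by 11" figure holds most nights but not all), and it uses fixed summer offsets (PDT = UTC-7, EDT = UTC-4) that match the June 2019 log entry. The function name `in_backup_window` and the 8:06 PM check time are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed daylight-saving offsets, valid for the June 2019 log entry.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")

def in_backup_window(ts, start_hour=21, end_hour=23):
    """True if ts falls within [start_hour, end_hour) Eastern time."""
    eastern = ts.astimezone(EDT)
    return start_hour <= eastern.hour < end_hour

# Timestamp of the failure quoted from the log (6:08 PM Pacific).
failure = datetime(2019, 6, 16, 18, 8, 12, tzinfo=PDT)
# Hypothetical NCIT check time if the pull started at 8 PM Pacific.
proposed = datetime(2019, 6, 16, 20, 6, 0, tzinfo=PDT)

print(failure.astimezone(EDT).strftime("%H:%M %Z"), in_backup_window(failure))
print(proposed.astimezone(EDT).strftime("%H:%M %Z"), in_backup_window(proposed))
```

Under these assumptions the 6 PM Pacific run lands inside the backup window and the 8 PM Pacific run lands just after it, which matches the expectation that failures would become less frequent rather than disappear.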

jvendetti commented 2 years ago

As far as I can tell, the start time of the nightly pull process was never moved from 6:00PM Pacific. Recent examination of the nightly pull log files shows no occurrences of RemoteFileException errors for NCIT. There also haven't been any follow-up complaints from Gilberto about this issue recurring. Considering this closed for now.