ISISComputingGroup / IBEX

Top level repository for IBEX stories

ICAT: files not appearing in catalogue #4505

Closed davidkeymer closed 5 years ago

davidkeymer commented 5 years ago

First noticed on ARGUS, but it affected all instruments. Ingestion appeared to have stopped at some point over the weekend of the 6th/7th.

Fault traced to the archive "crawler" process on the ICAT machine (managed by SCD, but located in an ISIS server room) that ingests files in the ISIS archive into the catalogue. Possibly caused by changes to the access shares for one instrument on the day before.

Remedied by SCD contact restarting the crawler process to enable the catalogue to catch up with the backlog of new data files.

User who reported this suggested adding an alert for this service. Could be either local NAGIOS or SCD alert system, or both.
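As a rough illustration of what such an alert might look like, here is a minimal sketch of a Nagios-style check that compares the timestamp of the newest file in the archive with the timestamp of the newest catalogue entry. The function name `ingest_lag_status` and the warning/critical thresholds are hypothetical, and how the two timestamps would actually be obtained from the ISIS archive and ICAT is deliberately left out; only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def ingest_lag_status(newest_archived, newest_catalogued,
                      warn=timedelta(hours=2), crit=timedelta(hours=6)):
    """Return (exit_code, message) for the catalogue ingestion lag.

    newest_archived:   datetime of the newest file seen in the archive
    newest_catalogued: datetime of the newest file seen in the catalogue
    warn, crit:        hypothetical thresholds, to be tuned per site
    """
    if newest_archived is None or newest_catalogued is None:
        return UNKNOWN, "UNKNOWN - could not determine timestamps"

    # Positive lag means the archive has files the catalogue has not ingested.
    lag = newest_archived - newest_catalogued
    if lag >= crit:
        return CRITICAL, f"CRITICAL - catalogue is {lag} behind the archive"
    if lag >= warn:
        return WARNING, f"WARNING - catalogue is {lag} behind the archive"
    return OK, "OK - catalogue is up to date with the archive"


if __name__ == "__main__":
    # Illustrative values only; a real check would query both systems.
    now = datetime(2019, 7, 8, 9, 0)
    code, message = ingest_lag_status(now, now - timedelta(days=1))
    print(message)
    raise SystemExit(code)
```

A check of this shape would flag a stalled crawler even while the ICAT service itself is still responding, which is the failure mode described above.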

davidkeymer commented 5 years ago

Sounds like a very similar problem and solution to the one detailed in #4109.

DominicOram commented 5 years ago

There's no mention of TOPCAT on the developers' wiki. I personally only know a little bit about it and so would have struggled to handle this support ticket. Can you add a page to the wiki with enough info so that others could diagnose similar problems in the future?

Has the NAGIOS check now been added? Can we do that as part of this ticket or make a new ticket to do it?

FreddieAkeroyd commented 5 years ago

There are Nagios checks that the TOPCAT people asked us to add for their benefit, and these email them directly. However, these checks only flagged issues when the service was restarted on Monday. I'm not sure whether the checks are emailing the people currently involved in TOPCAT, but as the checks did not spot the problem, we need them to give us another system or service to check.

FreddieAkeroyd commented 5 years ago

There is an ICAT entry in http://sparrowhawk.nd.rl.ac.uk/footprints/ that can be used to report issues. I will add TOPCAT to the description there too.

FreddieAkeroyd commented 5 years ago

Or email isisdata@stfc.ac.uk

davidkeymer commented 5 years ago

The issue that this ticket refers to was resolved by the responsible member of SCD restarting the ingestion process.

To monitor this process, a separate ticket #4520 has been created.

DominicOram commented 5 years ago

I still can't find any documentation on the wiki about this. It doesn't need much just:

davidkeymer commented 5 years ago

Page created on the wiki under Trouble Shooting.

DominicOram commented 5 years ago

Great, thanks David!