Swirrl / ook

Structural search engine
https://search-prototype.gss-data.org.uk/
Eclipse Public License 1.0

Schedule ETL with reporting #127

Open Robsteranium opened 1 year ago

Robsteranium commented 1 year ago

With #115 the ETL process ought to be quick enough to run overnight. We could create a systemd timer to fire the etl service, e.g. in etl.timer:

[Unit]
Description=OOK ETL overnight batch job

[Timer]
OnCalendar=daily
AccuracySec=12h

[Install]
WantedBy=timers.target

Note that the matching filename should cause systemd to find and fire etl.service. We might like to coordinate the exact hour to avoid competing with other services on the Muttnik box that might be running batch jobs overnight. Here OnCalendar=daily means 0:00, although AccuracySec=12h allows systemd to delay the run by anything up to 12 hours after that.
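If we do want to pin the hour, a variant of etl.timer might look like the sketch below. The 02:30 slot, the tighter AccuracySec and the Persistent setting are assumptions for illustration, not agreed values; the actual time would need checking against whatever else runs on Muttnik.

```ini
# etl.timer (sketch; the 02:30 slot is a placeholder, not an agreed time)
[Unit]
Description=OOK ETL overnight batch job

[Timer]
# Explicit calendar expression instead of "daily" (which means 00:00:00)
OnCalendar=*-*-* 02:30:00
# Fire within a minute of the scheduled time rather than a 12h window
AccuracySec=1min
# Assumption: if the box was down at 02:30, run once at next boot
Persistent=true

[Install]
WantedBy=timers.target
```

The expression can be checked with `systemd-analyze calendar "*-*-* 02:30:00"`, and `systemctl list-timers` shows the next scheduled run once the timer is enabled.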

If we're doing this then we ought also to have some way to report failures, otherwise they'll go unnoticed and we could end up breaking the site or burdening Stardog unintentionally. An email would probably suffice but it may be simplest to just follow what Muttnik is using for this - does DataDog do alerts @RicSwirrl? Indeed I think there are some vestiges of DataDog provisioning taken from Muttnik.
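For the email option, one possibility is systemd's OnFailure= hook, sketched below. Everything named here is hypothetical: the etl-failure-notify.service unit, the drop-in path, and the recipient address. It also assumes a working `mail` command (i.e. an MTA) on the box, which may not be the case.

```ini
# /etc/systemd/system/etl.service.d/failure.conf (hypothetical drop-in)
[Unit]
# Start the notifier whenever etl.service enters the failed state
OnFailure=etl-failure-notify.service

# /etc/systemd/system/etl-failure-notify.service (hypothetical unit)
[Unit]
Description=Report a failed OOK ETL run

[Service]
Type=oneshot
# Assumption: `mail` is installed and can deliver; the address is a placeholder
ExecStart=/bin/sh -c 'journalctl -u etl.service -n 50 --no-pager | mail -s "OOK ETL failed" ops@example.org'
```

This sends the last 50 journal lines from the failed run. If DataDog turns out to be the better route, the same OnFailure= hook could call whatever Muttnik uses instead of `mail`.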