Closed amywieliczka closed 7 months ago
Yeah, I tried to find the right level of coupling - I'd like to discuss that further.
This looks good to me too. It seems straightforward and flexible in terms of using rikolti_message to pass information to the registry that's not available in the airflow context.
I know Amy has brought up the question of how to handle clearing/rerunning of tasks and how that makes things messy w/r/t the display in the registry. But yeah, that issue just kept on popping into my head as I was thinking this through.
I see what you both mean about the tight coupling of the registry model to the airflow model. Seems like a good thing to discuss sooner rather than later just to make sure we don't paint ourselves into a corner.
Rikolti publishes airflow event messages to SNS in the following format:
https://github.com/ucldc/rikolti/blob/390a0fa49d9e02636ba5ff21af35bd2b904090c3/dags/shared_tasks/shared.py#L28-L46
SNS sends Rikolti airflow event messages along to SQS in the following SNS message format:
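For orientation, here is a minimal sketch of unwrapping one such message. It assumes raw message delivery is disabled on the subscription, so the SQS body carries the standard SNS JSON envelope (Type, MessageId, TopicArn, Message, Timestamp, and so on). The sample values and the dag_id field are made up for illustration; only the rikolti_message key comes from this PR.

```python
import json

# Hypothetical inner payload: only "rikolti_message" is named in this PR,
# "dag_id" is an illustrative placeholder.
airflow_event = {"dag_id": "example_dag", "rikolti_message": {"dag_complete": True}}

# Standard SNS notification envelope (subset of fields, made-up values).
sns_envelope = {
    "Type": "Notification",
    "MessageId": "00000000-0000-0000-0000-000000000000",
    "TopicArn": "arn:aws:sns:us-west-2:123456789012:example-topic",
    "Message": json.dumps(airflow_event),  # the published message, as a string
    "Timestamp": "2024-01-01T00:00:00.000Z",
}

# One SQS message, shaped like an entry in a receive_message response.
sqs_message = {
    "MessageId": "sqs-1",
    "ReceiptHandle": "handle-1",
    "Body": json.dumps(sns_envelope),  # SNS envelope, again as a string
}


def unwrap(sqs_message):
    """Peel SQS -> SNS -> airflow event -> rikolti_message."""
    sns_message = json.loads(sqs_message["Body"])
    event = json.loads(sns_message["Message"])
    return sns_message["Timestamp"], event, event.get("rikolti_message", {})


timestamp, event, rikolti_message = unwrap(sqs_message)
print(timestamp, rikolti_message)
```

Each layer is a JSON string embedded in the layer above, which is why two json.loads calls are needed before the rikolti_message is reachable.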
The rikolti_status management command polls the SQS queue for a list of new SQS messages, each of which contains an SNS message, each of which contains a Rikolti Airflow Event Message, each of which contains a Rikolti Message. For every SQS message that comes in, rikolti_status first tries to find or create a HarvestRun, then creates a HarvestEvent.
Finally, rikolti_status tries to understand the status of the HarvestRun for which we've just received a new event. A HarvestRun status can either be running, succeeded, or failed. If we've just received a new event for the HarvestRun, then it is presumed to be running - an event cannot be sent if it is not running. After the creation of the new HarvestEvent, rikolti_status sorts all events for the HarvestRun by their sns_timestamp to retrieve the most recent event. If the most recent event's rikolti_message contains the special keys error or dag_complete, then the HarvestRun status is set to failed or succeeded, respectively.
The rest of this PR has to do with the display of all this information in the Admin interface. Notably:
- django-json-widget package in order to display JSON in the admin interface. These are TextFields in the model, though.
- RIKOLTI_EVENTS_QUEUE_URL and AWS. In the dev environment's test_settings.py, these are retrieved from local environment variables of the same names. In production, they will be written into local_settings.py, following convention for the Registry.
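The status rule described earlier (take the most recent event by sns_timestamp, then look for the error or dag_complete keys) can be sketched roughly as follows. This is an illustration over plain dicts, not the actual model code: the function name harvest_run_status is hypothetical, and checking error before dag_complete when both are present is an assumption.

```python
from datetime import datetime, timezone


def harvest_run_status(events):
    """Derive a run status from its events.

    Each event is a dict with an 'sns_timestamp' (datetime) and a
    'rikolti_message' (dict). Hypothetical helper, not the PR's code.
    """
    latest = max(events, key=lambda e: e["sns_timestamp"])
    message = latest["rikolti_message"]
    if "error" in message:  # assumption: error wins if both keys appear
        return "failed"
    if "dag_complete" in message:
        return "succeeded"
    # Any other event implies the run is still in progress.
    return "running"


events = [
    {
        "sns_timestamp": datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
        "rikolti_message": {},
    },
    {
        "sns_timestamp": datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc),
        "rikolti_message": {"dag_complete": True},
    },
]
print(harvest_run_status(events))
```

Because only the latest event decides the status, a cleared and rerun task that sends a fresh event after an error would flip the run back to running under this rule.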