unity-sds / unity-project-management

Container repo for project management (projects, epics, etc)
Apache License 2.0

Airflow S3 Trigger #145

Closed mike-gangl closed 2 months ago

mike-gangl commented 7 months ago

Airflow S3 Trigger

Description of work to be done here

Acceptance Criteria

Acceptance criteria required to implement the epic

Note: we don't want this to conflict with the U-DS bucket events used for cataloging.

Work Tickets

Link to work tickets required to implement the epic

Dependencies

Other epics or outside tickets required for this to work

Associated Risks

Links to risk issues associated with this epic

rtapella commented 7 months ago

IMO this should be designed in conjunction with U-DS @ngachung

Uploading data, ingestion, and notification (SQS/SNS) should probably be in U-DS.

Then ADS/Airflow/etc. could just listen on a particular SNS topic to be notified that files have arrived, plus provide the Lambda/executable that fires when an appropriate notification says “data is here”.
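To make the idea concrete, here is a rough sketch of what the consumer side could look like, assuming the notifications are standard S3 event messages delivered to a Lambda through SNS. The function names and the shape of the `conf` payload are hypothetical, and the sketch only builds a request body (of the kind the Airflow REST API accepts when creating a DAG run) rather than calling Airflow.

```python
import json
from urllib.parse import unquote_plus


def extract_new_objects(sns_event):
    """Return (bucket, key) pairs for ObjectCreated records in an SNS-delivered S3 event."""
    found = []
    for record in sns_event.get("Records", []):
        # SNS delivers the original S3 notification as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        # S3 test messages carry no "Records" list, so default to empty.
        for s3_record in message.get("Records", []):
            if not s3_record.get("eventName", "").startswith("ObjectCreated"):
                continue  # ignore deletes and other event types
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in S3 event notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            found.append((bucket, key))
    return found


def build_dag_run_request(bucket, key):
    """Hypothetical payload: pass the new file's location to the DAG as run conf."""
    return {"conf": {"bucket": bucket, "key": key}}


def handler(event, context=None):
    """Lambda-style entry point: one DAG-run request body per newly arrived file."""
    return [build_dag_run_request(b, k) for b, k in extract_new_objects(event)]
```

Whether a given file should actually start a job would still be decided by whoever owns the trigger logic; this only shows the "data is here" plumbing.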

rtapella commented 7 months ago

If it’s a point of integration across U-DS and U-ADS… should we have some sort of unity.py support? (E.g., to ingest files and mark them to trigger a job?)

LucaCinquini commented 7 months ago

In my experience, this sort of trigger is part of the PCM framework, because the logic is very specific to each mission. So it would naturally be part of the SPS. But we can discuss.

rtapella commented 7 months ago

Yes, I agree that the SPS is where the trigger logic should live.

I think that the Data Store is where the notifications about file events should happen (SNS/SQS/Redis related to S3, NFS, whatever). Multiple consumers of the file notifications probably exist (other DS services like file browsers, triggers for jobs in SPS, potentially Jupyter/IDE, or something like a cloud equivalent of fswatch).
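As an illustration of the split described above, here is a small in-process sketch: one notification topic owned by the Data Store, with several independent consumers that each filter on the keys they care about. Everything here (the class, the consumer names, the glob patterns) is hypothetical and stands in for SNS/SQS subscriptions; it is not an existing U-DS or SPS interface.

```python
from fnmatch import fnmatchcase


class FileNotificationTopic:
    """Toy stand-in for a Data Store topic that fans file events out to consumers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, name, pattern, callback):
        """Register a consumer interested in keys matching a glob pattern."""
        self._subscribers.append((name, pattern, callback))

    def publish(self, key):
        """Deliver the key to every matching consumer; return who was notified."""
        notified = []
        for name, pattern, callback in self._subscribers:
            if fnmatchcase(key, pattern):
                callback(key)
                notified.append(name)
        return notified


topic = FileNotificationTopic()
browser_index = []   # e.g. a file browser that tracks every file
job_requests = []    # e.g. a mission-specific SPS trigger

topic.subscribe("file-browser", "*", browser_index.append)
topic.subscribe("sps-trigger", "l1a/*.h5", job_requests.append)

topic.publish("l1a/granule_001.h5")   # reaches both consumers
topic.publish("browse/readme.txt")    # reaches only the file browser
```

The point of the sketch is that the topic knows nothing about jobs: the mission-specific rule (`l1a/*.h5` here) lives entirely with the SPS consumer.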