IMAP-Science-Operations-Center / sds-data-manager

Create API endpoints for job tracking and status #316

Open bourque opened 2 months ago

bourque commented 2 months ago

In order to support https://github.com/IMAP-Science-Operations-Center/imap_processing/issues/724, we need to build new API endpoint(s) that return information about the processing jobs that have been run. Something like /job-statuses to return a list of processing jobs and their current status, and /job-log?<job_id> to return the processing log for the particular job from the AWS logs.

greglucas commented 2 months ago

Just to note, I think REST APIs generally encourage passing the <id> as a path param. https://stackoverflow.com/questions/30967822/when-do-i-use-path-parameters-vs-query-parameters-in-a-restful-api

So I think this would look something like: /jobs?start_date= to filter all the jobs by dates, or whatever other query params you want. /jobs/:id to get the information for a specific job.
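To make the path-versus-query distinction concrete, here is a minimal sketch of a handler that serves both routes. It assumes the endpoint sits behind an API Gateway Lambda proxy integration, whose event carries `pathParameters` and `queryStringParameters`; the handler name, the `id` path key, and the response shapes are hypothetical placeholders, not the project's actual API.

```python
import json


def lambda_handler(event, context):
    """Hypothetical handler for GET /jobs and GET /jobs/{id}."""
    # Both keys can be present but None in a proxy event, hence the "or {}".
    path_params = event.get("pathParameters") or {}
    query = event.get("queryStringParameters") or {}

    job_id = path_params.get("id")
    if job_id is not None:
        # /jobs/{id}: the path identifies exactly one resource.
        body = {"job_id": job_id}
    else:
        # /jobs?start_date=...: query params filter the collection.
        body = {"filters": {"start_date": query.get("start_date")}}

    return {"statusCode": 200, "body": json.dumps(body)}
```

The placeholder bodies only echo the parsed parameters; a real implementation would look up the job(s) at those two points.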

greglucas commented 1 week ago

@bourque are you able to ping Ransom and Jenny here for input on what they'd need or how they want this to look? I think they did something similar on EMM.

Now that we have L3 code running, I think I'd prefer if this was prioritized higher on the web team front, so that we don't have to send job logs to external people ourselves and take on that extra work. Hopefully this would be a good way for them to view this stuff without us needing to intervene, and potentially help us investigate job success/failure sooner as well.

Accessing job status / logs through boto3 connections

List jobs with various requests: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/batch/client/list_jobs.html which will get you the job IDs. Then the frontend could turn that around and re-send requests (or we could do it right away in the same request) to get lots of information: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/batch/client/describe_jobs.html including the log stream name ('logStreamName': 'string').

Then get the log content from a separate boto3 client: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html
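A rough sketch of how those three calls could chain together, taking the boto3 clients as arguments so it can be exercised with stubs. The function names, the returned dict keys, and the choice of statuses are assumptions for illustration; `/aws/batch/job` is the default Batch log group and may not match what our jobs are configured with.

```python
def list_job_statuses(batch_client, job_queue, statuses=("SUCCEEDED", "FAILED")):
    """Collect job IDs with list_jobs, then enrich them with describe_jobs."""
    job_ids = []
    for status in statuses:
        # list_jobs filters on one status per call and paginates via nextToken.
        kwargs = {"jobQueue": job_queue, "jobStatus": status}
        while True:
            page = batch_client.list_jobs(**kwargs)
            job_ids.extend(s["jobId"] for s in page.get("jobSummaryList", []))
            token = page.get("nextToken")
            if not token:
                break
            kwargs["nextToken"] = token

    jobs = []
    # describe_jobs accepts at most 100 job IDs per request.
    for i in range(0, len(job_ids), 100):
        resp = batch_client.describe_jobs(jobs=job_ids[i:i + 100])
        for job in resp.get("jobs", []):
            jobs.append({
                "job_id": job["jobId"],
                "job_name": job.get("jobName"),
                "status": job.get("status"),
                # The log stream only exists once the container has started.
                "log_stream": job.get("container", {}).get("logStreamName"),
            })
    return jobs


def get_job_log(logs_client, log_stream, log_group="/aws/batch/job"):
    """Return all log messages for one job's CloudWatch log stream."""
    kwargs = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,
    }
    messages = []
    token = None
    while True:
        if token:
            kwargs["nextToken"] = token
        page = logs_client.get_log_events(**kwargs)
        messages.extend(e["message"] for e in page.get("events", []))
        new_token = page.get("nextForwardToken")
        # The end of the stream is signalled by getting the same token back.
        if new_token is None or new_token == token:
            break
        token = new_token
    return messages
```

With real clients this would be called as `list_job_statuses(boto3.client("batch"), "<queue name>")` and `get_job_log(boto3.client("logs"), job["log_stream"])`; the queue name is left as a placeholder here.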

bourque commented 1 week ago

@greglucas To answer your question, I think the best medium for communication for something like this will end up being the imap-sdc-website slack channel.

They did indeed do something similar on EMM. EMM appears to have an API for this that returns a simple JSON object with processing status metadata which can then be displayed through a table. Since they have the mechanics of this figured out on their end already, it sounds like it should be relatively straightforward for them to implement for IMAP. As such, the priority should probably be placed on us to get that API up and running.