hubmapconsortium / ingest-api

MIT License
Endpoint to support submitting multiple Datasets for processing #513

Closed shirey closed 2 months ago

shirey commented 4 months ago

Create an endpoint for submitting multiple Datasets for processing that takes a list of Dataset uuids as input. It should mimic the existing /datasets/\<uuid>/submit endpoint, which submits a single Dataset. The update/PUT functionality of the existing single-Dataset submit endpoint, which allows the data in the Dataset to be changed, should NOT be replicated for this multiple-submit endpoint (only the list of Dataset uuids to be submitted for processing is needed as input).
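A minimal sketch of validating the input described above, assuming the request body is a bare JSON array of uuid strings. The function name `parse_bulk_submit_payload` and the duplicate-dropping behavior are illustrative assumptions, not something specified in this issue.

```python
from typing import Any, List


def parse_bulk_submit_payload(payload: Any) -> List[str]:
    """Return the Dataset uuids from an already-decoded request body.

    Assumption: the body is a non-empty JSON array of uuid strings and
    carries no other Dataset fields (no update/PUT-style data).
    """
    if not isinstance(payload, list) or not payload:
        raise ValueError("expected a non-empty JSON array of Dataset uuids")

    uuids: List[str] = []
    for item in payload:
        if not isinstance(item, str) or not item.strip():
            raise ValueError(f"invalid Dataset uuid: {item!r}")
        uuid = item.strip()
        if uuid not in uuids:  # drop duplicates, keep first-seen order
            uuids.append(uuid)
    return uuids
```

Rejecting anything other than a list of strings keeps the endpoint from accidentally accepting the update fields that the single-Dataset endpoint allows.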

This is in support of the new Ingest Board feature to bulk submit datasets for processing.

A similar endpoint will be created in the SenNet ingest-api; it should have the same request/response pattern and specifics.

ChuckKollar commented 3 months ago

Notes from the Mar 8, 2024 11am meeting with CMU

This is calling "airflow". Our endpoint will send a 202 when we get the data, and will not wait until we have completely processed it. Two calls are made: one after the status change is made, and one after processing. The two steps are: 1) set the datasets to processing status and contact airflow; 2) get information back from airflow and set that information in the database. If we get an unexpected result from the airflow endpoint, we will set the datasets to the error state. We will get a vector back from airflow.
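A rough sketch of the flow in these notes: answer with 202 right away and do the two steps in the background. The callables `set_status`, `call_airflow`, and `save_results`, and the status names "Processing" and "Error", are hypothetical stand-ins; whether the real endpoint contacts airflow before or after responding is not settled here.

```python
import threading
from typing import Callable, List, Tuple


def accept_bulk_submit(
    uuids: List[str],
    set_status: Callable[[List[str], str], None],   # hypothetical DB helper
    call_airflow: Callable[[List[str]], list],      # hypothetical Airflow client
    save_results: Callable[[list], None],           # hypothetical DB helper
) -> Tuple[int, threading.Thread]:
    """Start background processing and return 202 without waiting for it."""

    def worker() -> None:
        # Step 1: mark the datasets as processing and contact Airflow.
        set_status(uuids, "Processing")
        try:
            results = call_airflow(uuids)
            if not isinstance(results, list):
                raise ValueError("unexpected Airflow response")
        except Exception:
            # Unexpected result from Airflow: move the datasets to error.
            set_status(uuids, "Error")
            return
        # Step 2: store what Airflow sent back.
        save_results(results)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return 202, thread
```

Returning the thread is only there so a caller (or a test) can join it; a real service would more likely hand the work to whatever task runner it already uses.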

When we send multiple items, will we get a mix of valid and invalid? They will return the same thing that they are returning now, but in an array. The ones that failed will be set to the error state, and the others will be processed. If we get nothing back, or an error (500), then we set them all to error.
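The mixed-result rule above could look something like this. The response shape (a list of objects with a `uuid` key and an `error` key on failures) is an assumption for illustration, as are the status names; treating a uuid that is missing from the response as an error is also a guess.

```python
from typing import Dict, List, Optional

PROCESSING = "Processing"  # assumed status name
ERROR = "Error"            # assumed status name


def reconcile_statuses(
    uuids: List[str],
    http_status: int,
    body: Optional[List[dict]],
) -> Dict[str, str]:
    """Map each submitted uuid to the status it should end up in."""
    # Nothing back, or an error such as a 500: everything goes to error.
    if not (200 <= http_status < 300) or not body:
        return {uuid: ERROR for uuid in uuids}

    # Assumed item shape: {"uuid": "..."} on success,
    # {"uuid": "...", "error": "..."} on failure.
    accepted = {
        item.get("uuid")
        for item in body
        if isinstance(item, dict) and "error" not in item
    }
    return {uuid: PROCESSING if uuid in accepted else ERROR for uuid in uuids}
```

For example, `reconcile_statuses(["a", "b"], 200, [{"uuid": "a"}, {"uuid": "b", "error": "bad"}])` leaves "a" processing and sets "b" to error.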

Note that there is a 5-minute timeout imposed by nginx, since this is not behind the gateway.

Previous meeting notes: https://docs.google.com/document/d/13OAM90-KmfEphQnqKgOvM8RYFF_49IC4wK7EybnACPA/edit?pli=1

ChuckKollar commented 3 months ago

Talk with Tyler about what SenNet did... According to Bill, I should just be able to copy his (more or less).

ChuckKollar commented 3 months ago

PR: https://github.com/hubmapconsortium/ingest-api/pull/534

PR: https://github.com/hubmapconsortium/gateway/pull/309

ChuckKollar commented 2 months ago

It appears that the neo4j Result object does not have a .fetch(n) method, so I returned a list of dictionaries instead. Not sure why this works for SenNet...

PR: https://github.com/hubmapconsortium/ingest-api/pull/546
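For reference, one way to get "up to n records as a list of dictionaries" without relying on .fetch(n) is to slice the iterable. This is only a sketch: `take_as_dicts` is a made-up helper, and it assumes the records are mapping-like (as neo4j `Record` objects are) and that the result can simply be iterated.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List, Mapping


def take_as_dicts(
    records: Iterable[Mapping[str, Any]], n: int
) -> List[Dict[str, Any]]:
    """Consume at most n records and return them as plain dictionaries."""
    return [dict(record) for record in islice(records, n)]
```

Plain dictionaries are also easier to pass around after the session is closed than a live result object.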