untitled-pit-group / foxhound

PIFS standard backend
BSD Zero Clause License

Uploading: begin + progress #4

paulsnar closed 2 years ago

paulsnar commented 2 years ago

Uploading, as per API spec, is split into multiple operations.

Beginning an upload registers the upload intent in the database and allocates a corresponding URL on the GCS side; the URL is returned with a signature that lets the client upload to it without holding Gcloud credentials. The begin request fails with an error if the given file size exceeds a configurable limit (so configuration for the limit should be added, probably in .env rather than in the database).
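A minimal sketch of the size-limit check, assuming the limit is read from an environment variable. The variable name `UPLOAD_MAX_BYTES`, the 100 MiB default, and the function and exception names are all hypothetical, not taken from the API spec; registering the intent and signing the GCS URL are left out.

```python
import os

# Hypothetical default, used when the variable is not set: 100 MiB.
DEFAULT_MAX_UPLOAD_BYTES = 100 * 1024 * 1024


class UploadTooLargeError(Exception):
    """Raised when the declared file size exceeds the configured limit."""


def load_max_upload_bytes(env=os.environ):
    # UPLOAD_MAX_BYTES is an assumed name for the setting kept in .env.
    raw = env.get("UPLOAD_MAX_BYTES")
    return int(raw) if raw else DEFAULT_MAX_UPLOAD_BYTES


def check_upload_size(declared_size, max_bytes):
    """Reject the begin request before any intent or URL is created."""
    if declared_size < 0:
        raise ValueError("file size must be non-negative")
    if declared_size > max_bytes:
        raise UploadTooLargeError(
            f"file size {declared_size} exceeds limit of {max_bytes} bytes"
        )
```

Checking the size first means a rejected request never touches the database or GCS.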

The intent stores the last progress report (starting at 0). During upload, requests to register upload progress should update this value; requests to return it should, well, return the last value stored. (This is small enough to be lumped in here until further notice.)
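The progress bookkeeping could look roughly like the sketch below. It uses an in-memory dict as a stand-in for the database, and it assumes progress is an integer (for example, bytes uploaded); the unit, the class names, and the method names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UploadIntent:
    file_hash: str
    size: int
    progress: int = 0  # last reported progress; starts at 0


class IntentStore:
    """In-memory stand-in for the database table holding upload intents."""

    def __init__(self):
        self._intents = {}

    def begin(self, upload_id, file_hash, size):
        # Register the intent; progress is initialised to 0.
        self._intents[upload_id] = UploadIntent(file_hash, size)

    def report_progress(self, upload_id, value):
        # Overwrite the stored value with the latest report.
        self._intents[upload_id].progress = value

    def get_progress(self, upload_id):
        # Return whatever was last stored.
        return self._intents[upload_id].progress
```

Usage, under the same assumptions:

```python
store = IntentStore()
store.begin("upload-1", "123456789abcdef0", 4096)
store.report_progress("upload-1", 1024)
current = store.get_progress("upload-1")  # last stored value
```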

Finishing and cancelling an upload warrant separate items, given that the actions to be undertaken in those cases are numerous (like Libya's exports).

Blocks the rest of the indexing workflow.

paulsnar commented 2 years ago

Splitting out GCS infra into its own item.

paulsnar commented 2 years ago

Aside from the GCS infra, there should be a well-defined scheme mapping upload intents to GCS URLs. I propose the following: take the file's hash in hex form and use it directly as the object name in the bucket. So a file with hash (hex) 123456789abcdef0 (probably invalid multihash, bite me) would map onto ${GCS_TARGET_PREFIX}/123456789abcdef0 (e.g., gs://pifs-bucket/subfolder/123456789abcdef0).
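As a rough illustration of the proposed mapping, here is a sketch that joins the prefix and the hex hash. The function name is hypothetical, and lowercasing the hash and rejecting non-hex input are my own assumptions, not part of the proposal.

```python
import string


def object_url_for_hash(prefix, hex_hash):
    """Map a file's hex hash onto an object URL under the given prefix."""
    # Assumption: hashes are normalised to lowercase so one file maps to one name.
    normalized = hex_hash.lower()
    if not normalized or any(c not in string.hexdigits for c in normalized):
        raise ValueError(f"not a hex string: {hex_hash!r}")
    # Tolerate a trailing slash on the prefix to avoid a doubled separator.
    return prefix.rstrip("/") + "/" + normalized


url = object_url_for_hash("gs://pifs-bucket/subfolder", "123456789abcdef0")
# url == "gs://pifs-bucket/subfolder/123456789abcdef0"
```

Because the mapping depends only on the hash, the URL can be recomputed from the intent at any point instead of being stored separately.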

paulsnar commented 2 years ago

Okay, the scaffolding is all set up as of 08e6db7; picking up the initial part of implementing the uploads.begin method should be pretty easy now.

paulsnar commented 2 years ago

Done as per f933183.