alan-turing-institute / simulate-middleware

Simulate middleware service.
http://simulate.readthedocs.io

File input type in case #7

Open klapaukh opened 6 years ago

klapaukh commented 6 years ago

Cases need to support input fields which are actually a file upload. This is particularly critical for BEMPP.

martintoreilly commented 6 years ago

I'd be wary of embedding file data in Job JSON (e.g. as Base64 encoded strings). I'd favour the file being uploaded separately and its metadata added to the job (perhaps URI / access identifier, size, status).
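
As a rough illustration of the metadata-only approach, here is a minimal sketch of a job document that references an uploaded file instead of embedding it. The field names (`type`, `uri`, `size`, `status`), the job id, and the example URI are all assumptions made for illustration, not the middleware's actual schema.

```python
import json


def make_file_input(name, uri, size, status="pending"):
    """Build a hypothetical file-input entry holding metadata only, no file bytes."""
    return {
        "name": name,
        "type": "file",
        "uri": uri,        # where the separately uploaded file lives (assumed field)
        "size": size,      # size in bytes (assumed field)
        "status": status,  # e.g. "pending" until the upload is confirmed (assumed values)
    }


# Hypothetical job: the mesh file is referenced by URI rather than Base64-embedded.
job = {
    "id": "job-123",
    "inputs": [
        make_file_input("mesh", "https://example.invalid/uploads/job-123/mesh.msh", 1048576),
    ],
}

payload = json.dumps(job)
```

The job JSON stays small regardless of file size, and the `status` field gives the middleware somewhere to record whether the upload has completed.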

Where are Azure storage interactions being handled at the moment? I wonder if a reasonable approach would be to have a StorageManager as well as a JobManager for each job. This would represent a storage interface with Create / Read / Update / Delete semantics.
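
To make the StorageManager idea concrete, here is a minimal sketch of such an interface. The method names and the in-memory backend are assumptions for illustration only; a real implementation would presumably wrap Azure storage calls.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StorageManager(ABC):
    """Hypothetical storage interface with Create / Read / Update / Delete semantics."""

    @abstractmethod
    def create(self, name: str, data: bytes) -> None:
        """Store a new file; fail if it already exists."""

    @abstractmethod
    def read(self, name: str) -> bytes:
        """Return the contents of an existing file."""

    @abstractmethod
    def update(self, name: str, data: bytes) -> None:
        """Replace the contents of an existing file."""

    @abstractmethod
    def delete(self, name: str) -> None:
        """Remove an existing file."""


class InMemoryStorageManager(StorageManager):
    """Stand-in backend that keeps files in a dict, useful for tests."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def create(self, name: str, data: bytes) -> None:
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = data

    def read(self, name: str) -> bytes:
        if name not in self._files:
            raise FileNotFoundError(name)
        return self._files[name]

    def update(self, name: str, data: bytes) -> None:
        if name not in self._files:
            raise FileNotFoundError(name)
        self._files[name] = data

    def delete(self, name: str) -> None:
        if name not in self._files:
            raise FileNotFoundError(name)
        del self._files[name]
```

Keeping the interface abstract would let each job hold a StorageManager alongside its JobManager without the rest of the middleware knowing which storage backend is in use.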

Ideally, uploads could occur without going via the middleware (e.g. Azure supports direct upload, and there are libraries to do chunked / resumable / parallel uploads in JavaScript; there may be something similar for SCP from the browser). However, going via the middleware is fine for the industry trial prototype. We can add direct uploads later if needed.

klapaukh commented 6 years ago

I agree with the general idea. I don't want the middleware to do the uploading. And I don't really want to do the Base64 encoded data in JSON thing if I can avoid it.

My current ideal would be that the front end asks the middleware for an Azure access key that grants time-limited write permission to a single specified file, and then the front end writes the file there. It then informs the middleware when it believes that the upload is complete.
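
The token part of this flow could look roughly like the sketch below. It is not the Azure token format: it uses a plain HMAC signature as a stand-in, and the function names, the secret, and the `expires.signature` token layout are all hypothetical. The point is only to show a token that is tied to one file path and stops working after an expiry time.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret"  # hypothetical signing key held by the token issuer


def _sign(path: str, expires: int) -> str:
    """Sign the (path, expiry) pair so neither can be changed by the client."""
    message = f"{path}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_upload_token(path: str, ttl_seconds: int, now: int) -> str:
    """Issue a token allowing writes to `path` until `now + ttl_seconds`."""
    expires = now + ttl_seconds
    return f"{expires}.{_sign(path, expires)}"


def verify_upload_token(token: str, path: str, now: int) -> bool:
    """Accept the token only for the path it was issued for and before it expires."""
    expires_text, _, signature = token.partition(".")
    if not expires_text.isdigit():
        return False
    expires = int(expires_text)
    expected = _sign(path, expires)
    # Constant-time comparison avoids leaking signature prefixes via timing.
    return hmac.compare_digest(signature.encode(), expected.encode()) and now <= expires
```

For example, a token issued with `issue_upload_token("job-1/mesh.msh", 600, now=1000)` verifies for that path at time 1200, but is rejected for any other path and for any time after 1600.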

In practice, the middleware will probably route that request to the Job Manager, since in the current version that is the component able to generate Azure tokens. However, if it turns out we need more data management that is not best done through the Job Manager, I agree that we may need to investigate a potential Storage Manager.

I do think that, whether or not we have a separate Storage Manager, we should build the system so that the front end only talks to the middleware, which routes each request to whatever backend component is responsible.