Open klapaukh opened 6 years ago
I'd be wary of embedding file data in the Job JSON (e.g. as Base64-encoded strings). I'd favour the file being uploaded separately and its metadata added to the job (perhaps URI / access identifier, size, status).
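For illustration, a minimal sketch of what such a job document could look like. All field names, the URI, and the status values here are hypothetical, not an agreed schema:

```python
import json

# Hypothetical job document: the input file is referenced by metadata
# (URI / access identifier, size, status) rather than embedded as Base64.
job = {
    "id": "job-0001",  # hypothetical job identifier
    "inputs": [
        {
            "name": "mesh",
            # Assumed blob URI, for illustration only:
            "uri": "https://example.blob.core.windows.net/jobs/job-0001/mesh.msh",
            "size_bytes": 1048576,
            "status": "pending_upload",  # e.g. pending_upload | uploaded | verified
        }
    ],
}

# The job JSON stays small regardless of how large the file itself is.
encoded = json.dumps(job)
```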
Where are Azure storage interactions being handled at the moment? I wonder if a reasonable approach would be to have a StorageManager as well as a JobManager for each job. This would represent a storage interface with Create / Read / Update / Delete semantics.
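As a sketch only, a CRUD-style StorageManager interface could look like the following. The class names and method signatures are assumptions for illustration; the in-memory backend just stands in for Azure storage:

```python
from abc import ABC, abstractmethod


class StorageManager(ABC):
    """Hypothetical per-job storage interface with CRUD semantics."""

    @abstractmethod
    def create(self, name: str, data: bytes) -> str: ...

    @abstractmethod
    def read(self, name: str) -> bytes: ...

    @abstractmethod
    def update(self, name: str, data: bytes) -> None: ...

    @abstractmethod
    def delete(self, name: str) -> None: ...


class InMemoryStorageManager(StorageManager):
    """Toy in-memory backend standing in for a real Azure-backed one."""

    def __init__(self):
        self._files = {}

    def create(self, name, data):
        self._files[name] = data
        return name

    def read(self, name):
        return self._files[name]

    def update(self, name, data):
        self._files[name] = data

    def delete(self, name):
        del self._files[name]
```

A real implementation would wrap the Azure blob API behind the same four methods, so callers (JobManager, middleware) never depend on the storage backend directly.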
Ideally, uploads could occur without going via the middleware (e.g. Azure supports direct upload, and there are JavaScript libraries for chunked / resumable / parallel uploads; there may be something similar for SCP from the browser). However, going via the middleware is fine for the industry trial prototype. We can add direct uploads later if needed.
I agree with the general idea. I don't want the middleware to do the uploading. And I don't really want to do the Base64 encoded data in JSON thing if I can avoid it.
My current ideal would be that the front end asks the middleware for an Azure access key granting limited-time write permission to a single specified file, and the front end then writes the file there directly. It informs the middleware when it believes the upload is complete.
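The shape of that token flow can be sketched with a generic HMAC-signed token. This is not Azure's actual SAS mechanism, just an illustration of "limited-time write permission to one named file"; the secret, paths, and TTL are all made up:

```python
import hashlib
import hmac
import time

SECRET = b"middleware-signing-key"  # hypothetical secret shared with storage


def issue_write_token(blob_path: str, ttl_seconds: int = 600) -> str:
    """Middleware side: grant time-limited write access to one named blob."""
    expiry = int(time.time()) + ttl_seconds
    msg = f"w:{blob_path}:{expiry}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{blob_path}:{expiry}:{sig}"


def verify_write_token(token: str, blob_path: str) -> bool:
    """Storage side: accept the write only if path matches and token is unexpired."""
    path, expiry, sig = token.rsplit(":", 2)
    if path != blob_path or int(expiry) < time.time():
        return False
    msg = f"w:{path}:{expiry}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

With real Azure storage this step would be a SAS token scoped to a single blob with write-only permission and a short expiry, but the control flow (middleware issues, front end uses, storage verifies) is the same.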
Practically, the middleware will probably route that request to the Job Manager, as in the current version that is the system with the ability to generate Azure tokens. However, if it turns out we need more data management than is best done through the Job Manager, I agree that we may need to investigate a potential Storage Manager.
I do think that, whether or not we have a separate Storage Manager, we should build the system so that the front end only talks to the middleware, which routes requests to whatever backend component is responsible.
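That routing constraint can be sketched as a simple dispatch table. The component names mirror the discussion but are illustrative, not the project's actual classes:

```python
# Minimal routing sketch: the front end hits one middleware endpoint, and the
# middleware forwards to whichever backend component owns that concern.


class JobManager:
    def handle(self, request):
        return f"JobManager handled {request['action']}"


class StorageManager:
    def handle(self, request):
        return f"StorageManager handled {request['action']}"


class Middleware:
    def __init__(self):
        # Route table: request kind -> responsible backend component.
        # Swapping a component (or merging Storage into Job) only changes
        # this table, never the front end.
        self._routes = {"job": JobManager(), "storage": StorageManager()}

    def dispatch(self, request):
        return self._routes[request["kind"]].handle(request)


mw = Middleware()
```

The front end never imports JobManager or StorageManager; it only knows the middleware's dispatch surface.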
Cases need to support input fields that are actually file uploads. This is particularly critical for BEMPP.