plone / plone.restapi

RESTful API for Plone.
http://plonerestapi.readthedocs.org/

TUS temporary file optimization (mv instead of read on same filesystem) #1690

Open JeffersonBledsoe opened 10 months ago

JeffersonBledsoe commented 10 months ago

Plan

netlify[bot] commented 10 months ago

Deploy Preview for plone-restapi ready!

Latest commit: 0792e4771b685759a6bd6dc95e774d851cc2354e
Latest deploy log: https://app.netlify.com/sites/plone-restapi/deploys/668a43840674b2000804626d
Deploy Preview: https://deploy-preview-1690--plone-restapi.netlify.app

mister-roboto commented 10 months ago

@JeffersonBledsoe thanks for creating this Pull Request and helping to improve Plone!

TL;DR: Finish pushing changes, pass all other checks, then paste a comment:

@jenkins-plone-org please run jobs

To ensure that these changes do not break other parts of Plone, the Plone test suite matrix needs to pass, but it takes 30-60 min. Other CI checks are usually much faster and the Plone Jenkins resources are limited, so when done pushing changes and all other checks pass either start all Jenkins PR jobs yourself, or simply add the comment above in this PR to start all the jobs automatically.

Happy hacking!

davisagli commented 10 months ago

@JeffersonBledsoe NamedBlobFile uses IStorage adapters to process the value that is passed in (https://github.com/plone/plone.namedfile/blob/master/plone/namedfile/storages.py). It might make sense to implement this as an IStorage adapter for TUSUpload.

It also bypasses DX validation of the blob, as that seems to read this into memory for some reason. Instead, we should work out how to still validate but not read the data unless we really have to.

Yeah, I don't think we can bypass validation entirely. If the entire file is read during validation, that's surprising. Have you identified where that is happening?
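
For readers unfamiliar with the storage-adapter idea, here is a minimal sketch of the shape such an adapter could take. It is not the code from this PR: `TUSUploadStandIn`, `BlobStandIn`, and `consume_file` are hypothetical stand-ins so the example runs without Plone, and `NotStorable` is redefined locally rather than imported from plone.namedfile. The only part taken from plone.namedfile is the `store(data, blob)` method shape used by its existing storages.

```python
import os
import shutil
import tempfile


class NotStorable(Exception):
    """Local stand-in for the exception plone.namedfile storages raise."""


class TUSUploadStandIn:
    """Hypothetical: a finished upload that knows where its temp file lives."""

    def __init__(self, filepath):
        self.filepath = filepath


class BlobStandIn:
    """Hypothetical: a blob that can take over an existing file by path."""

    def __init__(self, path):
        self.path = path

    def consume_file(self, filename):
        # Hand the file over by path; the caller never reads its contents.
        shutil.move(filename, self.path)


class TUSUploadStorable:
    """Sketch of a storage adapter with the store(data, blob) shape."""

    def store(self, data, blob):
        if not isinstance(data, TUSUploadStandIn):
            # Let another storage handle values this one does not understand.
            raise NotStorable("Could not store data (not a TUS upload).")
        blob.consume_file(data.filepath)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "upload.part")
        with open(src, "wb") as fh:
            fh.write(b"payload")
        blob = BlobStandIn(os.path.join(tmp, "stored.blob"))
        TUSUploadStorable().store(TUSUploadStandIn(src), blob)
        print(os.path.exists(blob.path), os.path.exists(src))
```

The point of the design is that the adapter passes a path to the blob instead of a stream of bytes, so whether the data is moved or copied becomes the blob's decision.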

djay commented 7 months ago

@davisagli I've put in the storage adapter and that works. I've also put in a test to show it works.

But the validation does actually read the whole file into memory (also shown in the test).

The cause is

Note that this is a problem for NamedImageBlob too, and it is likely a problem for any editing of a File object, or anywhere else files might be validated.

Possible solutions?

I'm not really sure of the right way to solve this yet. Ideas?

djay commented 7 months ago

I also suspect that some of the other storages, like FileUploadStorage, might already be buffered into a local file at least some of the time. In that case, the file could be moved rather than read again, improving performance. I haven't looked into this yet, however.
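
To illustrate the "move rather than read" idea in general terms, here is a minimal standard-library sketch. It is an assumption about how such an optimization could work, not the implementation in this PR; `move_or_copy` is a hypothetical helper name. It tries a rename first, which on the same filesystem only updates directory entries, and falls back to copying when the operating system reports that source and destination are on different devices.

```python
import errno
import os
import shutil
import tempfile


def move_or_copy(src, dst):
    """Move src to dst, renaming when possible and copying otherwise."""
    try:
        # On the same filesystem this does not read the file's data.
        os.replace(src, dst)
        return "renamed"
    except OSError as exc:
        # EXDEV means src and dst are on different filesystems.
        if exc.errno != errno.EXDEV:
            raise
    # Fallback: stream the bytes across, then remove the source.
    shutil.copyfile(src, dst)
    os.remove(src)
    return "copied"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        upload = os.path.join(tmp, "upload.part")
        with open(upload, "wb") as fh:
            fh.write(b"x" * 1024)
        print(move_or_copy(upload, os.path.join(tmp, "stored.blob")))
```

Attempting the rename and handling the failure avoids having to guess up front whether two paths share a filesystem.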

djay commented 7 months ago

@davisagli I had a go at changing the validation to avoid reading in the whole file in https://github.com/plone/plone.namedfile/pull/155