Open djay opened 11 months ago
@djay The linked images give 404s for me. Is it possible to share them as part of this issue?
At first glance, I like this direction. Thanks for all your notes.
I'd like to see the TUS backend in plone.restapi gain support for storing the temp files in the ZODB instead of raw files. That way it can also work when the ZODB is not using a shared filesystem blobstorage.
@davisagli reuploaded the images.
I'd like to see the TUS backend in plone.restapi gain support for storing the temp files in the ZODB instead of raw files. That way it can also work when the ZODB is not using a shared filesystem blobstorage.
It's a good idea but I think it can be handled by a seperate PR or PLIP? It would double the storage needed until the DB is packed which is why it wasn't done this way originally but I think thats fine for a default unoptimised system.
@djay I would say optimized for horizontal scalability rather than optimized for performance and minimum disk use, but fair point that there's a tradeoff there. And sure, it can be done separately from this PLIP. But it's related in that if the frontend can count on the backend to always support the TUS protocol, then it might simplify the frontend by not needing multiple file upload widgets.
@davisagli good point. I'll put it in there
@davisagli I'd have to check if the protocol supports it but if TUS allows a single chunk in a single api request then we could possibly not have to handle combining them on the backend for the default setup. So then we use the same UI and same client code and still sending via TUS endpoint but the space and transactions used on the backend are the same as without TUS. EDIT: yes it seems to support this - https://tus.io/protocols/resumable-upload#creation-with-upload
Our usage:
Concerns:
Other:
@runyaga TUS was in classic in plone 4. It seems to have got lost along the way - https://github.com/plone/plone.app.content/issues/64 but it would not be hard to bring back.
That implimentation required a temp folder that all instances had access to, ie a shared file system.
Do we want to the TUS implementation to be in Plone proper? I see the benefit, but would/could a first step be integrating the TUS reference implementation? This is a different set of trade offs but the upside is we dont have to maintain the server portion (at least initially).
There is no real trade off if TUS can be set single api call in cases where a shared file system is not available. Then it will not be any different from a normal upload other than being able to be retried and having progress in the UI. I agree with @davisagli that its better to have one implemenation that gets maintained rather than two. As you can see from history, niche code that is only enabled by a few gets forgotten and left out.
This PLIP talks about uploading files but I believe users are expecting to be able to upload directories and/or even nested folders/files.
yes it should possibly deal with that. I think from memory you need to use additional browser apis to get information about folders but I might be wrong. Certainly our implimentation didn't deal with recreating folder structures. Another issue that I think a folder has no mime type so I'm not sure how if there is a way in plone to say what the default content type is for a folder in the case of folderless plone like volto is.
Any concurrency of multiple uploads could potentially reek havoc with transactions
The old and new TUS implimentations upload chunks to a temp folder in a shared filesystem and don't do any transactions until it is finally combined. @davisagli suggested that if no shared filesystem was available then maybe it the chunks could be stored in the ZODB which would then have problems with lots of transactions. I'd argue its safer in such a case to go back to not using chunks and use a single api call upload. You have to reupload the whole thing again on disconnection but I think the transactions problem is worse. TUS can be configured to do parallel uploads but we didn't use it that way adn the old classic way didn't do this either. It wasn't slow.
Big wins are parallelize chunks of files. If we have to send one file at a time its going to be nightmarishly slow.
None of the implementations so far did this. If it was enabled then you'd have to ensure that the deployment config could set a maximum so you could ensure you don't overload your own site. But in theory you could enable this.
Prior conversations with @davisagli was that the Folder Contents changes to integrate something like this is fairly invasive. I would suggest this be worked out first. Drag n Drop into Folder Contents is the primary mode people expect to use the system.
The PLIP is to change the upload view and file add. It could be extended to so a drop on file contents activted the upload view.
People don't expect to "Add File" or even "Click to perform Bulk Upload".
I agree the current upload button is too well hidden. The current "Add file" is much more visible and I'd argue where begginers a led when they want to upload files and hense the reason to remove it and replace it with "Upload files" so that misguided user journey is fixed. But no harm in having DND in folder contents also.
We have some code for volto but it uses another design system. So if we get time we can try to put in a PR. We have to first
if anyone is motivated to work on that before we get to it we will happily share the branch we have.
Do we want to the TUS implementation to be in Plone proper? I see the benefit, but would/could a first step be integrating the TUS reference implementation? This is a different set of trade offs but the upside is we dont have to maintain the server portion (at least initially).
There is no real trade off if TUS can be set single api call in cases where a shared file system is not available. Then it will not be any different from a normal upload other than being able to be retried and having progress in the UI. I agree with @davisagli that it's better to have one implementation that gets maintained rather than two. As you can see from history, niche code that is only enabled by a few gets forgotten and left out.
I do not see a direct answer to the proposition "Could a first step be integrating with the TUS reference impl?" I was thinking about requiring the reference TUS go container to be running alongside Plone. Then the integration really becomes "Use the TUS reference implementation" and we bundle a set of webhooks on the Plone side that know how to work with tusd which, to me, seems like an easier problem to solve.
Some huge positives:
My understanding:
@djay mind sharing me to your TUSD branch?
@runyaga plone restapi supports TUS and has done for some time. https://plonerestapi.readthedocs.io/en/querysources/tusupload.html
The problem is not on the server side. The problem is that both classic and volto don't yet have client side clients.
"in-Plone" uploads participate in ZODB transactions and block at least 1 thread per concurrent file
This is not true. The rest-api current implimentation doesn't do any transactions until the chunks are all uploaded. Instead each request is put into a shared temp folder and only the last TUS upload request results in one big transaction.
There is not clarity on the UI/UX implementation; basically any place where a File widget can exist there is a desire to make it work with TUS toggled on/off (seems like a large ask?)
The first step is to have it working with the contents upload. Working out how to have it working with other file widgets is a harder subsequent task.
The desire is to not have TUS turned on or off but always use it but perhaps change how the transactions work depending on how its deployed since having a shared temp folder is not always possible.
We will share our branch but it's mixed in with a bunch of other stuff right now so we just have to find time to seperate out the volto UI code that just deals with TUS and progress.
@davisagli @sneridagh if this was done with uploady library instead of js-tus-client do you think it would be accepted? We have a implementation using that which just needs some polishing and looking at what uploady provides, using js-tus-client might involve adding what uploady already has. Size wise it doesn't seem so much different. But all depends what you want in core.
@djay It looks worth considering; I can see the benefit of using an existing integration into React rather than a pure-JS library.
@victorchrollo14 is having a go at implementing this
PLIP (Plone Improvement Proposal)
Abstract
This PLIP has three aims (and could be separated if desired)
Context
Motivation
Assumptions
Folder contents > upload
is a currently “hidden feature”Add > File
is too many clicks for most people and it’s not obvious the title is optionalProposal & Implementation
TUS for contents upload
TUS for file fields/blocks
Replace Add > File
UI for "upload files"
propose to make similar to quanta (but with semantic)
Additional things not specified in quanta design
TUS without temp folder
Deliverables
Additional work
Risks
OK/Cancel for "upload files"? ie finish upload/transactions when click save, not after picking.
Size of bundle with extra lib
File Widget
Participants
Context
Related optimisation on the backend to speed up the last part of a large upload