phoenixframework / phoenix_live_view

Rich, real-time user experiences with server-rendered HTML
https://hex.pm/packages/phoenix_live_view
MIT License

live_file_input with more than 50-ish files fails #1464

Closed Zurga closed 9 months ago

Zurga commented 3 years ago

Environment

Actual behavior

I am using live_file_input to upload a lot of files at once. The code works fine when I upload 5 to 10 images at a time. I should note that the images are about 3 MB each.

The code to allow the upload looks like this (same result with auto_upload: false):

    |> allow_upload(:report_pictures,
      accept: ["image/*"],
      max_entries: 10000,
      auto_upload: true,
      max_file_size: 15_000_000_000,
      progress: &handle_progress/3
    )

My handle_progress/3 is as follows (I know for certain that the first if evaluates to true):

    def handle_progress(
          _section_pictures,
          entry,
          %{assigns: %{reference_section: reference_section, sections: sections}} = socket
        ) do
      if section_name = reference_section["#{entry.ref}"] do
        if entry.done? do
          section = Map.get(sections, section_name)
          # Moves file as per the example in the documentation
          file_name = ReportLive.move_entry(socket, entry)
          params = %{file_name: file_name}
          Reports.add_picture_to_section(section, params)

          {:noreply, socket}
        else
          {:noreply, socket}
        end
      else
        {:noreply, cancel_upload(socket, :report_pictures, "#{entry.ref}")}
      end
    end

In the server log output I see a lot of channels being opened and then the uploads just disappear. My guess is that the channels time out...

Expected behavior

All selected files are uploaded correctly without timing out.

chrismccord commented 3 years ago

Aside from max_entries: 10000 being an extreme upper bound that we haven't considered, I think master will help a bit here with the message overhead, since we now send fewer progress events. As written, we don't chunk out the uploads, so 10_000 entries would allow 10_000 simultaneous upload attempts, which of course you don't want :).

We can also send fewer progress events for large selections, but before we make any extra optimizations please give master a shot and let us know if things change w/ the existing files you are testing with. Thanks!

Zurga commented 3 years ago

Thanks for the quick reply! I agree that 10000 is an extreme upper bound that I will probably never use myself, but 50 should work.

I will try to give master a shot, but Surface and LiveView master conflict because they both define ~H.

chrismccord commented 3 years ago

0.15.6 is out, which does not include the ~H changes. Please give that a shot! Thanks!

Zurga commented 3 years ago

After updating to 0.15.6, I get the following error in the browser console:

Uncaught Error: Cannot find module 'morphdom'
    webpackMissingModule phoenix_live_view.js:1
    js phoenix_live_view.js:1
    Webpack 7
phoenix_live_view.js:1

I tried to fix it with npm clean install, but that did not help.

chrismccord commented 3 years ago

I just tested with both:

"phoenix_live_view": "file:../deps/phoenix_live_view",

and

"phoenix_live_view": "^0.15.6",

and both work as expected. Can you rm -rf node_modules and try again? Are you sure you ran npm install within the assets directory, or npm install --prefix assets at your project root? Lastly, double check what your package-lock.json has for phoenix_live_view.js.

chrismccord commented 3 years ago

Ignore me – I was able to replicate it. Let me investigate

Zurga commented 3 years ago

Ok! Good luck!

I started playing around with the max_chunk_size and the chunk_timeout settings, and larger numbers of files are coming in. This issue is no longer that pressing for me; I think I can make it work for now.
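
For reference, a minimal sketch of that kind of tuning, assuming the relevant allow_upload options are :chunk_size and :chunk_timeout; the values are illustrative, not recommendations:

    |> allow_upload(:report_pictures,
      accept: ["image/*"],
      max_entries: 100,
      auto_upload: true,
      # smaller chunks plus a longer per-chunk timeout spread the upload
      # traffic out instead of pushing everything at once (illustrative values)
      chunk_size: 32_000,
      chunk_timeout: 30_000,
      progress: &handle_progress/3
    )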

Maybe a setting for the maximum number of upload channels that can be open at the same time would be useful, so uploads with a high number of items can be throttled.

Something like this:

    allow_upload(:test, max_concurrent: 16)

chrismccord commented 3 years ago

The webpack build is fixed with the 0.15.7 release. Please give it a try!

chrismccord commented 3 years ago

Maybe a setting where the maximum of concurrent channel that are opened at the same time could be useful to be able to throttle uploads with a high number of items

We'd consider a PR that adds this, but it is non-trivial. Keep in mind that you also need to ensure your server can handle the IO and CPU required for file processing when allowing a very large number of files. Please give 0.15.7 a shot with the default options you had before and let us know how it looks. Thanks!

Zurga commented 3 years ago

0.15.7 did not fix the problem...

Unfortunately, a big part of the page also froze and did not render the previews. This is with 100 files uploading. I started working on client-side chunking of the entries. Perhaps the concurrency setting could also be used to limit the concurrency when calling the function given to consume_uploaded_entries.

What command do you use to generate the minified code in the priv/static folder?

chrismccord commented 3 years ago

Did it solve the original issue for 5-10 files, or did it still lock up the LV?

Zurga commented 3 years ago

Sorry for any miscommunication that may have arisen, but originally I did not have issues with 5-10 files, only with 50-100.

If I understand correctly, an UploadChannel is created for each entry and uploading starts immediately. With many files, this all happens at the same time, which I think creates a bottleneck.

Chunking the entries to spread the load out to a manageable level would solve this. It would also mean that the writes to a temporary folder, done by LiveView.UploadChannel, do not exceed the available resources.

LiveView.Upload.consume_entries could then use Tasks to batch the execution of the provided function on the finished entries.
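
A rough user-land sketch of that batching idea, assuming a LiveView version where the consume_uploaded_entries/3 callback returns {:ok, result}; Reports.process_picture/1 is a hypothetical processing function:

    def handle_event("save", _params, socket) do
      # Consume the finished entries first (this copies the temp files),
      # then process them with bounded concurrency instead of all at once.
      paths =
        consume_uploaded_entries(socket, :report_pictures, fn %{path: path}, entry ->
          dest = Path.join(System.tmp_dir!(), entry.client_name)
          File.cp!(path, dest)
          {:ok, dest}
        end)

      # Reports.process_picture/1 is a placeholder for the per-file work.
      paths
      |> Task.async_stream(&Reports.process_picture/1, max_concurrency: 8, timeout: 30_000)
      |> Stream.run()

      {:noreply, socket}
    end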

Please correct me where I am wrong.

Zurga commented 3 years ago

I solved this issue for my own project by maintaining a fork, but that is not practical for a number of reasons.

Figuring that batched uploads can be implemented with an external uploader, I don't feel a strong need to push these changes into Phoenix LiveView itself anymore. It would be nice if the EntryUploader class were exported so that it could be used to implement custom uploading strategies.

Would this be something that could work?
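
For context, the external-uploader route is configured on the server through the :external option of allow_upload. A minimal sketch, assuming a hypothetical client-side uploader registered as "BatchedUploader" in the LiveSocket uploaders option, with an illustrative URL:

    def mount(_params, _session, socket) do
      {:ok,
       allow_upload(socket, :report_pictures,
         accept: ["image/*"],
         max_entries: 100,
         external: &presign_entry/2
       )}
    end

    defp presign_entry(entry, socket) do
      # :uploader must match a client-side uploader passed to the LiveSocket
      # `uploaders` option; "BatchedUploader" and the URL are hypothetical.
      meta = %{uploader: "BatchedUploader", url: "/uploads/#{entry.uuid}"}
      {:ok, meta, socket}
    end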

Zurga commented 3 years ago

Although I now realize that I also have to implement the channel_preflight function in my external uploader...

Zurga commented 3 years ago

I realized I only have to create the token to sign the entry with, which was easy enough.

luka-TU commented 1 year ago

@chrismccord hi! I noticed that a related PR was closed: https://github.com/phoenixframework/phoenix_live_view/pull/1659 Does that mean that there is some other recommended way of achieving it? Thank you!

jeregrine commented 9 months ago

https://fly.io/phoenix-files/phoenix-liveview-zipped-uploads/ is an approach @chrismccord has shared in the past.
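
The gist of that approach is to upload a single zip archive and unpack it on the server. A minimal sketch of the server-side unpacking step using Erlang's :zip module, assuming the consumed archive has already been copied to zip_path (the helper name is hypothetical, not code from the article):

    # Unpack an uploaded archive into a fresh temporary directory and
    # return the extracted file paths as strings (illustrative helper).
    def unpack_report_archive(zip_path) do
      dest = Path.join(System.tmp_dir!(), "report_#{System.unique_integer([:positive])}")
      File.mkdir_p!(dest)

      {:ok, files} =
        :zip.unzip(String.to_charlist(zip_path), cwd: String.to_charlist(dest))

      Enum.map(files, &List.to_string/1)
    end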

chrismccord commented 9 months ago

Closing this, as there are options for folks (like zipping an archive on the client and decompressing it on the server). Handling hundreds of files gets tricky because you'll need to stage the uploads to avoid overwhelming the client, network, server, etc. It's something we could handle, but it brings a lot of complexity and it's not something we can prioritize at the moment. If someone puts together a robust PR, we'd be happy to review it. I know Zurga has done some work here, but there were shortcomings on the DoS-protection side, so we'd need to ensure we're properly covered. Thanks!