jw-12138 / r2-uploader

Web Interface for Cloudflare R2
https://r2.jw1.dev
MIT License

net::ERR_FAILED 413 (Payload Too Large) #4

Closed · spurin closed this 2 months ago

spurin commented 3 months ago

Hi @jw-12138,

Appreciate your efforts with the r2-uploader, I love it!

Sadly, there appears to be an issue with larger files. I'm attempting to upload a 240.3MB file (227,084,287 bytes). It hangs on the status bar and, after some time, I see net::ERR_FAILED 413 (Payload Too Large) in the console. I have attempted this in different browsers as well (Chrome and Safari, same issue).

The file works through the Cloudflare uploader.

Possibly something has changed on the Cloudflare side, as I see references in the README to this supporting files larger than 300MB.

Many Thanks

James

jw-12138 commented 3 months ago

this is indeed a problem, cloudflare workers have a strict request body size limit of around 100MB (for both the free and pro plans), meaning any file over this limit will trigger a 413 error.
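for reference, a minimal client-side guard could catch this before the request even goes out (just a sketch, the constant and function name here are made up):

const WORKER_BODY_LIMIT = 100 * 1024 * 1024; // ~100MB workers request body limit

function fitsInSingleRequest(file) {
  // anything larger than this will be rejected by the worker with a 413
  return file.size <= WORKER_BODY_LIMIT;
}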

i'll see what i can do!

jw-12138 commented 2 months ago

did some digging, and unfortunately there's no easy way to solve this problem. ironically, in the README i said the Cloudflare dashboard could only upload files smaller than 300MB, which is not ideal for large files 🤣.

my initial idea was to use the browser to chop the big file into chunks, upload them to r2 with multiple api calls, and then call another api to merge the chunks, but a worker can't handle a buffer that's over 128MB (the workers' maximum memory size).
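to illustrate (just a sketch, the BUCKET binding and chunk key naming are made up): a naive merge endpoint would have to pull every chunk into memory before writing the final object, which blows past the 128MB limit for any large file.

// hypothetical merge step inside a worker
const buffers = [];
for (const chunkKey of chunkKeys) {
  const chunk = await env.BUCKET.get(chunkKey);
  buffers.push(await chunk.arrayBuffer()); // every chunk stays resident in memory
}
// at this point the whole file (~240MB in the case above) lives in worker memory
await env.BUCKET.put(finalKey, new Blob(buffers));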

i think i'll make another API just for the merge action, but this will require a server to run it.

what do you think? @spurin

spurin commented 2 months ago

If you were to go down the server route on this, I can help create a Dockerfile to allow people to run the server locally and conveniently as a single Docker command.

We could also look at taking this further as a Docker Desktop extension, allowing people to install it via the Docker Desktop marketplace with a single click.

They'd have their own private server from the Docker Desktop UI.

I've created and published a number of Docker Desktop extensions, so I can volunteer for this bit if it helps?

jw-12138 commented 2 months ago

appreciate the willingness to help. after some drafting i found this might not be a good idea: people would be downloading the files they just uploaded, which makes the pattern more complex. i need some time to think this through...

jw-12138 commented 2 months ago

and i made a mistake: r2 can only be accessed via a worker or the s3-compatible api, so a server could never put the merged file directly into r2, meaning the s3-compatible api is the only option left 🥲, which is not what i prefer, since there are already a lot of s3-based tools out there...

i'm sorry to let you know that this issue might get tagged with won't fix

spurin commented 2 months ago

It looks like there may be a way to perform multipart uploads using the R2 worker API. The example is in Python, but I think this approach could possibly be implemented in the uploadFile function - https://developers.cloudflare.com/r2/api/workers/workers-multipart-usage/#perform-a-multipart-upload-with-your-worker-optional
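For reference, a condensed sketch of what the worker side of that guide looks like (the BUCKET binding name and the mpu-* action names are just placeholders mirroring the doc's example, not anything in this repo yet):

export default {
  async fetch(request, env) {
    const url = new URL(request.url);
    const key = url.pathname.slice(1);
    const action = url.searchParams.get('action');
    const bucket = env.BUCKET; // R2 bucket binding (name is an assumption)

    if (request.method === 'POST' && action === 'mpu-create') {
      // start a multipart upload and hand the uploadId back to the client
      const upload = await bucket.createMultipartUpload(key);
      return Response.json({ key: upload.key, uploadId: upload.uploadId });
    }

    if (request.method === 'PUT' && action === 'mpu-uploadpart') {
      const uploadId = url.searchParams.get('uploadId');
      const partNumber = Number(url.searchParams.get('partNumber'));
      const upload = bucket.resumeMultipartUpload(key, uploadId);
      // returns { partNumber, etag }; the client collects these and sends them back on completion
      const part = await upload.uploadPart(partNumber, request.body);
      return Response.json(part);
    }

    if (request.method === 'POST' && action === 'mpu-complete') {
      const uploadId = url.searchParams.get('uploadId');
      const upload = bucket.resumeMultipartUpload(key, uploadId);
      const { parts } = await request.json(); // expects the { partNumber, etag } objects from the upload-part step
      const object = await upload.complete(parts);
      return new Response(null, { headers: { etag: object.httpEtag } });
    }

    return new Response('Not Found', { status: 404 });
  }
};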

spurin commented 2 months ago

I've experimented with GPT, referencing your existing upload function with a request to refactor it based on the approach from the Python multipart guide in the R2 docs.

Here's the output in case any of this is re-usable in this exploration:

// note: assumes axios, renameFileWithRandomId, and formatFileName are in scope, as in the existing component
async function uploadFile(file) {
  const endPoint = localStorage.getItem('endPoint');
  const apiKey = localStorage.getItem('apiKey');

  if (!endPoint || !apiKey) {
    alert('Please set an endpoint and api key first.');
    return;
  }

  const file_key = renameFileWithRandomId.value ? file.id_key : file.key;
  const fileName = (endPoint[endPoint.length - 1] === '/') ? formatFileName(file_key) : '/' + formatFileName(file_key);
  const url = endPoint + fileName;
  const partSize = 10 * 1024 * 1024; // 10MB per part (R2 parts must be at least 5 MiB, except for the last one)

  try {
    // Step 1: Create the multipart upload and retrieve the upload ID
    const createResponse = await axios.post(url, null, {
      params: { action: 'mpu-create' },
      headers: { 'x-api-key': apiKey }
    });
    const uploadId = createResponse.data.uploadId;

    // Step 2: Calculate the number of parts and prepare parts metadata
    const totalParts = Math.ceil(file.size / partSize);
    const parts = [];

    for (let i = 0; i < totalParts; i++) {
      const start = i * partSize;
      const end = Math.min(start + partSize, file.size);
      const blobPart = file.slice(start, end);

      // Upload each part
      const partResponse = await axios.put(url, blobPart, {
        params: {
          action: 'mpu-uploadpart',
          uploadId: uploadId,
          partNumber: i + 1
        },
        headers: {
          'x-api-key': apiKey,
          'Content-Type': file.type
        }
      });
      parts.push({ ETag: partResponse.headers.etag, PartNumber: i + 1 });
    }

    // Step 3: Complete the multipart upload
    await axios.post(url, { parts }, {
      params: { action: 'mpu-complete', uploadId: uploadId },
      headers: { 'x-api-key': apiKey }
    });

    console.log('Upload complete');
  } catch (error) {
    console.error('Upload failed', error);
  }
}

jw-12138 commented 2 months ago

@spurin this is pretty much what we talked about before: multiple api calls to upload the chunks, with the server side handling the merge, which is not possible with cloudflare workers.

now the best (and only) option is to integrate the s3-compatible api with server-side code, and the deployment can be docker-ized just like you said, except that now the workers look like a joke...

i'll let you know when the new back-end is done, just don't expect me to get this done in no time, i'm kinda swamped recently (juggling a new job), but yeah, we are getting there!

spurin commented 2 months ago

Thanks so much for your detailed explanations. Initially I assumed that this functionality was native to the worker, but your diagrams helped clarify things.

After some experimentation today, I've been able to create a working proof of concept with the multipart functionality. I've raised a pull request with the changes here: https://github.com/jw-12138/r2-uploader/pull/5

Have also added guidance on how to run the project with Docker.

Hope this helps and thanks again for this project!

jw-12138 commented 2 months ago

bucket.createMultipartUpload()

bucket.resumeMultipartUpload()

how the hell did i miss these 2 functions 🫠, thank you for the PR! suddenly i've got hope of doing this lol. the worker script will actually be refactored using hono and typescript, and i've already made some progress:

what's left to do is to implement a few other APIs to handle the multipart uploading, which you have already written most of the code for! and for the client side i'm thinking we keep using the old upload api for files that are less than 100MB, and only use the multipart upload api when a file is bigger than 100MB!
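roughly like this on the client (function names here are just placeholders, not the actual code):

const MULTIPART_THRESHOLD = 100 * 1024 * 1024; // 100MB, matching the workers body limit

async function upload(file) {
  if (file.size > MULTIPART_THRESHOLD) {
    return uploadFileMultipart(file); // new chunked path via the multipart endpoints
  }
  return uploadFileSingle(file); // existing single-request path
}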

should be done this weekend!

jw-12138 commented 2 months ago

https://github.com/jw-12138/r2-uploader/assets/29943110/33f82f15-cda4-484c-bb3e-d1df617a0bb5

just couldn't wait, works like a charm.

changes have been pushed, README will be updated later!

jw-12138 commented 2 months ago

README updated

spurin commented 2 months ago

Love this so much!

Was dreading the thought of having to go down the route of S3 tools, and this interface is so slick! It's the icing on the cake to see this working 👏

Thank you! Also, very kind of you to reference me on the contributor side!

I've not dug into the code implementation as yet, but something to think about, if not already done, is retry logic for failures of individual chunks. I think the Python implementation may have had this.

By the way, please feel free to close the issue/pr whenever you're ready.

jw-12138 commented 2 months ago

good point! i've added a retry mechanism: each chunk now has 5 chances to retry the upload. the UI will tell the user that the upload has failed if any chunk's retry count hits 0, and they can choose whether to re-upload the whole file or just remove it from the queue.
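in spirit it works something like this simplified sketch (not the exact code):

async function uploadChunkWithRetry(uploadChunk, maxAttempts = 5) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await uploadChunk(); // succeed on any attempt and we're done
    } catch (err) {
      lastError = err; // otherwise burn one retry and try again
    }
  }
  // all attempts used up: surface the failure so the UI can offer re-upload or remove
  throw lastError;
}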