tus / tus-js-client

A pure JavaScript client for the tus resumable upload protocol
https://tus.io/
MIT License

How to avoid 409 Conflict errors and duplicate fingerprints #730

Closed mkabatek closed 1 month ago

mkabatek commented 1 month ago

Working in an Electron 31 app with React, I am seeing two behaviors that I do not know how to address. Uploads pause and resume fine when I trigger them manually. However, after navigating away from the page or force-closing the application and then trying to resume, I see the following:

  1. A new fingerprint gets generated. How can I force an upload to resume using an existing fingerprint?
  2. If a .info file exists on the file store/S3/server but no file data has made it into the upload, I usually see a 409 error. How can I address this, either by fetching the offset manually or in some other way?

Below is the example of how I am starting the uploads, and resuming them.

Code to upload a file / create an upload

// Function to handle the upload of a single file
  const uploadFile = async (file: File): Promise<void> => {
    return new Promise((resolve, reject) => {
      // Optional chaining: a file that has never been uploaded has no progress entry yet
      if (uploadProgress[file.name]?.percentage === 100) {
        console.log(`File ${file.name} is already fully uploaded, skipping.`);
        resolve(); // Resolve immediately since the file is already uploaded
        return;
      }

      console.log(uploadProgress[file.name]?.uploadUrl);
      const upload = new tus.Upload(file, {
        endpoint: TUS_ENDPOINT,
        chunkSize: 5242880, // 5MB chunk size
        removeFingerprintOnSuccess: true,
        metadata: {
          filename: file.name,
          filetype: file.type,
          key: `${account && account.id}/scan/${scan.id}/file/${file.name}`,
          ...(account?.id && { accountId: account.id }),
        },
        headers: {
          "x-upsert": "false", // optionally set upsert to true to overwrite existing files
        },
        onProgress: (bytesUploaded, bytesTotal) => {
          const percentage = ((bytesUploaded / bytesTotal) * 100).toFixed(2);
          console.log(bytesUploaded, bytesTotal, percentage + "%");
          setUploadProgress((prevProgress) => ({
            ...prevProgress,
            [file.name]: {
              percentage: Number(percentage),
              bytesUploaded: bytesUploaded,
              bytesTotal: bytesTotal,
              uploadUrl: upload.url,
            }, // Update progress for the specific file
          }));
        },
        onSuccess: () => {
          console.log(
            "Download %s from %s",
            file.name,
            upload.url // read from the upload itself; the captured React state may be stale
          );
          setUploads((prevUploads) => {
            const newUploads = new Map(prevUploads);
            newUploads.delete(file.name);
            return newUploads;
          });
          resolve(); // Resolve the promise when the upload is successful
        },
        onError: (error) => {
          console.log(JSON.stringify(error.message));
          reject(error); // Reject the promise on error
        },
        onShouldRetry: (
          error: tus.DetailedError,
          retryAttempt: number,
          options: tus.UploadOptions
        ) => {
          console.log("Retry attempt:", retryAttempt);
          console.log("Upload options:", options);

          // Read the HTTP status from the response; originalResponse is null
          // when the request failed before any response arrived
          const status = error.originalResponse?.getStatus();

          // Check for a 409 Conflict status
          if (status === 409) {
            console.log(
              "Handling 409 Conflict error. Attempting to correct the upload offset..."
            );
            return true;
          }
          return false; // Return false for other errors or if 409 is not handled
        },
      });

      console.log(upload.url);
      setUploads((prevUploads) => new Map(prevUploads).set(file.name, upload));
      upload.start();
    });
  };

Code to resume an upload

  const handleResume = async (file: File) => {
    // Check if the file was previously paused
    const pausedUpload = pausedUploads.get(file.name);

    if (pausedUpload) {
      // Resume the upload
      console.log(pausedUpload);
      //await sendHeadRequest(pausedUpload.url);
      pausedUpload.start();
      setUploads((prevUploads) =>
        new Map(prevUploads).set(file.name, pausedUpload)
      );
      setPausedUploads((prevPausedUploads) => {
        const newPausedUploads = new Map(prevPausedUploads);
        newPausedUploads.delete(file.name);
        return newPausedUploads;
      });
    } else {
      // If not paused, start a new upload
      await uploadFile(file);
    }
  };

Is there a proper way to resume uploads so that they don't generate a new fingerprint when the app is force quit or the page is navigated away from? I have tried storing a custom fingerprint, but the same behavior occurs: I see a new fingerprint with a new stamp at the end of it.

Edit: Below are some state variables for context:

  const [acceptedFiles, setAcceptedFiles] = useState<File[]>([]);
  const [acceptedFilesData, setAcceptedFilesData] = useState<any[]>([]);
  const [isUploading, setIsUploading] = useState(false); // Track if uploads are in progress
  const [isPaused, setIsPaused] = useState(false); // Track if uploads are paused
  const [uploads, setUploads] = useState<Map<string, tus.Upload>>(new Map());
  const [uploadProgress, setUploadProgress] = useState<UploadProgress>({});
  const [pausedUploads, setPausedUploads] = useState<Map<string, tus.Upload>>(
    new Map()
  );

Acconut commented 1 month ago

  1. A new fingerprint gets generated. How can I force an upload to resume using an existing fingerprint?

If the fingerprint changes, this means that at least one of its input values changed. Have a look at the default fingerprint to see which values are used: https://github.com/tus/tus-js-client/blob/main/lib/browser/fileSignature.js

I haven't used Electron, so maybe we need to adjust the fingerprint for it. Mobile operating systems like to provide simulated paths to applications that are only valid within the application, and maybe Electron is doing something similar.

That being said, you can also supply your own fingerprint method if that works better for your specific use case.
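
As a hedged illustration of what a custom fingerprint could look like in the Electron case: the sketch below assumes the app already knows a stable on-disk path for each file. `DiskFile`, `sourcePath`, and the `tus-electron` prefix are hypothetical names chosen for this example and are not part of tus-js-client. The idea is to build the fingerprint only from values that survive an app restart, so it does not depend on how the File object was reconstructed.

```typescript
// Hypothetical description of a file that the app reconstructs from disk.
interface DiskFile {
  sourcePath: string; // absolute path known to the Electron main process (assumption)
  size: number; // size in bytes
}

// Build a fingerprint from values that stay the same across app restarts.
function stableFingerprint(file: DiskFile, endpoint: string): string {
  return ["tus-electron", file.sourcePath, file.size, endpoint].join("-");
}

// Promise-returning form, since the `fingerprint` option is expected to
// resolve to a string.
function fingerprintFor(file: DiskFile, endpoint: string): Promise<string> {
  return Promise.resolve(stableFingerprint(file, endpoint));
}
```

To use something like this with tus.Upload, the function passed as the `fingerprint` option would still need to map the File it receives to the matching `DiskFile`, which is app-specific and left out here.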

  2. If a .info file exists on the file store/S3/server but no file data has made it into the upload, I usually see a 409 error. How can I address this, either by fetching the offset manually or in some other way?

That should not happen since tus-js-client fetches the upload offset using a HEAD request before resuming with a PATCH request. Does that not happen for you? What tus server are you using?
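
For reference, a rough sketch of what fetching the offset manually would involve under the tus 1.0.0 protocol: a HEAD request carrying the `Tus-Resumable` header, with the current offset returned in the `Upload-Offset` response header. The `SendHead` transport and the helper names are hypothetical and exist only to keep the example independent of a specific HTTP client. tus-js-client already performs this step itself when resuming.

```typescript
// Minimal view of a HEAD response; a fetch Response can be adapted to it.
interface HeadResult {
  status: number;
  uploadOffset: string | null; // raw value of the Upload-Offset header
}

// Hypothetical transport, injected so the sketch is not tied to one HTTP client.
type SendHead = (url: string, headers: Record<string, string>) => Promise<HeadResult>;

// Header the tus 1.0.0 protocol requires on the HEAD request.
function headHeaders(): Record<string, string> {
  return { "Tus-Resumable": "1.0.0" };
}

// Turn a HEAD result into a byte offset, or null when the upload no longer exists.
function parseOffset(result: HeadResult): number | null {
  if (result.status === 404 || result.status === 410) {
    return null; // the server does not know this upload (anymore)
  }
  if (result.status < 200 || result.status >= 300) {
    throw new Error("unexpected status " + result.status);
  }
  const offset = parseInt(result.uploadOffset === null ? "" : result.uploadOffset, 10);
  if (isNaN(offset)) {
    throw new Error("missing or invalid Upload-Offset header");
  }
  return offset;
}

// Ask the server how many bytes it has stored for the given upload URL.
async function fetchOffset(uploadUrl: string, send: SendHead): Promise<number | null> {
  return parseOffset(await send(uploadUrl, headHeaders()));
}
```

With `fetch`, the transport could be written as `(url, headers) => fetch(url, { method: "HEAD", headers }).then((r) => ({ status: r.status, uploadOffset: r.headers.get("Upload-Offset") }))`. A subsequent PATCH has to send exactly the offset the server reports. A mismatch there is what the protocol answers with 409 Conflict.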

mkabatek commented 1 month ago

Mobile operating systems like to provide simulated paths to applications that are only valid within the application, and maybe Electron is doing something similar.

Electron is basically Node.js running in a "main process" and Chromium running in a "renderer process", and the two can communicate with each other. So yes, when Files are created in the renderer process, the path is usually omitted from the File object. However, I have a process that fetches and reconstructs the Files from paths stored on disk, via the main process.

Looking at the fileSignature.js the following is how the default fingerprints are constructed:

 ['tus-br', file.name, file.type, file.size, file.lastModified, options.endpoint].join('-'),
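
To make the formula concrete, here is a small sketch that mirrors the join above with a plain object in place of a real File. `FileFields` and `defaultStyleFingerprint` are hypothetical names for this example. The values are taken from the fingerprints shown further down in this comment, and the 12-second shift is only an illustration of a rebuilt file getting a fresh timestamp.

```typescript
// Minimal stand-in for the File fields that the default fingerprint reads.
interface FileFields {
  name: string;
  type: string;
  size: number;
  lastModified: number; // milliseconds since the Unix epoch
}

// Mirrors the join from fileSignature.js quoted above.
function defaultStyleFingerprint(file: FileFields, endpoint: string): string {
  return ["tus-br", file.name, file.type, file.size, file.lastModified, endpoint].join("-");
}

const endpoint = "http://localhost:3000/files";

const original: FileFields = {
  name: "PXL_20240928_050544563.jpg",
  type: "image/jpeg",
  size: 4233017,
  lastModified: 1728320034987,
};

// The same file rebuilt with a fresh timestamp instead of its real modification time.
const rebuilt: FileFields = {
  name: original.name,
  type: original.type,
  size: original.size,
  lastModified: original.lastModified + 12000,
};

console.log(defaultStyleFingerprint(original, endpoint));
// The fingerprints differ although the file contents are identical.
console.log(defaultStyleFingerprint(original, endpoint) === defaultStyleFingerprint(rebuilt, endpoint)); // false
```

Any input that changes between sessions, lastModified being the obvious candidate for a reconstructed File, produces a different fingerprint and therefore no match against previously stored uploads.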

I noticed that when reconstructing my files I was using Date.now() as lastModified. I have corrected that and still see the same behavior. Below are two fingerprint entries from my application. The only difference I see in the keys is the number after the final :: at the end. I also notice that the creationTime of the two entries is different.

Full fingerprints below:

tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::247923445207   {"size":4233017,"metadata":{"filename":"PXL_20240928_050544563.jpg","filetype":"image/jpeg","key":"06a737a5-76f3-4336-8d6f-e359d2d4d141/scan/fecce503-d9e5-401b-8bcd-455328f12d8e/file/PXL_20240928_050544563.jpg","accountId":"06a737a5-76f3-4336-8d6f-e359d2d4d141"},"creationTime":"Mon Oct 07 2024 10:55:00 GMT-0600 (Mountain Daylight Time)","uploadUrl":"http://localhost:3000/files/MDZhNzM3YTUtNzZmMy00MzM2LThkNmYtZTM1OWQyZDRkMTQxL3NjYW4vZmVjY2U1MDMtZDllNS00MDFiLThiY2QtNDU1MzI4ZjEyZDhlL2ZpbGUvUFhMXzIwMjQwOTI4XzA1MDU0NDU2My5qcGc"}   

tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::41150514334    {"size":4233017,"metadata":{"filename":"PXL_20240928_050544563.jpg","filetype":"image/jpeg","key":"06a737a5-76f3-4336-8d6f-e359d2d4d141/scan/fecce503-d9e5-401b-8bcd-455328f12d8e/file/PXL_20240928_050544563.jpg","accountId":"06a737a5-76f3-4336-8d6f-e359d2d4d141"},"creationTime":"Mon Oct 07 2024 10:55:12 GMT-0600 (Mountain Daylight Time)","uploadUrl":"http://localhost:3000/files/MDZhNzM3YTUtNzZmMy00MzM2LThkNmYtZTM1OWQyZDRkMTQxL3NjYW4vZmVjY2U1MDMtZDllNS00MDFiLThiY2QtNDU1MzI4ZjEyZDhlL2ZpbGUvUFhMXzIwMjQwOTI4XzA1MDU0NDU2My5qcGc"}   

The keys alone, which differ only in the number at the end (247923445207 versus 41150514334):

tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::247923445207

tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::41150514334

The values alone, which differ only in creationTime:

"creationTime":"Mon Oct 07 2024 10:55:00 GMT-0600 (Mountain Daylight Time)"

"creationTime":"Mon Oct 07 2024 10:55:12 GMT-0600 (Mountain Daylight Time)"

I'm a little unsure whether the creationTime value comes into play in generating the fingerprint, since the File objects need to be reconstructed if the user leaves the page and it gets re-rendered.
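
The two keys above share everything up to the final `::`, which suggests that the trailing number is a per-entry identifier added when the entry is stored rather than part of the fingerprint. This is an inference from the keys shown, not something confirmed in this thread. A small sketch that splits a key on that assumption, with `parseStorageKey` as a hypothetical helper:

```typescript
interface ParsedKey {
  fingerprint: string; // the part produced by the fingerprint function
  entryId: string; // trailing segment after the final "::"
}

// Split a key of the assumed shape "tus::<fingerprint>::<entryId>".
function parseStorageKey(key: string): ParsedKey | null {
  const prefix = "tus::";
  const split = key.lastIndexOf("::");
  // Reject keys without the prefix or without a separate trailing segment.
  if (key.indexOf(prefix) !== 0 || split < prefix.length) {
    return null;
  }
  return {
    fingerprint: key.slice(prefix.length, split),
    entryId: key.slice(split + 2),
  };
}

const first = parseStorageKey(
  "tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::247923445207"
);
const second = parseStorageKey(
  "tus::tus-br-PXL_20240928_050544563.jpg-image/jpeg-4233017-1728320034987-http://localhost:3000/files::41150514334"
);

// Same fingerprint, different entry identifiers.
console.log(first !== null && second !== null && first.fingerprint === second.fingerprint); // true
```

Read this way, the fingerprint itself matched in both sessions. What differs is that a second entry was stored for it, each with its own creationTime.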

That should not happen since tus-js-client fetches the upload offset using a HEAD request before resuming with a PATCH request. Does that not happen for you? What tus server are you using?

I am curious whether the 409 issue is a byproduct of the fingerprint mismatch. I am using a slightly modified version of the Node.js reference implementation in order to handle AWS S3 keys and to connect to, read from, and write to a backend database. However, the main difference is just a custom naming function and custom URL generation:

    // Initialize TUS Server
    this.server = new Server({
      path: '/files', // URL endpoint where TUS will handle uploads
      datastore: s3Store,
      namingFunction,
      generateUrl(req, { proto, host, path, id }) {
        const isSecure = req.headers['x-forwarded-proto'] === 'https';
        const protocol = isSecure ? 'https' : 'http';
        id = Buffer.from(id, 'utf-8').toString('base64url');
        return `${protocol}://${host}${path}/${id}`;
      },
      getFileIdFromRequest(req, lastPath) {
        // lastPath is everything after the last `/`
        // If your custom URL is different, this might be undefined
        // and you need to extract the ID yourself
        return Buffer.from(lastPath, 'base64url').toString('utf-8');
      },
    });

I don't think this would affect the HEAD requests. I'm starting to think this particular issue is related to the fingerprint not being regenerated, because otherwise if I pause the file upload manually everything works as expected. It's only when the application/component is abruptly closed (which is a case I want to handle :) ).

mkabatek commented 1 month ago

@Acconut thanks for your help and input. I think I found the issue and how to properly account for the creation date in the fingerprint. Here is a revised version of the file resumption:

const handleResume = async (file: File) => {
    // Check if the file was previously paused
    const pausedUpload = pausedUploads.get(file.name);
    const progress = uploadProgress[file.name];
    // True when an earlier session left this file partially uploaded
    const isPartial =
      progress !== undefined &&
      progress.percentage > 0 &&
      progress.percentage < 100;

    if (pausedUpload || isPartial) {
      // Resume the upload
      if (pausedUpload) {
        // Look up stored uploads whose fingerprint matches this file
        const previousUploads = await pausedUpload.findPreviousUploads();
        console.log(previousUploads);
        // Reuse the stored upload if there is one; otherwise start() creates a new upload
        if (previousUploads.length > 0) {
          pausedUpload.resumeFromPreviousUpload(previousUploads[0]);
        }
        pausedUpload.start();
        setUploads((prevUploads) =>
          new Map(prevUploads).set(file.name, pausedUpload)
        );
      }
      setPausedUploads((prevPausedUploads) => {
        const newPausedUploads = new Map(prevPausedUploads);
        newPausedUploads.delete(file.name);
        return newPausedUploads;
      });
    } else {
      // If not paused, start a new upload
      await uploadFile(file);
    }
  };

The key difference is this: after reconstructing the file with the proper lastModified and then reconstructing the tus.Upload object from that file, we need to call findPreviousUploads and pass the result to resumeFromPreviousUpload. Otherwise a new fingerprint gets created. With this approach, the previously created fingerprint gets used and the upload picks up where it left off.

        const previousUpload = await pausedUpload?.findPreviousUploads();
        console.log(previousUpload);
        pausedUpload.resumeFromPreviousUpload(previousUpload[0]);
        pausedUpload.start();

I have other issues to resolve before this is perfect, but I think they are unrelated. I also have not seen any 409 Conflicts since. More testing is required on my part.

Acconut commented 1 month ago

The key difference is this: after reconstructing the file with the proper lastModified and then reconstructing the tus.Upload object from that file, we need to call findPreviousUploads and pass the result to resumeFromPreviousUpload. Otherwise a new fingerprint gets created. With this approach, the previously created fingerprint gets used and the upload picks up where it left off.

Yes, that is correct. Passing the same (or a similar) file object to tus.Upload is not enough. You need to explicitly call findPreviousUploads and resumeFromPreviousUpload if you want to search previously started uploads and resume from one of them. I am glad you found this on your own :)

I will close this issue for now since it seems to be working for you. Feel free to reopen it if the problem persists.
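
As a compact summary of that flow, here is a minimal sketch. `ResumableUpload` is a hypothetical interface covering only the three tus.Upload methods mentioned above, and `resumeOrStart` is a hypothetical helper name. It picks the first match for simplicity, which is an assumption. An app may want to choose among several previous uploads differently.

```typescript
// Minimal slice of the upload object used here; tus.Upload offers these three methods.
interface ResumableUpload<P> {
  findPreviousUploads(): Promise<P[]>;
  resumeFromPreviousUpload(previous: P): void;
  start(): void;
}

// Resume from a stored upload when one matches, otherwise start from scratch.
// Resolves to true if a previous upload was reused.
async function resumeOrStart<P>(upload: ResumableUpload<P>): Promise<boolean> {
  const previousUploads = await upload.findPreviousUploads();
  const [first] = previousUploads;
  const resumed = first !== undefined;
  if (first !== undefined) {
    // Point the upload at the stored upload URL instead of creating a new one.
    upload.resumeFromPreviousUpload(first);
  }
  upload.start();
  return resumed;
}
```

Called as `await resumeOrStart(new tus.Upload(file, options))`, this would cover both the first run and a restart after a force quit, provided the fingerprint inputs are the same in both sessions.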