tus / TUSKit

The tus client for iOS.
https://tus.io/
MIT License
209 stars 114 forks source link

couldNotUploadFile Error with status code 409: Upload-Offset conflict #162

Open MilesV64 opened 1 year ago

MilesV64 commented 1 year ago

I'm using TUSKit with Supabase storage (an S3 wrapper), and for the most part it's working fine except the case where I try to upload a large file (~1gb) while repeatedly backgrounding and foregrounding the app on my phone, either by swiping to the Home Screen or locking my phone. It's inconsistent, but after a few times doing that I get a couldNotUploadFile error with an underlying 409 status code error, and digging into the upload method in TUSAPI and printing the raw response data, it says "Upload-Offset conflict".

How can I debug this? Is it an issue with my usage of TUS, the TUSKit iOS SDK itself, or is it possible this is a problem with Supabase's TUS protocol implementation?

Here's the full error:

uploadFailed: couldNotUploadFile(underlyingError: TUSKit.TUSAPIError.failedRequest(<NSHTTPURLResponse: 0x28347f300> { URL: <url> } { Status Code: 409, Headers {
    "Access-Control-Allow-Origin" =     (
        "*"
    );
    "Alt-Svc" =     (
        "h3=\":443\"; ma=86400"
    );
    "Content-Length" =     (
        23
    );
    Date =     (
        "Wed, 31 May 2023 16:14:12 GMT"
    );
    Server =     (
        cloudflare
    );
    "Strict-Transport-Security" =     (
        "max-age=2592000; includeSubDomains"
    );
    Vary =     (
        "Accept-Encoding"
    );
    "access-control-expose-headers" =     (
        "Authorization, Content-Type, Location, Tus-Extension, Tus-Max-Size, Tus-Resumable, Tus-Version, Upload-Concat, Upload-Defer-Length, Upload-Length, Upload-Metadata, Upload-Offset, X-HTTP-Method-Override, X-Requested-With, X-Forwarded-Host, X-Forwarded-Proto, Forwarded, Upload-Expires"
    );
    "cf-cache-status" =     (
        DYNAMIC
    );
    "cf-ray" =     (
        "7d0082268c553344-EWR"
    );
    "sb-gateway-mode" =     (
        direct
    );
    "sb-gateway-version" =     (
        1
    );
    "tus-resumable" =     (
        "1.0.0"
    );
} }))

Here's how I'm using TUS:

let tusEndpoint = Environment.SUPABASE_URL.appendingPathComponent("storage/v1/upload/resumable")

self.tusClient = try TUSClient(
    server: tusEndpoint,
    sessionIdentifier: "media-upload-queue",
    chunkSize: 6 * 1024 * 1024
)
let context: [String : String] = {
  var context: [String : String] = [:]

  if let mimeType = self.mimeType(forFile: fileURL) {
      context["contentType"] = mimeType
  }

  context["bucketName"] = <bucketName
  context["objectName"] = <fileName>

  return context
}()

let headers: [String : String] = {
  var headers: [String : String] = [:]

  headers["authorization"] = "Bearer <token>"
  headers["x-upsert"] = "true"

  return headers
}()

try self.tusClient.uploadFileAt(
  filePath: fileURL,
  customHeaders: headers,
  context: context
)
MilesV64 commented 1 year ago

Actually re-reading this issue seems to imply that this is the same problem just coming at it from a different angle - I don't really care about actively uploading in the background, but I imagine what's happening is it think it gets a network error and then tries to upload bytes the server already knows about? I guess I imagined that the server should just accept overwriting bytes it already has to avoid cases like this

MilesV64 commented 1 year ago

Well that made me think "do network errors get handled at all" so I tried putting my phone in airplane mode and the upload immediately failed which is really undesirable behavior -- shouldn't the upload just pause until internet is restored?

MilesV64 commented 1 year ago

According to the spec:

The tus protocol is built upon the principles of simple pausing and resuming. To pause an upload, you are allowed to end the current open request. The server will store the uploaded data as long as no violations against other constraints (e.g., checksums) or internal errors occur. Once you are ready to resume an upload, send a HEAD request to the corresponding upload URL to obtain the available offset. After receiving a valid response, you can upload more data using PATCH requests.

So I wonder if the issue is that after a PATCH failing due to a network error (IE backgrounding or dropping connection), the library is not HEADing the url before continuing to PATCH?

donnywals commented 1 year ago

Hi Miles, there's a lot here so let me respond to some of your key points first before addressing the main question in your issue.

Well that made me think "do network errors get handled at all" so I tried putting my phone in airplane mode and the upload immediately failed which is really undesirable behavior -- shouldn't the upload just pause until internet is restored?

This is something we could consider but we'd be at the mercy of the OS as to when exactly the upload would actually start. The upload would actually be in a "pending" state for about 24 hours before the OS considers the request impossible to make which might be too long for now.

If you want to have this behavior where a restored connection starts the upload it might be best if you implemented this yourself since this isn't something we have on the roadmap right now and my initial thought is that we will not be providing this since it would most likely be non-trivial to allow users to opt-out or tweak this behavior to their needs.

So I wonder if the issue is that after a PATCH failing due to a network error (IE backgrounding or dropping connection), the library is not HEADing the url before continuing to PATCH?

Possibly, I'll need to debug this scenario

Now for the actual issue at hand When you do a bunch of for- and backgrounding of the app the request should be cancelled after a second or so in the background since iOS would suspend the app entirely, including any pending network requests. In our upcoming release we allow requests to continue uploading in the background which should make your problem a non-issue.

However I will look into what's happening in the current SDK. It kind of sounds like it starts tripping over itself which of course isn't desirable; we want TUSKit to be robust and reliable. So I don't have any answers for you right now but I would like to ask you to see if a build with dw/bg-urlsession yields functioning results so we can at least verify that our next update would mitigate the issue at hand albeit by accident.

I will come back to this issue once I've done some testing on our current SDK to see what's happening exactly

MilesV64 commented 1 year ago

Hey thanks a lot for the response, and yea sorry the issue covered a bunch of different topics as things came to me! Appreciate you addressing everything.

For network drops that makes sense to handle individually, as long as retrying doesn't cause the offset error. I can try to make some logic there and see if I do run into the offset error.

The background stuff looks exciting, I'll check it out! I suppose another solution for my particular use-case is simple pausing and resuming the upload when back/foregrounding, which is easy and makes sense to do myself rather than adding that to the library.

Acconut commented 1 year ago

I guess I imagined that the server should just accept overwriting bytes it already has to avoid cases like this

No, tus servers never overwrite data. The client must always continue the upload where it was left off.

I tried putting my phone in airplane mode and the upload immediately failed which is really undesirable behavior -- shouldn't the upload just pause until internet is restored?

I don't think this behavior is desired by every developer, so TUSKit should not behave like this by default. This should rather be left to implement by the developers themselves.

So I wonder if the issue is that after a PATCH failing due to a network error (IE backgrounding or dropping connection), the library is not HEADing the url before continuing to PATCH?

You are right. TUSKit should send a HEAD request before continuing to PATCH if it resumes an upload after a network interruption. @donnywals Do you think that there are cases where this does not happen? Do we need to look into this?