skymethod / denoflare

Develop, test, and deploy Cloudflare Workers with Deno.
https://denoflare.dev
MIT License
692 stars 33 forks source link

Etag & CORS issues in public buckets #22

Closed defido closed 2 years ago

defido commented 2 years ago

Hi there, the public read has a couple of issues.

  1. The second time you try to download a file you will recieve this error, the string will be different. I have tried this with multiple files. Status Code: 500 TypeError: Conditional ETag should not be wrapped in quotes ("d03c2f258e9fc6e7236f61c130ed1e48").

  2. Range files, specifically m3u8 video files don't work.

Using this: denoflare push https://raw.githubusercontent.com/skymethod/denoflare/v0.5.4/examples/r2-public-read-worker/worker.ts

defido commented 2 years ago

Here is a working script that works with m3u8 video files & range

interface Env {
  KEMAL: KEMAL
}

function parseRange(encoded: string | null): undefined | { offset: number, length: number } {
  if (encoded === null) {
    return
  }

  const parts = encoded.split("bytes=")[1]?.split("-") ?? []
  if (parts.length !== 2) {
    throw new Error('Not supported to skip specifying the beginning/ending byte at this time')
  }

  return {
    offset: Number(parts[0]),
    length: Number(parts[1]) + 1 - Number(parts[0]),
  }
}

function objectNotFound(objectName: string): Response {
  return new Response(`<html><body>R2 object "<b>${objectName}</b>" not found</body></html>`, {
    status: 404,
    headers: {
      'content-type': 'text/html; charset=UTF-8'
    }
  })
}

export default {
  async fetch(request: Request, env: Env, context: any): Promise<Response> {
    const url = new URL(request.url)
    const objectName = url.pathname.slice(1)

    console.log(`${request.method} object ${objectName}: ${request.url}`)

    if (request.method === 'GET' || request.method === 'HEAD') {
      if (objectName === '') {
        if (request.method == 'HEAD') {
          return new Response(undefined, { status: 400 })
        }

        const options: R2ListOptions = {
          prefix: url.searchParams.get('prefix') ?? undefined,
          delimiter: url.searchParams.get('delimiter') ?? undefined,
          cursor: url.searchParams.get('cursor') ?? undefined,
          include: ['customMetadata', 'httpMetadata'],
        }
        console.log(JSON.stringify(options))

        const listing = await env.KEMAL.list(options)
        return new Response(JSON.stringify(listing), {headers: {
          'content-type': 'application/json; charset=UTF-8',
        }})
      }

      if (request.method === 'GET') {
        const range = parseRange(request.headers.get('range'))
        const object = await env.KEMAL.get(objectName, {
          range,
          onlyIf: request.headers,
        })

        if (object === null) {
          return objectNotFound(objectName)
        }

        // Construct the cache key from the cache URL
        // const cacheKey = new Request(url.toString(), request);
        // const cache = caches.default;

        // Check whether the value is already available in the cache
        // if not, you will need to fetch it from R2, and store it in the cache
        // for future access
        // let response = await cache.match(cacheKey);
        // context.waitUntil(cache.put(cacheKey, response.clone()));

        const headers = new Headers()
        const status = object.body ? (range ? 206 : 200) : 304
        object.writeHttpMetadata(headers)
        headers.set('etag', object.httpEtag)
        headers.set('Access-Control-Allow-Origin', '*')
        headers.set('Cache-Control', 's-max-age=3600');
        const response = new Response(object.body, {
          headers,
          status
        })
        return response
      }

      const object = await env.KEMAL.head(objectName, {
        onlyIf: request.headers,
      })

      if (object === null) {
        return objectNotFound(objectName)
      }

      const headers = new Headers()
      object.writeHttpMetadata(headers)
      headers.set('etag', object.httpEtag)
      return new Response(null, {
        headers,
      })
    }
    if (request.method === 'PUT' || request.method == 'POST') {
      const object = await env.KEMAL.put(objectName, request.body, {
        httpMetadata: request.headers,
      })
      return new Response(null, {
        headers: {
          'etag': object.httpEtag,
        }
      })
    }
    if (request.method === 'DELETE') {
      await env.KEMAL.delete(url.pathname.slice(1))
      return new Response()
    }

    return new Response(`Unsupported method`, {
      status: 400
    })
  }
}
johnspurlock-skymethod commented 2 years ago

What is the http client in your use case?

m3u8 files do not typically need range requests, do you mean the underlying .ts files?

Any way you can share what your R2 bucket objects look like or how they were uploaded?

I think there is an issue in the v0.5.4 etag handling, but Range-requests work for me.

johnspurlock-skymethod commented 2 years ago

Ok fixed the etag handling in commit 46fb4554c2d00297d24bfe48a2eff83b0f29c598, try it out on your side and see if it works in your client setup.

To use the version as of that commit, replace the release tag with the commit hash: denoflare push https://raw.githubusercontent.com/skymethod/denoflare/46fb4554c2d00297d24bfe48a2eff83b0f29c598/examples/r2-public-read-worker/worker.ts

I tested this myself using an hls generated with: ffmpeg -i input.mp4 -hls_playlist_type vod out.m3u8, and using rclone to upload to an R2 bucket, then testing the m3u8 with iina, quicktime, and also the input mp4 as well and the browser. Range requests work fine for me.

defido commented 2 years ago

Ok fixed the etag handling in commit

Yes that seems to be fixed. The caching solution used is https://github.com/kotx/render and it works. The range video files I had tested extensively and it doesn't work still with denoflare afaik.

If it fails you will see: A network error occurred: manifestLoadError

I test the range here: https://hls-js.netlify.app/

johnspurlock-skymethod commented 2 years ago

Ah, I think I see what you're running into: CORS

Take a look at the Error box in that test page and: This might be a CORS issue, which is indeed the case by confirming in dev tools.

I didn't run into this yet, since I'm hosting the content first-party, so CORS is not applicable.

Note your examples hardcode the most permissive CORS policy: headers.set('Access-Control-Allow-Origin', '*')

It's tricky, because a generic public read worker like this denoflare example should not make everything available via CORS by default.

S3 has a way to configure with configuration: https://docs.aws.amazon.com/AmazonS3/latest/userguide/ManageCorsUsing.html

Hopefully R2's native public read functionality will have something similar.

In the meantime, I guess I'll have to add some way of configuring this using another custom env binding.

defido commented 2 years ago

Ah, I think I see what you're running into: CORS

You are correct. I just changed the cors and it works now. Hopefully R2 will add configs as you said. Right now it's a little finicky with caching, public buckets etc.

function computeHeaders(obj, range) {
    const headers = new Headers();
    obj.writeHttpMetadata(headers);
    console.log('writeHttpMetadata', [
        ...headers
    ].map((v)=>v.join(': ')
    ).join(', '));
    headers.set('content-length', String(obj.size));
    headers.set('etag', obj.httpEtag);
    headers.set('last-modified', obj.uploaded.toUTCString());
    headers.set('Access-Control-Allow-Origin', '*'); <----- HERE
    if (range) headers.set('content-range', computeContentRange(range, obj.size));
    return headers;
}
johnspurlock-skymethod commented 2 years ago

Update: Doesn't sound like R2 public buckets will have CORS config at launch - so this will indeed still be useful to add to this example in the meantime.

I'll add something before cutting the next minor Denoflare release.

defido commented 2 years ago

Awesome, it's quite easy to edit atm, if people really need it. It's just a one line operation.

Screen Shot 2022-07-23 at 11 26 02 am
johnspurlock-skymethod commented 2 years ago

Included in the latest denoflare release v0.5.6.

Added two new optional CORS config vars to the public read example worker and updated the docs: https://denoflare.dev/examples/r2-public-read#configuration

The examples on that page also now show how to provide the most basic and permissive CORS access (Access-Control-Allow-Origin: *)

defido commented 2 years ago

Nice!

augustblack commented 2 years ago

46fb455,

Are you sure the fix here is correct?

I'm running into an issue where I fetch an R2 image first and it works.

If I reload the image , I get an error:

Error: Unexpected status 400, code=InvalidArgument, message=Invalid Argument: If-None-Match each ETag must be surrounded by double quotes

If I remove the stripEtagQuoting it works :

    const etagDoesNotMatch = ifNoneMatch
      ? ifNoneMatch                        
      : undefined 

Strangely, the issue only seems to happen in Chrome, and when running locally.

augustblack commented 2 years ago

actually, I can also get it to report the same error with Firefox and a longer mp3 file.
I'm guessing it must be some weird combination of 'range' and 'If-None-Match'

johnspurlock-skymethod commented 2 years ago

I'll take a look, the requirements around etags have already changed a few times on Cloudflare's side, I'll retest.

You're saying it works fine when running on cf, but not when running denoflare serve locally?

I'm assuming you're using the latest version of the public read worker?

johnspurlock-skymethod commented 2 years ago

I was able to reproduce quickly locally (denoflare serve), production seems fine.

Looks like they did indeed change etag validation requirements, but this time in their S3-implementation, not the worker bindings.

Just posted a fix in commit c051c53ff6f1dbad10d5fd4439e0f892c56572a3, can you verify it on your side?

To try this version, just substitute denoflare/v0.5.8 with denoflare/c051c53ff6f1dbad10d5fd4439e0f892c56572a3 in the instructions on the R2 public read guide page

augustblack commented 2 years ago

I still the same issue.

Output from the example worker in recent chrome using denoflare serve:

Bundled https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/cli-webworker/worker.ts (process) in 1472ms
runScript: https://raw.githubusercontent.com/skymethod/denoflare/c051c53ff6f1dbad10d5fd4439e0f892c56572a3/examples/r2-public-read-worker/worker.ts
Bundled https://raw.githubusercontent.com/skymethod/denoflare/c051c53ff6f1dbad10d5fd4439e0f892c56572a3/examples/r2-public-read-worker/worker.ts (process) in 17ms
worker: start
Started in 1786ms (isolation=isolate)
Local server running on http://localhost:3000
fetchExternalIp: Fetching...
fetchExternalIp: Determined to be 128.138.65.129 in 127ms
GET http://localhost:3000/images/711-pictograms.jpg
{"flags":[]}
request headers:
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9,de-AT;q=0.8,de;q=0.7,es-US;q=0.6,es;q=0.5
cache-control: max-age=0
cf-connecting-ip: 128.138.65.129
cf-ray: 8808ce8362916cca
connection: keep-alive
cookie: connect.sid=<blah>
dnt: 1
host: localhost:3000
if-modified-since: Sun, 04 Sep 2022 23:58:08 GMT
if-none-match: "197413f08652a6bc89f7f4181ead90d3"
sec-ch-ua: "Google Chrome";v="105", "Not)A;Brand";v="8", "Chromium";v="105"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: none
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: <blah>
bucket.get images/711-pictograms.jpg {"onlyIf":{"etagDoesNotMatch":"197413f08652a6bc89f7f4181ead90d3","uploadedAfter":"2022-09-04T23:58:09.000Z"}}

error displayed in browser:

Error: Unexpected status 400, code=InvalidArgument, message=Invalid Argument: If-None-Match each ETag must be surrounded by double quotes, got '197413f08652a6bc89f7f4181ead90d3'.
    at throwIfUnexpectedStatus (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/common/r2/r2.ts:147:11)
    at async getOrHeadObject (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/common/r2/get_head_object.ts:36:5)
    at async getObject (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/common/r2/get_head_object.ts:6:12)
    at async ApiR2Bucket.get (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/cli/api_r2_bucket.ts:60:21)
    at async https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/common/rpc_r2_bucket.ts:39:34
    at async RpcChannel.receiveMessage (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/common/rpc_channel.ts:41:36)
    at async Worker.worker.onmessage (https://raw.githubusercontent.com/skymethod/denoflare/v0.5.8/cli/worker_manager.ts:70:17)
johnspurlock-skymethod commented 2 years ago

Sorry, my fault - bad instructions. This fix was in denoflare itself (the way local buckets work), the public-read worker is unchanged.

So you can test locally by installing denoflare as of that commit, by substituting denoflare/v0.5.8 with denoflare/c051c53ff6f1dbad10d5fd4439e0f892c56572a3 in the installation page

augustblack commented 2 years ago

disco. looking good now.

johnspurlock-skymethod commented 2 years ago

Thanks - since this only affects local dev and there is a temporary workaround for it (installing to a commit), I won't cut an emergency release version for it, but it will go out in the next release