cloudflare / workers-sdk

⛅️ Home to Wrangler, the CLI for Cloudflare Workers®
https://developers.cloudflare.com/workers/
Apache License 2.0
2.7k stars 709 forks source link

🐛 BUG: Unable to put an object with a key containing '...' to R2 #3520

Open tkngaejcpi opened 1 year ago

tkngaejcpi commented 1 year ago

Which Cloudflare product(s) does this pertain to?

R2

What version of Wrangler are you using?

3.1.1

What operating system are you using?

Linux

Describe the Bug

After I exec wrangler r2 object put 'blog/assets/_...slug_.abcd1234.css' -f ./dist/assets/_...slug_.abcd1234.css, it returned Failed to fetch /accounts/*/r2/buckets/*/objects/assets/_...slug_.abcd1234.css - 403: Forbidden).

But when I changed the key into blog/assets/_.slug_.abcd1234.css, which has no consecutive '.', it just uploaded without any error.

I think it could be a bug of path handling because the original file (whose key contains three consecutive '.') could be uploaded in the R2 webpage.

RamIdeas commented 1 year ago

I think the interpolated strings (specifically objectName) in the endpoint pathname need to be escaped here https://github.com/cloudflare/workers-sdk/blob/c4e6f1565ac6ef38929c72d37ec27d158ec4f4ee/packages/wrangler/src/r2.ts#L111

dave-swift commented 1 year ago

I assume this is related, but a similar issue happens when a key contains a forwardslash (eg "log/2023-09-21/15:00.json").

The PUT is successful because I can see it in the sqlite database with the correct name. If I call SELECT * FROM _mf_objects, I get:

log/2023-09-21/15:00.json|67880c83ecd213cd99d85d84b741a73d079ab83996b6ae30250c85f36706615a0000018ab843be1d|b1547edcb91979d2dd5cd2c4916b26a8|1838611|ae7b047273edae8603a505929a60de79|1695308561994|{}|{}|{}

However, when I try to GET the file, it fails. Call: wrangler r2 object get mybucket/log/2023-09-21/15:00.json --local

⛅️ wrangler 3.9.0

Downloading "log/2023-09-21/15:00.json" from "mybucket". /usr/src/app/node_modules/wrangler/wrangler-dist/cli.js:30947 throw a; ^

Error: ENOENT: no such file or directory, open 'log/2023-09-21/15:00.json' Emitted 'error' event on WriteStream instance at: at emitErrorNT (node:internal/streams/destroy:151:8) at emitErrorCloseNT (node:internal/streams/destroy:116:3) at process.processTicksAndRejections (node:internal/process/task_queues:82:21) { errno: -2, code: 'ENOENT', syscall: 'open', path: 'log/2023-09-21/15:00.json' }

Node.js v18.18.0

JacobMGEvans commented 1 year ago

I think the interpolated strings (specifically objectName) in the endpoint pathname need to be escaped here

https://github.com/cloudflare/workers-sdk/blob/c4e6f1565ac6ef38929c72d37ec27d158ec4f4ee/packages/wrangler/src/r2.ts#L111

@RamIdeas The variables inside the ${} are automatically escaped, so there's no need to manually escape the interpolated strings. Could be we need to stringify the values to make sure variables don't contain any characters that are not allowed in a URL, we could just force it into a string.

petebacondarwin commented 10 months ago

I think the interpolated strings (specifically objectName) in the endpoint pathname need to be escaped here https://github.com/cloudflare/workers-sdk/blob/c4e6f1565ac6ef38929c72d37ec27d158ec4f4ee/packages/wrangler/src/r2.ts#L111

@RamIdeas The variables inside the ${} are automatically escaped, so there's no need to manually escape the interpolated strings. Could be we need to stringify the values to make sure variables don't contain any characters that are not allowed in a URL, we could just force it into a string.

I am pretty confident that there is no escaping going on here. The objectName is not escaped at the time it is passed to cfetch(), which in turn passes it unchanged into unidici's fetch(), which I believe will also not do any escaping?

That being said I don't know if escaping would be a full solution, unless our API is designed to unescape the string when it reads it in... we'd need to check. I guess the fact that it was possible to upload the key via the R2 webpage implies that escaping would be fine.

petebacondarwin commented 10 months ago

I also note that

encodeURIComponent("a...b")
> 'a...b'

So encoding the path component will make no difference.

petebacondarwin commented 10 months ago

OK so it looks like it is the Cloudflare firewall that is blocking URLs that contain .. within the path. Note that if you got to

https://api.cloudflare.com/client/v4/accounts/xxxx/r2/buckets/test/objects/_...slug_.abcd1234.css

you will see that the firewall has blocked the request.

If instead you remove the extra dots and go to

https://api.cloudflare.com/client/v4/accounts/xxxx/r2/buckets/test/objects/_.slug_.abcd1234.css

Then you will see a different error (which is expected and comes from the R2 API itself).

I note that if I step through the Wrangler code I see the same HTML response as the first URL above when trying to put an object to R2 that contains two or more dots.