nodejs / release-cloudflare-worker

Infra for serving Node.js downloads and documentation.
https://nodejs.org/dist
MIT License
22 stars 5 forks source link

fix: return a 400 for malformed URIs #64

Closed flakey5 closed 10 months ago

flakey5 commented 10 months ago

See https://nodejs-org.sentry.io/issues/4622681006/events/3a4e26a6669c46ffb84f20369e90ac8c/

ovflowd commented 10 months ago

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

flakey5 commented 10 months ago

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

Honestly I don't know, my best guess is that calling new URL doesn't validate the path and instead focuses just on the structure of the url itself?

ovflowd commented 10 months ago

Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29

Honestly I don't know, my best guess is that calling new URL doesn't validate the path and instead focuses just on the structure of the url itself?

Could also be lazy construction/validation

targos commented 10 months ago

I'm not sure you're supposed to call decodeURIComponent on the pathname. The URL API already decodes automatically what should be decoded.

ovflowd commented 10 months ago

I'm not sure you're supposed to call decodeURIComponent on the pathname. The URL API already decodes automatically what should be decoded.

Right, I think it is indeed unnecessary as URL already does the sanitizations that are needed on the pathname property.

Off-topic: @flakey5 I believe https://github.com/nodejs/release-cloudflare-worker/blob/b854489a8b2e331d1d5f2c9b8f7d43bf4a539537/src/util.ts#L53-L61 things like these, should be in the constants file. We should have all sorts of mappings like these in dedicated constants and explain them. The code is already becoming complex enough that often it's hard to find our way in some places.

Like I didn'te ven realise mapUrlPathToBucketPath was doing so much.

flakey5 commented 10 months ago

The call to decodeURIComponent isn't necessarily to validate the path, it's to decode the path from say /a%20b%20c to /a b c. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.

We should have all sorts of mappings like these in dedicated constants and explain them

Yeah that sounds like a good idea as well

targos commented 10 months ago

The call to decodeURIComponent isn't necessarily to validate the path, it's to decode the path from say /a%20b%20c to /a b c. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.

Yes, but why do we need it? There are no legitimate URLs that need percent encoding in the pathname.

targos commented 10 months ago

Another approach could be to call decodeURI on the entire URL before passing it to the URL constructor?

ovflowd commented 10 months ago

Another approach could be to call decodeURI on the entire URL before passing it to the URL constructor?

That sounds the best approach here, and to catch the broken URL directly when calling `new UR. (inside the parseURL) (with try/catch) so we return undefined -> BAD_REQUEST directly from there

flakey5 commented 10 months ago

If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back

ovflowd commented 10 months ago

If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back

I think the idea was to validate the url within parseUrl when we create new URL and have try/catch there to then return undefined.

But I guess we can also simply do it like this for now, and be reactive if errors arise.