Closed flakey5 closed 10 months ago
Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29
Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29
Honestly I don't know, my best guess is that calling new URL
doesn't validate the path and instead focuses just on the structure of the url itself?
Quick question: Why does it crash there and not here? https://github.com/nodejs/release-cloudflare-worker/blob/main/src/handlers/get.ts#L29
Honestly I don't know, my best guess is that calling
new URL
doesn't validate the path and instead focuses just on the structure of the url itself?
Could also be lazy construction/validation
I'm not sure you're supposed to call decodeURIComponent
on the pathname. The URL API already decodes automatically what should be decoded.
I'm not sure you're supposed to call
decodeURIComponent
on the pathname. The URL API already decodes automatically what should be decoded.
Right, I think it is indeed unnecessary as URL already does the sanitizations that are needed on the pathname
property.
Off-topic: @flakey5 I believe https://github.com/nodejs/release-cloudflare-worker/blob/b854489a8b2e331d1d5f2c9b8f7d43bf4a539537/src/util.ts#L53-L61 things like these, should be in the constants file. We should have all sorts of mappings like these in dedicated constants and explain them. The code is already becoming complex enough that often it's hard to find our way in some places.
Like I didn'te ven realise mapUrlPathToBucketPath
was doing so much.
The call to decodeURIComponent
isn't necessarily to validate the path, it's to decode the path from say /a%20b%20c
to /a b c
. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.
We should have all sorts of mappings like these in dedicated constants and explain them
Yeah that sounds like a good idea as well
The call to
decodeURIComponent
isn't necessarily to validate the path, it's to decode the path from say/a%20b%20c
to/a b c
. From my testing in both workers and node the URL object seemingly doesn't decode the path for this automatically.
Yes, but why do we need it? There are no legitimate URLs that need percent encoding in the pathname.
Another approach could be to call decodeURI
on the entire URL before passing it to the URL constructor?
Another approach could be to call
decodeURI
on the entire URL before passing it to the URL constructor?
That sounds the best approach here, and to catch the broken URL directly when calling `new UR. (inside the parseURL) (with try/catch) so we return undefined -> BAD_REQUEST directly from there
If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back
If we don't need to validate the uri I think we can just go without it, if it becomes a need later on (e.g. file names with spaces) we can add it back
I think the idea was to validate the url within parseUrl
when we create new URL
and have try/catch there to then return undefined.
But I guess we can also simply do it like this for now, and be reactive if errors arise.
See https://nodejs-org.sentry.io/issues/4622681006/events/3a4e26a6669c46ffb84f20369e90ac8c/