guzzle / psr7

PSR-7 HTTP message library
MIT License

Parsing Link header doesn’t handle URLs with commas #595

Closed jonnybarnes closed 3 months ago

jonnybarnes commented 4 months ago

PHP version: 8.2.12

Description: A possible issue parsing the Link header, though it is also possible that the site isn’t standards compliant.

How to reproduce: Parse the Link header returned by this URL: https://cloudinary.com/blog/jpeg-xl-and-the-pareto-front

For completeness, here’s the current response:

<https://res.cloudinary.com>; rel="preconnect", <https://res.cloudinary.com>; rel="dns-prefetch", <https://use.typekit.net>; rel="preconnect"; crossorigin, <https://use.typekit.net>; rel="preconnect", <https://use.typekit.net>; rel="dns-prefetch", <https://p.typekit.net>; rel="preconnect", <https://p.typekit.net>; rel="dns-prefetch", <www.googletagmanager.com>; rel="preconnect", <www.googletagmanager.com>; rel="dns-prefetch", <https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_750/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA>; rel="preload"; as="image"; imagesizes="(min-width: 77.5em) 750px, (min-width: 62em) calc(100vw - 320px - 4.25em - 5.625em), (min-width: 60em) calc(100vw - 320px - 4.25em), 100vw"; imagesrcset="https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_320/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 320w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_610/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 610w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_787/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 787w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_872/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 872w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1014/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1014w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1153/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1153w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1274/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1274w, 
https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1375/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1375w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1498/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1498w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1606/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1606w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1707/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1707w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1798/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1798w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1892/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1892w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_1968/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 1968w, https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_2000/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA 2000w", <https://cloudinary.com/blog/wp-json/>; rel="https://api.w.org/"

It contains a URL with commas in its path:

https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_750/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg

The commas in that URL break the parsing. Calling Header::parse gives this array:

[
  [
    0 => "<https://res.cloudinary.com>",
    "rel" => "preconnect",
  ],
  [
    0 => "<https://res.cloudinary.com>",
    "rel" => "dns-prefetch",
  ],
  [
    0 => "<https://use.typekit.net>",
    "rel" => "preconnect",
    1 => "crossorigin",
  ],
  [
    0 => "<https://use.typekit.net>",
    "rel" => "preconnect",
  ],
  [
    0 => "<https://use.typekit.net>",
    "rel" => "dns-prefetch",
  ],
  [
    0 => "<https://p.typekit.net>",
    "rel" => "preconnect",
  ],
  [
    0 => "<https://p.typekit.net>",
    "rel" => "dns-prefetch",
  ],
  [
    0 => "<www.googletagmanager.com>",
    "rel" => "preconnect",
  ],
  [
    0 => "<www.googletagmanager.com>",
    "rel" => "dns-prefetch",
  ],
  [
    "<https://res.cloudinary.com/cloudinary-marketing/images/c_fill",
  ],
  [
    "w_750/f_auto",
  ],
  [
    "q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i" => "AA>",
    "rel" => "preload",
    "as" => "image",
    "imagesizes" => "(min-width: 77.5em) 750px, (min-width: 62em) calc(100vw - 320px - 4.25em - 5.625em), (min-width: 60em) calc(100vw - 320px - 4.25em), 100vw",
    "imagesrcset" => "https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_320/f_auto,q_auto/v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i",
  ],
  [
    0 => "<https://cloudinary.com/blog/wp-json/>",
    "rel" => "https://api.w.org/",
  ],
]
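
In the array above, the commas inside the quoted imagesizes value stay together, while the commas inside the <...> URL split the entry into pieces. As an illustration only, here is a minimal sketch of a splitter that treats commas inside both "..." and <...> as literal. `splitLinkValues` is a hypothetical helper, not part of guzzle/psr7 and not how the library is implemented; it ignores backslash escapes inside quoted strings, and the header is a shortened excerpt of the one above.

```php
<?php
// Hypothetical helper, NOT part of guzzlehttp/psr7.
// Splits a Link header into link-values on commas that sit outside
// both <...> and "..." (backslash escapes are not handled).
function splitLinkValues(string $header): array
{
    $values = [];
    $current = '';
    $inQuotes = false;
    $inAngle = false;

    foreach (str_split($header) as $char) {
        if ($char === '"' && !$inAngle) {
            $inQuotes = !$inQuotes;
        } elseif ($char === '<' && !$inQuotes) {
            $inAngle = true;
        } elseif ($char === '>' && !$inQuotes) {
            $inAngle = false;
        }

        // Only a comma outside quotes and angle brackets ends a link-value.
        if ($char === ',' && !$inQuotes && !$inAngle) {
            $values[] = trim($current);
            $current = '';
            continue;
        }
        $current .= $char;
    }

    if (trim($current) !== '') {
        $values[] = trim($current);
    }

    return $values;
}

// Shortened excerpt of the header from this issue.
$header = '<https://res.cloudinary.com>; rel="preconnect", '
    . '<https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_750/f_auto,q_auto/'
    . 'v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA>; '
    . 'rel="preload"; as="image"; imagesizes="(min-width: 77.5em) 750px, 100vw"';

$values = splitLinkValues($header);
print_r($values);
```

With this excerpt the sketch yields two link-values, and the second keeps its URL intact, commas included.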

GrahamCampbell commented 4 months ago

Thanks for getting in touch. I believe that's because that URL is invalid according to the RFC that defines the syntax for a URL. The commas need to be replaced with %2C.
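
A rough sketch of the %2C replacement suggested above, applied on the client side before the header is parsed. `encodeCommasInTargets` is a hypothetical helper, not part of guzzle/psr7. It assumes no `<` or `>` appears inside a quoted parameter value, and that the server treats `%2C` and `,` in the path as the same resource, which is not guaranteed.

```php
<?php
// Hypothetical helper, NOT part of guzzlehttp/psr7.
// Percent-encodes commas inside each <...> target so that a
// comma-splitting parser no longer sees them as separators.
function encodeCommasInTargets(string $header): string
{
    $result = preg_replace_callback(
        '/<([^>]*)>/',
        static fn (array $m): string => '<' . str_replace(',', '%2C', $m[1]) . '>',
        $header
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $header;
}

// Shortened excerpt of the header from this issue.
$header = '<https://res.cloudinary.com/cloudinary-marketing/images/c_fill,w_750/f_auto,q_auto/'
    . 'v1708730240/jpgxl_pareto_front-blog/jpgxl_pareto_front-blog-jpg?_i=AA>; '
    . 'rel="preload"; as="image"; imagesizes="(min-width: 77.5em) 750px, 100vw"';

$encoded = encodeCommasInTargets($header);
echo $encoded, PHP_EOL;
```

The commas inside the quoted imagesizes value are left untouched; only the target between the angle brackets is rewritten. The resulting string could then be passed to Header::parse.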