unjs / fontaine

Automatic font fallback based on font metrics
MIT License
1.47k stars 23 forks source link

The URL_RE doesn't work if the font URL contains a ")" close paren character #296

Open alex-sherwin opened 8 months ago

alex-sherwin commented 8 months ago

🐛 The bug

There are some packages, for example, the inter-ui NPM package (which is recommended by the official inter GH readme for NPM) which, very unfortunately, have parens in some of the folder names.

For example from the inter-ui package the fonts are under folders with parens in their names.

Example font file: inter-ui/Inter (web)/Inter-Regular.woff2

The URL_RE that fontaine is using is defined as

const URL_RE = createRegExp(
  exactly("url(").and(charNotIn(")").times.any().as("url")).and(")"),
  ["g"]
);

So no matter what it breaks with these URLs.

The problem is that source URL would get regex'd out as inter-ui/Inter (web and then the next check is to see if the source ends with one of the supported extensions (and since the regex mangled the URL, the extension was lost).

🛠️ To reproduce

https://stackblitz.com/edit/github-hdufeg?file=index.css

🌈 Expected behaviour

Parsing of the source url should support any valid URL characters

ℹ️ Additional context

No response

alex-sherwin commented 8 months ago

MDN has some details about the flexible varaitions that the CSS url() supports.

Based on the description and some googling, I'm not sure that this formal spec can be succinctly represented by a single regex.

Nearly all answers and examples for this specific parsing task that I can find use a naive variation of what this library is already doing.

alex-sherwin commented 8 months ago

I’ve played with bringing in lightningcss and using a url visitor to aggregate all valid urls in a font face rule instead of the URL_RE behavior

I believe it works well, except, it will behave differently for the existing test case of invalid syntax on the family name when the quotes aren’t closed properly.

I’ll look to open a PR and possibly hide it behind a feature flag

alex-sherwin commented 8 months ago

Using lightningcss to parse the @font-face rules definitely works, however, it would be in a stricter, more standards compliant way.

This means potential behavioral differences to how this library currently behaves, which is using some relatively naive regexes that make it usually do the right thing even when it's presented with relatively or very invalid @font-face rule definitions....

Happy to pursue this avenue if there's interest, otherwise, maybe a URL_RE update with a better regex (but still imperfect) would be sufficient

alex-sherwin commented 8 months ago

This may be a good regex to use for the URL_RE in lieu of using a full blown CSS AST parser:

url\((?:(?:"((?:\\.|[^"\\])*)")|(?:'((?:\\.|[^'\\])*)')|((?:[^'")\\]*(?:\\.[^'")\\]*)*)))\)

Seems to work reasonably well and accounts for most of the ways a valid CSS url() can be specified, including unquoted URLs with escaped parens, or quoted URLs with parens (no escaping needed, as per spec)

Works with the following input URLs

url("https://somesite.com/static/ok.ttf")
url("http://somesite.com/static/ok.ttf")
url("http://www.somesite.com:8080/static/ok.woff2")
url("https://www.somesite.com:443/static/ok.woff2")
url("https://ok.)woff")
url('https://somesite.com/static/ok.ttf')
url('http://somesite.com/static/ok.ttf')
url('http://www.somesite.com:8080/static/ok.woff2')
url('https://www.somesite.com:443/static/ok.woff2')
url('https://ok.)woff')
url(https://somesite.com/static/ok.ttf)
url(http://somesite.com/static/ok.ttf)
url(http://www.somesite.com:8080/static/ok.woff2)
url(https://www.somesite.com:443/static/ok.woff2)
url(https://ok.\)woff)
url(ok)
url("ok")
url('ok')
url('is )ok.woff')
url(what\ ablut)
url(isthis\)ok.woff)
url(../node_modules/some-font/with\ \(parens\)/ok.woff)
url("../node_modules/some-font-with (parens)/ok.woff")
url('../node_modules/some-font-with (parens)/ok.woff')
url(\as+fa==)

See https://regex101.com/r/7eVGaW/2