Open alex-sherwin opened 8 months ago
MDN has some details about the flexible varaitions that the CSS url()
supports.
Based on the description and some googling, I'm not sure that this formal spec can be succinctly represented by a single regex.
Nearly all answers and examples for this specific parsing task that I can find use a naive variation of what this library is already doing.
I’ve played with bringing in lightningcss and using a url visitor to aggregate all valid urls in a font face rule instead of the URL_RE behavior
I believe it works well, except, it will behave differently for the existing test case of invalid syntax on the family name when the quotes aren’t closed properly.
I’ll look to open a PR and possibly hide it behind a feature flag
Using lightningcss to parse the @font-face rules definitely works, however, it would be in a stricter, more standards compliant way.
This means potential behavioral differences to how this library currently behaves, which is using some relatively naive regexes that make it usually do the right thing even when it's presented with relatively or very invalid @font-face
rule definitions....
Happy to pursue this avenue if there's interest, otherwise, maybe a URL_RE
update with a better regex (but still imperfect) would be sufficient
This may be a good regex to use for the URL_RE
in lieu of using a full blown CSS AST parser:
url\((?:(?:"((?:\\.|[^"\\])*)")|(?:'((?:\\.|[^'\\])*)')|((?:[^'")\\]*(?:\\.[^'")\\]*)*)))\)
Seems to work reasonably well and accounts for most of the ways a valid CSS url()
can be specified, including unquoted URLs with escaped parens, or quoted URLs with parens (no escaping needed, as per spec)
Works with the following input URLs
url("https://somesite.com/static/ok.ttf")
url("http://somesite.com/static/ok.ttf")
url("http://www.somesite.com:8080/static/ok.woff2")
url("https://www.somesite.com:443/static/ok.woff2")
url("https://ok.)woff")
url('https://somesite.com/static/ok.ttf')
url('http://somesite.com/static/ok.ttf')
url('http://www.somesite.com:8080/static/ok.woff2')
url('https://www.somesite.com:443/static/ok.woff2')
url('https://ok.)woff')
url(https://somesite.com/static/ok.ttf)
url(http://somesite.com/static/ok.ttf)
url(http://www.somesite.com:8080/static/ok.woff2)
url(https://www.somesite.com:443/static/ok.woff2)
url(https://ok.\)woff)
url(ok)
url("ok")
url('ok')
url('is )ok.woff')
url(what\ ablut)
url(isthis\)ok.woff)
url(../node_modules/some-font/with\ \(parens\)/ok.woff)
url("../node_modules/some-font-with (parens)/ok.woff")
url('../node_modules/some-font-with (parens)/ok.woff')
url(data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o\as+fa==)
🐛 The bug
There are some packages, for example, the inter-ui NPM package (which is recommended by the official inter GH readme for NPM) which, very unfortunately, have parens in some of the folder names.
For example from the inter-ui package the fonts are under folders with parens in their names.
Example font file:
inter-ui/Inter (web)/Inter-Regular.woff2
The
URL_RE
that fontaine is using is defined asSo no matter what it breaks with these URLs.
The problem is that source URL would get regex'd out as
inter-ui/Inter (web
and then the next check is to see if the source ends with one of the supported extensions (and since the regex mangled the URL, the extension was lost).🛠️ To reproduce
https://stackblitz.com/edit/github-hdufeg?file=index.css
🌈 Expected behaviour
Parsing of the source url should support any valid URL characters
ℹ️ Additional context
No response