Closed marlass closed 1 month ago
On closer look, it appears there's no need to split these ourselves at all, given that the DOM element's style property already has this information. Does the attached patch look reasonable to you?
Thanks! That looks nice.
I'm only not sure if access to a CSS property value through [name] is fully compatible with the CSSStyleDeclaration interface. I'm having trouble accessing it like this in the browser.
We are using zeed-dom as a lightweight DOM parser, and this change exposed the same issue there as well 😅. Additionally, it seems that its style getter interface is not at all compatible with CSSStyleDeclaration. I'll report the issue there as well.
I'm only not sure if access to a CSS property value through [name] is fully compatible with the CSSStyleDeclaration interface. I'm having trouble accessing it like this in the browser.
Could you elaborate on this? What exactly is going wrong?
I guess for 99.9% of the use cases it's alright, but with CSS vars, for example, different access methods return different results.
Oh, indeed. I didn't know getPropertyValue exists. That seems like a preferable approach. Adjusted in the attached patch.
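For readers following along, here is a minimal sketch of reading every declaration through getPropertyValue instead of [name]. It is not the attached patch: readStyles and fakeStyle are hypothetical names, and fakeStyle is a hand-written stand-in that only imitates the length, item() and getPropertyValue() members of CSSStyleDeclaration so the snippet runs outside a browser.

```javascript
// Collect [name, value] pairs from a CSSStyleDeclaration-like object.
// Uses only length, item(i) and getPropertyValue(name).
function readStyles(style) {
  const result = [];
  for (let i = 0; i < style.length; i++) {
    const name = style.item(i);
    result.push([name, style.getPropertyValue(name)]);
  }
  return result;
}

// Hypothetical stand-in for an element's style object (assumption, not a real DOM).
// Like a browser, it exposes "color" as a named property but not the custom
// property "--accent", which is only reachable via getPropertyValue().
const declared = { color: "red", "--accent": "blue" };
const fakeStyle = {
  color: "red",
  length: 2,
  item(i) {
    return Object.keys(declared)[i];
  },
  getPropertyValue(name) {
    return declared[name] ?? "";
  },
};

console.log(fakeStyle["--accent"]); // undefined: bracket access misses the CSS var
console.log(readStyles(fakeStyle)); // both declarations, including "--accent"
```

In a real browser the same function should work when passed element.style directly.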
Thanks for such a quick fix! I guess it can be closed now 🎉
Running DOMParser.fromSchema(schema).parse(dom, options).toJSON() on HTML with base64-encoded CSS values is extremely slow and produces incorrect output.
Minimal HTML example to parse:
After digging a bit into the code, it seems that the problem might be in the tokenizer for the style attribute.
The assumption that ; can be present only at the end of a CSS rule is not correct. I think the common case where that is not true is base64-encoded images (like in the example above). The tokenizer only returns the beginning of the encoded value ('url(data:image/svg+xml').
To make this even worse, these values are often long (it's not hard to find ~10kb strings in the wild), which contributes to the slow performance of the regex.
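To illustrate the truncation, here is a small sketch. The regex is a simplified stand-in I'm assuming for illustration, not necessarily the library's actual tokenizer, and naiveParseStyles is a hypothetical name. It treats every ; as the end of a value, so a data URL gets cut at the ; before base64.

```javascript
// Simplified, assumed tokenizer: "name: value" where the value runs up to the next ";".
function naiveParseStyles(style) {
  const re = /\s*([\w-]+)\s*:\s*([^;]+)/g;
  const result = [];
  let m;
  while ((m = re.exec(style))) {
    result.push([m[1], m[2].trim()]);
  }
  return result;
}

const style = "background-image: url(data:image/svg+xml;base64,AAAA); color: red";
console.log(naiveParseStyles(style));
// The first value comes back as "url(data:image/svg+xml", without the payload.
```

After the cut, the regex also has to retry a match at every position inside the leftover base64 payload, which is likely where the time goes on long values.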
Sample fix idea:
I'm sure that there are probably more cases than url() which could potentially cause this issue. content: "Random content:;" comes to mind, but I don't think it is that common (especially in inline styles).
That might be one of these cases where it's hard to find the right balance between spec compliance and correctness vs parsing speed.
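One possible middle ground, sketched here under my own assumptions (splitDeclarations is a hypothetical helper, not the sample fix above and not a spec-compliant CSS parser): a single pass that only treats ; as a separator when it is outside parentheses and quotes. That covers both the url() and the content: "…;" cases in linear time, though it ignores comments and other corner cases.

```javascript
// Split an inline style string into [name, value] pairs, ignoring ";"
// that appears inside parentheses or inside quoted strings.
function splitDeclarations(style) {
  const result = [];
  const push = (decl) => {
    const colon = decl.indexOf(":");
    if (colon <= 0) return; // no property name, skip
    const name = decl.slice(0, colon).trim();
    const value = decl.slice(colon + 1).trim();
    if (name && value) result.push([name, value]);
  };

  let depth = 0; // current parenthesis nesting
  let quote = null; // the open quote character, if any
  let start = 0; // start index of the current declaration
  for (let i = 0; i < style.length; i++) {
    const ch = style[i];
    if (quote) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === quote) quote = null;
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth = Math.max(0, depth - 1);
    } else if (ch === ";" && depth === 0) {
      push(style.slice(start, i));
      start = i + 1;
    }
  }
  push(style.slice(start)); // trailing declaration without a final ";"
  return result;
}

console.log(
  splitDeclarations(
    'background: url(data:image/svg+xml;base64,AAAA); content: "Random content:;"; color: red'
  )
);
```

Each character is visited once, so long base64 values should not cause the repeated rescanning that the regex approach does.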