inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.
MIT License
3.65k stars 231 forks source link

lookup fails on SVG `viewBox` attribute. #318

Open calvinjuarez opened 4 years ago

calvinjuarez commented 4 years ago

AST Explorer link: https://astexplorer.net/#/gist/25e9a360157ab2e56a7bf14ac22daa5b/196c723e25cf99ec923c3ce0be36664817b264d2

<svg viewBox="0 0 100 100"></svg>

Given the SVG element above, the AST produces an svg element node whose attrs array has a { name: 'viewBox', value: '0 0 100 100' } object, but that name doesn't match anything in the element's own sourceCodeLocation.attrs hash. Instead, that hash has a viewbox (all lowercase) key. (See the AST Explorer screenshot below).

Is this by design? If so, is it recommended that location lookup always use .toLowerCase()? Or are SVG camelCase attributes an exception that should be handled in a special case in AST consumers? Have I missed documentation on this?

Or is this a bug? If so, there are a number of SVG camelCase attributes that should likely be handled similarly.

Screen Shot 2020-06-23 at 2 02 16 PM
calvinjuarez commented 4 years ago

I'm looking into a fix for this in the location info mixin. The foreign-content.js adjustTokenSVGAttrs method does the transformation from viewboxviewBox, but it (obviously) doesn't transform the key in the location.attrs hash created by the mixin. I'm not sure the best way to fix this. It feels bad to add logic to the core that depends on a mixin, but it also feels weird to expose what's essentially private data in the core to the mixin.

If anyone can offer direction on how best to go about adjusting the location key in the way the attribute name is adjusted, I'd appreciate the help.

Would it be an awful idea to import/require the SVG_ATTRS_ADJUSTMENT_MAP out of foreign-content.js into the plugin?

calvinjuarez commented 4 years ago

Update: I've got a branch with my current attempt at a fix that I'd like thoughts on: https://github.com/calvinjuarez/parse5/commit/62f7a59a3246b53b60c98b56b6a91e3a4edb5e6c

It feels a little under-cooked at the moment. I'd love tips on improvement.

calvinjuarez commented 4 years ago

So looking closer, my changes would actually transform attributes on non-SVG elements, which is probably bad news. I'll likely need to expose the "are we in an svg" information from core so that the plugin can access it.

fb55 commented 2 years ago

Interesting issue, and not easy to fix. Lowercasing all attribute names before the lookup is a valid workaround for now. Based on how the HTML tokenizer works, attribute names will always be unique by their lowercase representation, and attribute locations will always be referenced by the lowercase name.