umijs / mako

An extremely fast, production-grade web bundler based on Rust.
https://makojs.dev
MIT License

bug: svgr-rs does not support unicode chars #1600

Open stormslowly opened 1 week ago

stormslowly commented 1 week ago

problem

...
<title>&amp;中文</title>
...
panicked at svgr-rs/src/hast_to_swc_ast/decode_xml.rs:60:21:
byte index 6 is not a char boundary; it is inside '文' (bytes 5..8) of `x&中文`

solution

We don't have to handle escaping by hand; use the html-escape crate instead.

This will be fixed in PR #1444.

stormslowly commented 1 week ago

The root cause is below, in svgr-rs:

  1. swc_xml_parser parses the SVG content. In the resulting HTML AST, the text child of the title tag becomes &中文 (ref).
  2. svgr-rs then tries to convert the HTML AST to an SWC JSX AST. HTML entities are handled at this point, but by walking the text as raw bytes via as_bytes().
  3. svgr-rs peeks 2 or 4 bytes after & to convert them to a string, and then hits the error: byte index 6 is not a char boundary.
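
The failure in step 3 can be reproduced with the standard library alone. The sketch below is not the svgr-rs code: `peek_after_amp` is a hypothetical helper that only mimics "peek a fixed number of bytes after `&`". It uses `str::get`, which returns `None` when a range ends inside a multi-byte character, whereas direct slicing (`&text[a..b]`) would panic.

```rust
/// Hypothetical helper (not taken from svgr-rs): return the `len` bytes that
/// follow the first `&` in `text`. Returns `None` if there is no `&`, if the
/// range runs past the end, or if it would cut a multi-byte character in half.
fn peek_after_amp(text: &str, len: usize) -> Option<&str> {
    // `&` is a single byte in UTF-8, so the peek starts one byte after it.
    let start = text.find('&')? + 1;
    // `str::get` checks char boundaries; `&text[start..start + len]` would
    // panic on the same input instead of returning `None`.
    text.get(start..start + len)
}
```

With the string from the panic message, `x&中文`, the `&` sits at byte 1 and each CJK character takes 3 bytes, so a 4-byte peek covers bytes 2..6 and ends inside `文` (bytes 5..8). `peek_after_amp("x&中文", 4)` therefore returns `None`, while `peek_after_amp("x&amp;y", 4)` returns `Some("amp;")`.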

IMO, the best way to solve the problem is for swc_xml_parser to unescape the text children of the HTML itself.

stormslowly commented 1 week ago

Besides this problem, Mako should wrap compile() in catch_unwind, so that in cases like this the Node process gets an error rather than a process abort.
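
A minimal sketch of that idea, assuming a plain synchronous function: the `compile` below is a stand-in that panics the same way as the slice above, not Mako's real `compile()`, and `compile_safely` is a hypothetical wrapper name. It uses `std::panic::catch_unwind` to turn the panic into an `Err` the caller can report.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Stand-in for a compile step that may panic (hypothetical; not Mako's API).
fn compile(input: &str) -> String {
    // Slicing at a non-char-boundary panics, as in the svgr-rs report.
    input[2..6].to_string()
}

/// Hypothetical wrapper: run `compile` and convert a panic into `Err(message)`.
fn compile_safely(input: &str) -> Result<String, String> {
    panic::catch_unwind(AssertUnwindSafe(|| compile(input))).map_err(|payload| {
        // Panic payloads are usually a `&str` or a `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            msg.to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "unknown panic".to_string()
        }
    })
}
```

Here `compile_safely("x&amp;y")` returns `Ok("amp;")`, and `compile_safely("x&中文")` returns an `Err` carrying the panic message. Two caveats: `catch_unwind` only works when the crate is built with `panic = "unwind"` (it cannot catch anything under `panic = "abort"`), and the default panic hook still prints the message to stderr.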

SyMind commented 6 days ago

@stormslowly I have published version 0.2.0 to incorporate your PR.