gregberge / svgr

Transform SVGs into React components 🦁
https://react-svgr.com
MIT License
10.52k stars 418 forks source link

@svgr/hast-util-to-babel-ast erroneously treats semicolons in XML entities as style delimiters #963

Open SethFalco opened 4 months ago

SethFalco commented 4 months ago

🐛 Bug Report

The @svgr/hast-util-to-babel-ast package doesn't split styles correctly, and so breaks some SVGs that are processed through it.

It doesn't account for XML entities such as " or <, which both contain a semicolon which is acceptable when styles are defined in HTML, and the semicolon should not be considered a delimiter.

To Reproduce

I've written a test case for hast-util-to-babel-ast:

packages/hast-util-to-babel-ast/src/index.test.ts

it('properly converts style with xml entities', () => {
  const code = `<svg><text style="font-family:SFMono-Regular,Menlo,Monaco,Consolas,&quot;Liberation Mono&quot;,&quot;Courier New&quot;,monospace;font-weight:700">svgo --help</text></svg>`
  expect(transform(code)).toMatchInlineSnapshot(`
    "<svg><text style={{
        fontFamily: "SFMono-Regular,Menlo,Monaco,Consolas,&quot;Liberation Mono&quot;,&quot;Courier New&quot;,monospace",
        fontWeight: 700
      }}>{"svgo --help"}</text></svg>;"
  `)
})

Or alternatively, a test case for core:

packages/core/src/transform.test.ts

it('should handle styles with with xml entities', async () => {
  const result = await convertWithAllPlugins(
    `<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 128 32">
<style>
  #svgo-help {
    font-family: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace;
    font-weight: 700;
  }
</style>
<text x="0" y="0" id="svgo-help">svgo --help</text>
</svg>
\0`,
  )

  expect(result).toMatchSnapshot()
  expect(result).not.toContain('\0')
})

This currently creates the following snapshot, which is wrong but demonstrates the issue:

exports[`convert should handle styles with HTML escaped quotes correctly 1`] = `
"import * as React from 'react'
const SvgComponent = (props) => (
  <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 128 32" {...props}>
    <text
      style={{
        fontFamily: 'SFMono-Regular,Menlo,Monaco,Consolas,&quot',
        fontWeight: 700,
      }}
    >
      {'svgo --help'}
    </text>
  </svg>
)
export default SvgComponent
"
`;

When SVGO serializing the font-family property back to a string, it uses XML entities to escape the quotes, so:

SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace becomes SFMono-Regular,Menlo,Monaco,Consolas,&quot;Liberation Mono&quot;,&quot;Courier New&quot;,monospace

Then plugin-jsx erroneously splits the style attribute by ;, which splits the style in the middle of property values as XML entities contain a semicolon too.

Expected behavior

Semicolons that are part of an XML entity should be treated as a delimiter/separator for style properties.

The best solution would probably be to use an actual CSS parser. If that's not favorable for whatever reason, then there should be a splitStyles method defined in packages/hast-util-to-babel-ast/src/stringToObjectStyle.ts which goes character by character through the raw styles to avoid XML entities.

I'd be happy to look at this for you if you can confirm if you'd prefer to use a CSS parser, or a DIY solution?

Run npx envinfo --system --binaries --npmPackages @svgr/core,@svgr/cli,@svgr/webpack,@svgr/rollup --markdown --clipboard

## System:
 - OS: Linux 6.6 Debian GNU/Linux trixie/sid
 - CPU: (16) x64 13th Gen Intel(R) Core(TM) i7-1360P
 - Memory: 16.61 GB / 62.65 GB
 - Container: Yes
 - Shell: 5.2.21 - /bin/bash
## Binaries:
 - Node: 20.11.0 - ~/.asdf/installs/nodejs/20.11.0/bin/node
 - Yarn: 1.22.22 - ~/.asdf/installs/nodejs/20.11.0/bin/yarn
 - npm: 10.7.0 - ~/.asdf/plugins/nodejs/shims/npm
 - pnpm: 8.15.4 - ~/.asdf/installs/nodejs/20.11.0/bin/pnpm
SethFalco commented 4 months ago

I've written a version of the proposed splitStyles method that doesn't use a CSS parser. This should work, or at least will be better than what we have now. Happy to open a PR, but hoping to get clarity on if you'd prefer to take the CSS parser or DIY route first.

/**
 * Split raw styles into separate properties.
 */
const splitStyles = (rawStyle: string) => {
  const entries = []
  let currentEntry = ''

  for (const char of rawStyle) {
    if (char !== ';' || /&(?:[a-zA-Z]+|#\d+)(?<!&semi|&#59)$/.test(currentEntry)) {
      currentEntry += char
      continue
    }

    entries.push(currentEntry)
    currentEntry = ''
  }

  if (currentEntry.length !== 0) {
    entries.push(currentEntry)
  }

  return entries
}

Usage would be to change:

-   const entries = rawStyle.split(';')
+   const entries = splitStyles(rawStyle)