kripod / otion

Atomic CSS-in-JS with a featherweight runtime
MIT License
631 stars 24 forks source link

Incorrect selector splitting & broken styles #45

Closed eddyw closed 4 years ago

eddyw commented 4 years ago

Description

When a selector (in selectors css prop) contains a , (COMMA), it may be split incorrectly.

Reproduction & Expected behavior

For instance:

  1. within attribute selector & quoted

    css({
    selectors: {
    '&:[data-list="a,b"]': { color: "green" },
    },
    })

    It generates:

    ._cd2tw1:[data-list="a{color:green}
    ._1xk4x6tb"]{color:green}
  2. With escaped COMMA (,) in attribute selector:

    css({
    selectors: {
    '&:[data-list=a\\,b]': { color: "green" },
    },
    })

    It generates:

    ._5g05z7:[data-list=a\{color:green}
    ._7eu5q1b]{color:green}._3sn2xs{color:hotpink}

Expected behavior

A selector group should be split correctly. Escaped values are valid in selectors, so a COMMA (,) may appear outside of a quoted value, e.g:

.c[data-value=a\,b] {
  color: red;
}

In:

<h1 class="c" data-value="a,b">
  This is an example
</h1>

Actual behavior

It is split as rule.split(',') which incorrectly splits valid group selectors

kripod commented 4 years ago

Thank you for the detailed report, I'll look into how this behavior should be improved.

@all-contributors please add @eddyw for reporting a bug!

allcontributors[bot] commented 4 years ago

@kripod

I've put up a pull request to add @eddyw! :tada:

kripod commented 4 years ago

I've been working on this with moderate success. I would appreciate if you could help in eliminating false matches by modifying the linked regular expression.

References for the values accepted by an attribute selector: <ident-token>, <string-token>

eddyw commented 4 years ago

@kripod I'm not quite sure how to achieve this with a single RegExp 🤔 In any case, Otion doesn't need to validate if a selector is valid or not, it just needs to know when to split a string.

From CSS syntax, these 👇 are known to create groups, so anything within a group shouldn't be split (regardless if it's valid or not) + escaped \, shouldn't be used as split index:

const MATCHES = [
  /\"([^\"]*)\"?/g, // R_STR_DBL
  /\'([^\']*)\'?/g, // R_STR_SGL
  /\[([^\]]*)\]?/g, // R_SQUARE
  /\(([^\)]*)\)?/g, // R_PARENT
  /\{([^\}]*)\}?/g, // R_CURLY
  /\\,/, // R_ESCAPED_COMMA
];
function toPlaceholder(m) {
  return "_".repeat(m.length);
}
function getSelectors(rawInput) {
  let input = rawInput;

  MATCHES.forEach((r) => (input = input.replace(r, toPlaceholder)));

  const selectors = input.split(',')

  for (let i = 0, lastIndex = 0; i < selectors.length; i++) {
    const current = selectors[i]
    selectors[i] = rawInput.substr(lastIndex, current.length).trim()
    // Here you can validate selectors[i][0] === '&' or not
    lastIndex = lastIndex + current.length + 1;
  }

  return selectors
}

I know it's ugly lol but it works and maybe it could be improved somehow 🤷‍♂️ Since it doesn't validate, there are many ways of breaking this. For example, a CSS comment could be used as selector &:hover /* and it'd break everything. From w3c CSS syntax, the first thing the tokenizer does is consume comments and return nothing. It's fairly simple to remove CSS comments with RegExp. Besides that, some selectors may be valid in some browsers but invalid in others (e.g: IE11), so this just assumes you know how to define selectors and you just want to split them separating them by COMMA (,).

If a selectors parser and validator is within the scope of otion to implement, then I'd suggest to have a look at:

  1. https://github.com/css-modules/css-selector-tokenizer
  2. https://github.com/fb55/css-what

Those are the only 2 projects I know that are up to date and seem to follow w3c specs (CSS syntax tokenizer & parser). However, I don't know of any CSS-in-JS lib .. or any library that actually does validation of CSS selectors 🤷‍♀️ . All assume you know what you're doing 😅

kripod commented 4 years ago

Thank you for these well-detailed insights! At first, I was thinking about providing a fully-featured solution for this. However, keeping the small bundle size intact, I would still prefer a RegExp which only matches commas encapsulated by attribute selectors (e.g. &[data-list="a,b"]) or pseudo-class functions (e.g. & :is(h1,h2,h3)). I'm not quite sure whether that's possible with just RegExp, though.

kripod commented 4 years ago

Interesting work about recursive regular expressions in JS has been done by Steven Levithan. The article might help with implementing an efficient solution.

kripod commented 4 years ago

Parsing is out of this project's scope, so we should consider this to be mostly resolved with RegExp. I've expanded the docs, linking to this issue as a reference.