philss / floki

Floki is a simple HTML parser that enables search for nodes using CSS selectors.
https://hex.pm/packages/floki
MIT License

Feature/Idea: support case-insensitive CSS selectors #350

Closed: fireproofsocks closed this issue 2 years ago

fireproofsocks commented 3 years ago

It would be nice if it were possible to do case-insensitive CSS selection.

For example, given some HTML like the following:

<meta name="ROBOTS" content="INDEX, FOLLOW, NOIMAGEINDEX"/>

the following find operation would fail:

Floki.find(ast, "meta[name=\"robots\"]")

whereas the following find operation would succeed:

Floki.find(ast, "meta[name=\"ROBOTS\"]")

An implementation of case-insensitive selectors would allow the i flag:

Floki.find(ast, "meta[name=\"robots\" i]")

and this would match any capitalization of ROBOTS.
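Until something like the i flag is available, one possible workaround is to do the comparison by hand. The following is only a rough sketch: `CaseInsensitiveAttr` is a hypothetical helper, not part of Floki, and it assumes the parsed tree uses `{tag, attrs, children}` tuples with attributes stored as a list of `{name, value}` pairs.

```elixir
defmodule CaseInsensitiveAttr do
  # Hypothetical helper (not a Floki API): collects every node with the
  # given tag whose attribute `name` equals `value`, ignoring case in the
  # attribute value only.
  def find(nodes, tag, name, value) when is_list(nodes) do
    Enum.flat_map(nodes, &find(&1, tag, name, value))
  end

  def find({node_tag, attrs, children} = node, tag, name, value)
      when is_binary(node_tag) and is_list(children) do
    matches_here =
      node_tag == tag and
        Enum.any?(attrs, fn {k, v} ->
          k == name and String.downcase(v) == String.downcase(value)
        end)

    # Keep searching below this node so nested matches are found too.
    rest = find(children, tag, name, value)
    if matches_here, do: [node | rest], else: rest
  end

  # Text nodes, comments and anything else are skipped.
  def find(_other, _tag, _name, _value), do: []
end

# Hand-built tree standing in for the parsed HTML from the example above.
ast = [
  {"html", [],
   [
     {"head", [],
      [{"meta", [{"name", "ROBOTS"}, {"content", "INDEX, FOLLOW, NOIMAGEINDEX"}], []}]}
   ]}
]

CaseInsensitiveAttr.find(ast, "meta", "name", "robots")
```

This only covers exact attribute matches on a single tag, which is why support inside the selector syntax itself would be preferable.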

For some discussion on case-insensitive CSS selectors: https://css-tricks.com/attribute-selectors/

fcapovilla commented 2 years ago

I've created a pull request implementing this feature. The lexer now detects the "i" flag when it is preceded by a space and followed by a ] inside an attribute selector. When the "i" flag is present, the attribute selector matches case-insensitively. The "s" flag is also detected, but it simply keeps the default case-sensitive matching.
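To illustrate the rule described above, here is a minimal standalone sketch. It is not the code from the pull request and not Floki's lexer: `AttrSelectorSketch`, `parse/1` and `value_matches?/2` are hypothetical names, and only the exact-match operator (`=`) with a double-quoted value is handled.

```elixir
defmodule AttrSelectorSketch do
  # Hypothetical sketch: accepts [name="value"], [name="value" i] and
  # [name="value" s]. The flag must be preceded by whitespace and
  # followed directly by the closing bracket.
  defp pattern do
    ~r/^\[(?<name>[\w-]+)="(?<value>[^"]*)"(?:\s+(?<flag>[is]))?\]$/
  end

  def parse(selector) do
    case Regex.named_captures(pattern(), selector) do
      nil ->
        :error

      %{"name" => name, "value" => value, "flag" => flag} ->
        # "s" and a missing flag both keep the default case-sensitive match.
        {:ok, %{name: name, value: value, case_insensitive: flag == "i"}}
    end
  end

  # Case-insensitive comparison when the "i" flag was seen.
  def value_matches?(%{value: expected, case_insensitive: true}, actual) do
    String.downcase(expected) == String.downcase(actual)
  end

  # Default: exact, case-sensitive comparison.
  def value_matches?(%{value: expected}, actual), do: expected == actual
end

{:ok, selector} = AttrSelectorSketch.parse(~s([name="robots" i]))
AttrSelectorSketch.value_matches?(selector, "ROBOTS")
```

In this sketch a selector such as `[name="robots"i]`, with no space before the flag, is rejected, mirroring the "preceded by a space" condition described in the comment.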

I've also added tests for all match operators.