andybalholm / cascadia

CSS selector library in Go
BSD 2-Clause "Simplified" License
703 stars, 65 forks

Case-insensitive selectors without using regex under the hood #49

Closed rvics closed 3 years ago

rvics commented 3 years ago

It would be nice to add case-insensitive attribute matching. Currently this is only possible through a regex, which performs poorly. Instead, it would be possible to add the CSS4 syntax (https://css4-selectors.com/selector/css4/attribute-case-sensitivity/). Under the hood, you could check equality with Go's strings.EqualFold function, which performs much better than a regex.

For example, these tags

<div class="Red">
<div class="red">

can both be found by the div[class="red" i] selector.
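The matching rule proposed above can be sketched roughly as follows. This is a minimal illustration, not cascadia's actual code: attrEquals and its parameters are hypothetical names, and only the = operator is covered. The one real library call is strings.EqualFold from the Go standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// attrEquals is a hypothetical helper (not part of cascadia). It reports
// whether an element's attribute value matches the value in the selector.
// When insensitive is true, it mimics the CSS4 "i" flag.
func attrEquals(attrVal, selVal string, insensitive bool) bool {
	if insensitive {
		// EqualFold compares the two strings under Unicode case folding,
		// so no regular expression has to be compiled or run.
		return strings.EqualFold(attrVal, selVal)
	}
	return attrVal == selVal
}

func main() {
	// Compare a few class values against the selector value "red",
	// first case-sensitively, then with the "i" flag semantics.
	for _, class := range []string{"Red", "red", "blue"} {
		fmt.Printf("class=%q sensitive=%v insensitive=%v\n",
			class,
			attrEquals(class, "red", false),
			attrEquals(class, "red", true))
	}
}
```

With this rule, both class="Red" and class="red" match when the flag is set, while only class="red" matches without it.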

andybalholm commented 3 years ago

Yes, that sounds like a good idea. But I don't have time to implement it right now. Pull requests to implement CSS4 features are welcome.

kinoute commented 3 years ago

@andybalholm I'm trying to implement the ignore-case feature, but I'm struggling to fix the serialize tests (it's my first Go PR, so excuse my terrible code).

Here is the commit: https://github.com/kinoute/cascadia/commit/6d042b53a9e6ed91c1a95de3ab82aa0173431c31

For now it is implemented only for the = attribute operator. When running the tests, I get:

 serialize_test.go:34: can't retrieve selector from serialized : 
address[title="FoOIgnoRECaSe"] 
(original : address[title="FoOIgnoRECaSe" i], 
sel : cascadia.SelectorGroup{cascadia.compoundSelector{selectors:[]cascadia.Sel{cascadia.tagSelector{tag:"address"}, cascadia.attrSelector{key:"title", val:"FoOIgnoRECaSe", operation:"=", regexp:(*regexp.Regexp)(nil), insensitive:true}}, pseudoElement:""}})
--- FAIL: TestSerialize (0.00s)

Maybe my approach is wrong? It looks like the serialize test takes the CSS selector from selectorTests without modifying it, so the i stays in the original attribute selector and doesn't match the serialized output.

andybalholm commented 3 years ago

I think you need to edit the attrSelector.String method. It needs to output the i when insensitive is true.
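As a rough sketch of that suggestion: the struct below copies only some of the field names visible in the test output above (key, val, operation, insensitive), and the String method is a simplified assumption about what the fix might look like, not the code from the linked commits. Escaping of the value is deliberately left out.

```go
package main

import "fmt"

// attrSelector is a simplified stand-in for the type shown in the failing
// test output. Only the = style of serialization is sketched here.
type attrSelector struct {
	key, val, operation string
	insensitive         bool
}

// String serializes the selector back to CSS. The important part is that
// the " i" flag is written out whenever insensitive is true, so that
// parsing the serialized form yields an equivalent selector again.
// Note: val is quoted naively; real code would need CSS escaping.
func (s attrSelector) String() string {
	out := "[" + s.key + s.operation + "\"" + s.val + "\""
	if s.insensitive {
		out += " i"
	}
	return out + "]"
}

func main() {
	sel := attrSelector{key: "title", val: "FoOIgnoRECaSe", operation: "=", insensitive: true}
	// fmt uses the String method because attrSelector satisfies fmt.Stringer.
	fmt.Println("address" + sel.String())
}
```

Without the flag in the serialized form, the round trip in the serialize test loses the insensitive setting, which is consistent with the failure reported above.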

kinoute commented 3 years ago

@andybalholm Thanks! It did fix the issue. All tests pass now. Here is the commit https://github.com/kinoute/cascadia/commit/2615edc12dffe1cd70d090dd5f68060d16994a04

I will try to work on the other CSS attribute selectors, add tests, and improve my code.

andybalholm commented 3 years ago

Fixed by https://github.com/andybalholm/cascadia/commit/93fe7c5e7dc57e06e2c913c07571733300ce9cca