egonSchiele / HandsomeSoup

Easy HTML parsing for Haskell
http://egonschiele.github.com/HandsomeSoup
BSD 3-Clause "New" or "Revised" License

Add support for more selectors #19

Open cstrahan opened 10 years ago

cstrahan commented 10 years ago

Hello! As part of my work on Happybara, I've implemented a CSS selector parser and a converter from CSS to XPath:

https://github.com/cstrahan/happybara/blob/to-query/happybara/src/Happybara/XPath.hs#L40
https://github.com/cstrahan/happybara/blob/to-query/happybara/src/Happybara/CSS.hs

If you're interested, I'd like to fold my implementation into HandsomeSoup.

If you wouldn't mind taking a dependency on hxt-xpath, I think we should be able to use my css-to-xpath conversion directly, using getXPathTreesInDoc.
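To make the idea concrete, here is a minimal sketch of what a CSS-to-XPath conversion can look like. It is not taken from the linked Happybara code: the `Simple`, `Step`, and `Selector` types and the `toXPath` function are hypothetical names, and only tags, classes, ids, and the descendant/child combinators are covered. The resulting string is the kind of value that would be handed to getXPathTreesInDoc.

```haskell
-- Hypothetical, deliberately tiny subset of CSS (illustration only).
data Simple
  = Tag String    -- e.g. div
  | Class String  -- e.g. .note
  | Id String     -- e.g. #main
  deriving (Show, Eq)

-- How a compound selector relates to the one before it.
data Step = DescendantOf | ChildOf
  deriving (Show, Eq)

-- A compound selector is a list of simple selectors on one element.
type Compound = [Simple]

-- First compound, then zero or more (combinator, compound) pairs.
data Selector = Selector Compound [(Step, Compound)]
  deriving (Show, Eq)

-- Render one compound selector as an XPath node test plus predicates.
-- Note: values are not escaped, so names containing quotes would break.
compoundToXPath :: Compound -> String
compoundToXPath parts = name ++ concatMap predicate parts
  where
    -- Use the first tag if present, otherwise match any element.
    name = case [t | Tag t <- parts] of
      (t : _) -> t
      []      -> "*"
    predicate (Tag _)   = ""
    predicate (Id i)    = "[@id='" ++ i ++ "']"
    -- Pad @class with spaces so ".note" does not match "footnote".
    predicate (Class c) =
      "[contains(concat(' ', normalize-space(@class), ' '), ' " ++ c ++ " ')]"

-- Render a whole selector, searching from anywhere in the document.
toXPath :: Selector -> String
toXPath (Selector first rest) =
  "//" ++ compoundToXPath first ++ concatMap step rest
  where
    step (DescendantOf, c) = "//" ++ compoundToXPath c
    step (ChildOf, c)      = "/" ++ compoundToXPath c
```

For example, `toXPath (Selector [Id "main"] [(DescendantOf, [Tag "p"])])` evaluates to `"//*[@id='main']//p"`, which corresponds to the CSS selector `#main p`.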

Thoughts?

egonSchiele commented 10 years ago

Give me a little time to read over the code, but this sounds like a great idea. As an aside, happybara sounds cool.

cstrahan commented 10 years ago

Thanks!

In the next day or two, I'm going to write some fuzzing tests. The plan for the tests:

So, while there aren't any tests now, there will be pretty soon :).

Another idea: if we merge my implementation (or parts thereof), it might make sense to emit an Expr instead of a string. That would be ideal for two reasons:

Users could then render the Expr with formatXPathTree, if they wanted to (or we could provide a function for convenience).
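As a rough illustration of the structured-output idea, the sketch below uses a small hypothetical AST (`Path`, `PathStep`, `Pred`) as a stand-in for hxt-xpath's `Expr`, which is richer than this. `render` plays the role that formatXPathTree would play for a real `Expr`; `restrictLast` is an invented helper showing that a structured value can be adjusted before it is rendered, which is awkward to do on a plain string.

```haskell
-- Hypothetical stand-in for an XPath AST (not hxt-xpath's real Expr).
data Axis = ChildAxis | DescendantAxis
  deriving (Show, Eq)

-- Only attribute-equality predicates are modelled here.
data Pred = AttrEquals String String
  deriving (Show, Eq)

-- One location step: axis, element name, predicates.
data PathStep = PathStep Axis String [Pred]
  deriving (Show, Eq)

newtype Path = Path [PathStep]
  deriving (Show, Eq)

-- Render the AST to an XPath string (values are not escaped).
render :: Path -> String
render (Path steps) = concatMap renderStep steps
  where
    renderStep (PathStep axis name preds) =
      separator axis ++ name ++ concatMap renderPred preds
    separator ChildAxis      = "/"
    separator DescendantAxis = "//"
    renderPred (AttrEquals k v) = "[@" ++ k ++ "='" ++ v ++ "']"

-- Add a predicate to the final step; an empty path is left unchanged.
restrictLast :: Pred -> Path -> Path
restrictLast p (Path steps) =
  case reverse steps of
    [] -> Path []
    (PathStep axis name preds : earlier) ->
      Path (reverse earlier ++ [PathStep axis name (preds ++ [p])])

-- Example value: anchors that are direct children of div#main.
mainLinks :: Path
mainLinks =
  Path
    [ PathStep DescendantAxis "div" [AttrEquals "id" "main"]
    , PathStep ChildAxis "a" []
    ]
```

Here `render mainLinks` gives `"//div[@id='main']/a"`, and `render (restrictLast (AttrEquals "rel" "nofollow") mainLinks)` gives `"//div[@id='main']/a[@rel='nofollow']"`.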

I'm going to wrap up the test suite, and then I'll hack on a PR for you to review.

egonSchiele commented 9 years ago

Hmm, did anything happen with this? Sorry it took me so long to get to it, and I don't see the changes anymore.

cstrahan commented 9 years ago

@egonSchiele No worries; I got caught up in other things. I'd still like to take a stab at this, but it will probably be a couple weeks before I can dive back in.

egonSchiele commented 9 years ago

It's fine...I have a lot of other stuff on my plate right now anyway :)