James-LG / Skyscraper

Rust library for scraping HTML using XPath expressions
MIT License
30 stars 3 forks

Check if search index out of bounds #21

Closed: masc-it closed this 6 months ago

masc-it commented 7 months ago

Problem

Suppose I search for strong[1] under a parent node //div, but some of the divs have no strong child. As the search is implemented right now, the code simply panics, because it does not handle out-of-bounds indexing.

Solution

Just add a simple check on the search index. I didn't fix it at a deeper level, in DocumentNodeSet, since raw indexing is used heavily in many places and changing it there would require more effort.
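To illustrate the kind of check described above, here is a minimal sketch, not the code from this pull request. The helper name nth_match is hypothetical and is not part of Skyscraper's API. The sketch assumes the matches are held in a slice and that the index is 1-based, as XPath positions are.

```rust
/// Returns the `index`-th match (1-based, as in XPath), or `None` when the
/// index is out of bounds.
///
/// `nth_match` is a hypothetical helper for illustration only; it is not
/// part of Skyscraper's API.
fn nth_match<T>(matches: &[T], index: usize) -> Option<&T> {
    // XPath positions start at 1, so 0 is never a valid position.
    // `checked_sub` yields `None` for 0 instead of underflowing.
    let zero_based = index.checked_sub(1)?;
    // `slice::get` returns `None` where `matches[zero_based]` would panic.
    matches.get(zero_based)
}
```

With this shape, a div that has no strong child yields None for strong[1] instead of a panic, and the caller can skip that div.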

masc-it commented 7 months ago

@James-LG that's awesome news, thanks for your effort!

Over the last week I did a deep dive into the current main branch and noticed a few things, which you may already be addressing in the new version:

BTW, I'll take a look at the new nom branch, happy to help if needed :)

James-LG commented 6 months ago

> BTW, I'll take a look at the new nom branch, happy to help if needed :)

I'd like to get the basic use cases working first, so the structure is a bit more settled than it currently is, but after that, help with things like the contains function would be great! XPath is pretty huge, so there's plenty of work that can be done in parallel once the basics are in place.

masc-it commented 6 months ago

Clear! Do you have a roadmap in place? Or a Discord channel for posting updates?

James-LG commented 6 months ago

Since GitHub apparently doesn't have direct messaging, I created a brand new Discord channel: https://discord.gg/jWK42bWK

As for a roadmap, I don't have anything formal. Vaguely, it will be getting the basic steps / and // working (including initial occurrences, which behave differently), then filtered expressions like /div[@class='hi'], and going from there.
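For readers unfamiliar with these steps, the sketch below shows the difference on a toy tree. It is an illustration under stated assumptions, not the design of the nom branch. The Node type and the functions node, child_step, descendant_step and with_class are all hypothetical names. Both steps are evaluated relative to a given node, so the special behaviour of an initial / or // is not modelled.

```rust
/// Toy element node; a hypothetical stand-in for a parsed HTML tree.
struct Node {
    tag: &'static str,
    class: Option<&'static str>,
    children: Vec<Node>,
}

/// Small constructor to keep tree literals short.
fn node(tag: &'static str, class: Option<&'static str>, children: Vec<Node>) -> Node {
    Node { tag, class, children }
}

/// Child step (`/tag` relative to `parent`): direct children only.
fn child_step<'a>(parent: &'a Node, tag: &str) -> Vec<&'a Node> {
    parent.children.iter().filter(|c| c.tag == tag).collect()
}

/// Descendant step (`//tag` relative to `parent`): matches at any depth,
/// collected in document (pre-order) order.
fn descendant_step<'a>(parent: &'a Node, tag: &str) -> Vec<&'a Node> {
    let mut found = Vec::new();
    for child in &parent.children {
        if child.tag == tag {
            found.push(child);
        }
        // Recurse so that nested matches are found as well.
        found.extend(descendant_step(child, tag));
    }
    found
}

/// Attribute filter, in the spirit of `[@class='hi']`.
fn with_class<'a>(nodes: Vec<&'a Node>, class: &str) -> Vec<&'a Node> {
    nodes
        .into_iter()
        .filter(|n| n.class.map_or(false, |c| c == class))
        .collect()
}
```

In this sketch, a filtered expression is a step followed by a filter over its result, for example with_class(descendant_step(&root, "div"), "hi").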