utkarshkukreti / select.rs

A Rust library to extract useful data from HTML documents, suitable for web scraping.
MIT License

Avoid recursion when parsing documents #78

Open walruscow opened 2 months ago

walruscow commented 2 months ago

Deeply nested documents can cause a stack overflow because they are parsed recursively. Avoid that by using a heap-allocated stack instead of the call stack.
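For readers unfamiliar with the technique, here is a minimal sketch of a depth-first walk driven by a heap-allocated stack. It is not the code from this PR: `Node`, `chain`, and `max_depth` are hypothetical names, and the flat arena layout (children referenced by index) is an assumption made for illustration.

```rust
/// Hypothetical arena node (an assumption, not the library's type):
/// children are indices into a flat `Vec<Node>`.
struct Node {
    children: Vec<usize>,
}

/// Build a degenerate tree where each node has exactly one child,
/// i.e. a document nested `len` levels deep.
fn chain(len: usize) -> Vec<Node> {
    (0..len)
        .map(|i| Node {
            children: if i + 1 < len { vec![i + 1] } else { Vec::new() },
        })
        .collect()
}

/// Walk the tree depth-first without recursion and return the deepest level seen.
/// The pending work lives in a `Vec` on the heap, so nesting depth is bounded
/// by available memory rather than by the size of the call stack.
fn max_depth(nodes: &[Node], root: usize) -> usize {
    let mut stack = vec![(root, 1usize)];
    let mut deepest = 0;
    while let Some((index, depth)) = stack.pop() {
        deepest = deepest.max(depth);
        // Push children in reverse so the first child is popped first,
        // which keeps the visiting order the same as a recursive pre-order walk.
        for &child in nodes[index].children.iter().rev() {
            stack.push((child, depth + 1));
        }
    }
    deepest
}
```

Calling `max_depth(&chain(1_000_000), 0)` returns `1_000_000`, a depth at which a naive recursive walk would typically exhaust a default-sized thread stack.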

This includes a first commit that infers the sibling index from the parent, which makes the iterative approach simpler.
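One way to read "infer the sibling index from the parent" is sketched below: the previous sibling of a newly appended node is whatever the parent currently records as its last child, so the caller does not need to carry that index along. This is an interpretation, not the PR's code; the `Node` fields and the `append` helper are hypothetical.

```rust
/// Hypothetical arena node; the field names are assumptions for illustration.
#[derive(Debug, Default)]
struct Node {
    parent: Option<usize>,
    prev: Option<usize>,
    next: Option<usize>,
    first_child: Option<usize>,
    last_child: Option<usize>,
}

/// Append a new node under `parent` and return its index.
/// The previous sibling is looked up from the parent instead of being
/// passed in, so the caller only needs to track the current parent.
fn append(nodes: &mut Vec<Node>, parent: Option<usize>) -> usize {
    let index = nodes.len();
    // Infer the previous sibling: it is the parent's current last child, if any.
    let prev = parent.and_then(|p| nodes[p].last_child);
    nodes.push(Node {
        parent,
        prev,
        ..Node::default()
    });
    if let Some(p) = parent {
        if nodes[p].first_child.is_none() {
            nodes[p].first_child = Some(index);
        }
        nodes[p].last_child = Some(index);
    }
    // Link the old last child forward to the new node.
    if let Some(sibling) = prev {
        nodes[sibling].next = Some(index);
    }
    index
}
```

With this shape, an iterative builder only has to keep the index of the parent for each pending item on its stack; the sibling links follow from the arena itself.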

This would resolve issues #66 and (arguably) #68.

No maximum-depth option is added here, so a sufficiently deep document can still exhaust memory (OOM), but much larger documents are supported than before.