Closed by Scarfmonster 2 years ago
I can confirm this issue in Parsel 1.6.0.
Apparently this is a byproduct of how lxml stores text: the text that follows an element is stored as part of that element (its `tail`), so removing the element also removes the text. I tried mitigating this in PR #207.
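To illustrate the tail behaviour, here is a minimal sketch. It uses the standard library's `xml.etree.ElementTree` as a stand-in, since it has the same `text`/`tail` model that lxml follows. The helper `remove_keep_tail` and the sample markup are hypothetical, written for illustration only. This is not the code from the PR and not Parsel's API.

```python
import xml.etree.ElementTree as ET

SAMPLE = "<div>text before <span>unwanted</span> text after</div>"  # made-up markup


def remove_keep_tail(parent, child):
    """Remove child from parent, but keep the text that follows it (hypothetical helper)."""
    tail = child.tail or ""
    children = list(parent)
    index = children.index(child)
    if index > 0:
        # Append the tail to the previous sibling's tail.
        prev = children[index - 1]
        prev.tail = (prev.tail or "") + tail
    else:
        # No previous sibling: the tail joins the parent's leading text.
        parent.text = (parent.text or "") + tail
    parent.remove(child)


# The trailing text belongs to the span, not to the div.
root = ET.fromstring(SAMPLE)
span = root.find("span")
print(repr(span.tail))

# Naive removal drops the span together with its tail.
naive = ET.fromstring(SAMPLE)
naive.remove(naive.find("span"))
naive_result = ET.tostring(naive, encoding="unicode")
print(naive_result)

# Moving the tail first preserves the trailing text.
remove_keep_tail(root, span)
kept_result = ET.tostring(root, encoding="unicode")
print(kept_result)
```

The idea is simply to reattach the removed element's tail to whatever precedes it (the previous sibling's tail, or the parent's text) before dropping the element.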
I tried removing an element as a way to exclude some repeated text from a website. I used the following code:
results in:
I would expect only the span to be removed and the text after it to be left as-is, but the "text after" is always removed as well, either up to the next element or up to the end of the removed element's parent.