Closed rubbberrabbit closed 1 year ago
Please use the latest version of Parsel (1.7.0 at the moment) when reporting issues.
In any case, I cannot reproduce this issue either way with your Python version and Parsel 1.6 or 1.7:
$ python --version
Python 3.9.12
$ cat test.py
from parsel import Selector
text = "<html><body><h1>Hello, Parsel!</h1></body></html>"
selector = Selector(text=text)
print(selector.css('h1'))
$ pip install parsel==1.6.0
[…]
$ python test.py
[<Selector xpath='descendant-or-self::h1' data='<h1>Hello, Parsel!</h1>'>]
$ pip install parsel==1.7.0
[…]
$ python test.py
[<Selector xpath='descendant-or-self::h1' data='<h1>Hello, Parsel!</h1>'>]
Maybe it is your lxml version that causes the issue?
In general, it is advisable to try upgrading all your dependencies and see if the issue still reproduces, to make sure it has not already been fixed in a new version of some of the dependencies.
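As a rough sketch of how to check which versions are actually installed before and after upgrading, something like the following could be run in the same environment as the failing script. The helper name `installed_versions` is made up for this illustration, and the package list (parsel, lxml, cssselect, w3lib) is an assumption about which dependencies are relevant here; adjust it to your setup.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("parsel", "lxml", "cssselect", "w3lib")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None. The default package
    list is an assumption for illustration, not an official list.
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Including this output in a report makes it easier to tell whether a difference in lxml or another dependency, rather than Parsel itself, explains the behavior.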
I am using parsel.Selector to process my HTML file, but the result is unexpected, so I went through the Parsel API documentation to check whether I am misusing Parsel or XPath. However, I get a completely different result even with the first example, and I think that is why I get unexpected results when handling my HTML. The document I am referring to is https://parsel.readthedocs.io/en/latest/usage.html
the guidance shows the expected result is
but I get
The Selector returns all nodes after <h1> instead of what is inside <h1>. My Python version is 3.9.12 and my Parsel version is 1.6.0.