gawel / pyquery

A jquery-like library for python
http://pyquery.rtfd.org/

Erroneous html parsing #27

Open karolyi opened 11 years ago

karolyi commented 11 years ago

Hi, when I try to parse an HTML string with an image in it, the text after the image element is appended to the original image tag. Tested in the Python command line:

>>> import pyquery
>>> a = pyquery.PyQuery('<p>asdasd<img src="a/b">asdasdddddddd</p>')
>>> print a
<p>asdasd<img src="a/b"/>asdasdddddddd</p>
>>> a.find('img')
[<img>]
>>> print a.find('img')
<img src="a/b"/>asdasdddddddd

The parser behind pyquery is lxml 3.1.0.

Suggestions?

gawel commented 10 years ago

No. It looks like this is not so easy to solve.
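
For context, the output in the report is consistent with the ElementTree data model that lxml follows: text that comes after an element is stored on that element as its tail, so serializing the element also emits the trailing text. Below is a minimal sketch of that behavior, assuming the standard library's xml.etree.ElementTree as a stand-in for lxml (it shares the same tail model) and a well-formed XML version of the snippet. The helper serialize_without_tail is hypothetical and is not part of pyquery or lxml.

```python
import copy
import xml.etree.ElementTree as ET

# Well-formed XML version of the snippet from the report (img is self-closed).
root = ET.fromstring('<p>asdasd<img src="a/b"/>asdasdddddddd</p>')
img = root.find('img')

# The text after <img> is stored on the img element itself, as its tail.
print(repr(img.tail))  # 'asdasdddddddd'

# Serializing the element also writes out its tail.
with_tail = ET.tostring(img, encoding='unicode')
print(with_tail)


def serialize_without_tail(element):
    """Hypothetical helper: serialize a copy of the element with no tail."""
    clone = copy.deepcopy(element)  # leave the original tree untouched
    clone.tail = None
    return ET.tostring(clone, encoding='unicode')


without_tail = serialize_without_tail(img)
print(without_tail)
```

Working on a copy keeps the trailing text in the original tree, so printing the parent still gives the full paragraph. With lxml itself, lxml.etree.tostring accepts a with_tail argument that should make the copy unnecessary. Whether pyquery's own string output can be changed this way is not confirmed anywhere in this thread.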