gawel / pyquery

A jquery-like library for python
http://pyquery.rtfd.org/

Got the wrong HTML content, which is not the same as the original content #72

Open a-whitej opened 10 years ago

a-whitej commented 10 years ago

from pyquery import PyQuery as pq
from lxml import etree
import urllib2

url_location='http://v.163.com/special/opencourse/machinelearning.html'

d = pq(url=url_location)

response = urllib2.urlopen(url_location)
html = response.read()
d = pq(html)
print d.html()

We got the wrong HTML content, which is not the same as the original content at http://v.163.com/special/opencourse/machinelearning.html

gawel commented 10 years ago

I guess that's because it doesn't use the same parser. See https://github.com/gawel/pyquery/blob/master/pyquery/pyquery.py#L195

pq(url=url, parser='xml') should do what you want