Open iamhmx opened 6 years ago
This issue affects me too. Try printing the first 200 characters of page_source, then remove the xmlns attribute from the <html> tag. In my case, I have to do this for CSS selectors to work while I am scraping Facebook WAP.
html = b.page_source.replace('<html xmlns="http://www.w3.org/1999/xhtml">', '<html>')
doc = pq(html)
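For anyone wondering why this helps, here is a minimal sketch of the mechanism as I understand it. It assumes the page was parsed as XML, so the xmlns declaration puts every element into the XHTML namespace and un-namespaced selectors stop matching. It uses the standard library's xml.etree.ElementTree as a stand-in for pyquery/lxml so it runs without a browser. The sample page_source string and the strip_default_xmlns helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS_ATTR = ' xmlns="http://www.w3.org/1999/xhtml"'

# Hypothetical stand-in for what b.page_source might return.
page_source = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><div id="main">hello</div></body></html>'
)


def strip_default_xmlns(html: str) -> str:
    """Remove the XHTML default-namespace declaration (first occurrence only)."""
    return html.replace(XHTML_NS_ATTR, '', 1)


# Inspect the start of the document to see whether <html> carries xmlns.
print(page_source[:200])

# Parsed as XML with the namespace: tags become '{namespace}name',
# so a plain 'div' lookup finds nothing.
namespaced = ET.fromstring(page_source)
print(namespaced.tag)
print(namespaced.find('.//div'))

# After stripping xmlns the same lookup succeeds.
plain = ET.fromstring(strip_default_xmlns(page_source))
print(plain.find('.//div').text)
```

If editing the string feels fragile, pyquery also takes a parser argument, so pq(html, parser='html') may be worth trying instead; I have not checked it against this particular page.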
When I use Selenium to get page_source and then find elements with pyquery, it does not work; but when I use doc = pq(url='https://xxxxx') directly, it works well. Code below. Part one:
Works well! Part two:
Does not work!