gawel / pyquery

A jquery-like library for python
http://pyquery.rtfd.org/

PyQuery doesn't recognize a and img but jQuery does #122

Closed capital-G closed 8 years ago

capital-G commented 8 years ago

Hi there,

I tried to manipulate this DOM:

<a xmlns="http://www.w3.org/1999/xhtml" ui-sref="app.vod({id : vod.id})" href="/vod/20150417-041"><img ng-src="http://cdn.ruptly.tv/secure/vod/20150417-041/20150417-041_thumbnail.jpg?e22c5e1cc455327554498fe5df4b53c8fe22a23071c2bed6a628c5bd51e138e255df45fc284be0e611067d66984b6c928288154ab6b81180ef2e1ccc5ed383ef8961d4fad7001760c667bea36548c146b1d6ae8146071c9d52cca14cfc96251f8bb79693d3" src="http://cdn.ruptly.tv/secure/vod/20150417-041/20150417-041_thumbnail.jpg?e22c5e1cc455327554498fe5df4b53c8fe22a23071c2bed6a628c5bd51e138e255df45fc284be0e611067d66984b6c928288154ab6b81180ef2e1ccc5ed383ef8961d4fad7001760c667bea36548c146b1d6ae8146071c9d52cca14cfc96251f8bb79693d3"/><span class="duration ng-binding">1:37</span>\n            <p class="title ng-binding">Brazil: It\'s a dog\'s life at canine kindergarten in Sao Paulo</p><span class="published ng-binding">2015-04-17 13:03 (GMT)</span><!-- ngIf: vm.isSubscriberOnly(vod) --><!-- ngIf: vm.isFree(vod) --></a>

In PyQuery (from pyquery import PyQuery as pq) I don't get proper results:

>>> pq(string)('a').attr('href')  # returns nothing
>>> pq(string)('img').attr('src')  # returns nothing
>>> pq(string)('.title').text()
"Brazil: It's a dog's life at canine kindergarten in Sao Paulo"

But jQuery returns everything properly: https://jsfiddle.net/d500pqz2/

console.log($('a').attr('href'));  // /vod/20150417-041
console.log($('img').attr('src'));  // http://cdn.ruptly.tv/secure/vod/20150417-041/20150417-041_thumbnail.jpg?e22…8961d4fad7001760c667bea36548c146b1d6ae8146071c9d52cca14cfc96251f8bb79693d3
console.log($('.title').text());  // Brazil: It\'s a dog\'s life at canine kindergarten in Sao Paulo

I hope this is enough information to help.

gawel commented 8 years ago

You need to remove the namespaces: http://pythonhosted.org//pyquery/api.html?highlight=namespace#pyquery.pyquery.PyQuery.remove_namespaces

capital-G commented 8 years ago

Oh, I didn't know about that; I had never heard of HTML namespaces. Is there a reason it behaves differently from jQuery?

gawel commented 8 years ago

Just because lxml takes namespaces into account, even though they are useless for HTML.
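To illustrate the mechanism, here is a rough sketch using the standard library's `xml.etree.ElementTree` rather than lxml itself (a deliberate swap so it runs without extra packages; lxml follows the same ElementTree convention for namespaced tags). The markup and variable names are illustrative assumptions, not taken from the report:

```python
import xml.etree.ElementTree as ET

XHTML_NS = 'http://www.w3.org/1999/xhtml'

# Hypothetical, shortened markup carrying the same default namespace as the report.
snippet = (
    '<a xmlns="http://www.w3.org/1999/xhtml" href="/vod/20150417-041">'
    '<img src="thumbnail.jpg"/>'
    '</a>'
)

root = ET.fromstring(snippet)

# The parser stores the namespace as part of the tag name,
# so the root is "{namespace}a", not plain "a".
root_tag = root.tag

# A lookup by the bare tag name therefore finds nothing ...
plain_lookup = root.find('img')

# ... while the namespace-qualified name matches.
qualified_lookup = root.find('{%s}img' % XHTML_NS)

print(root_tag, plain_lookup, qualified_lookup.get('src'))
```

A selector such as `a` or `img` is matched against the bare tag name, which is why it comes back empty until the namespace is stripped, whereas a class selector like `.title` does not depend on the tag name at all.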