Closed GoogleCodeExporter closed 8 years ago
I can reproduce it: when I run the script I get a segmentation fault on the second
image. I was able to fix it temporarily by commenting out this line in browser.py:
"manager.setCookieJar(self.manager.cookieJar())". I am not sure whether I am doing
something wrong or it is simply a PyQt bug.
However, I must say: spynner is not the right tool for the task you are doing.
Spynner *may* be useful for AJAX-intensive sites, but for simple scraping
like this it's overkill. I recommend urllib2 + pyquery; you don't need Javascript
processing, do you?
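import urllib2
import pyquery
# Fetch the raw HTML of the listing page
url = "http://www.meinv86.com/meinv/xiaoyuanmeinvtupian/list_7_3.html"
html = urllib2.urlopen(url).read()
# Parse it so elements can be selected jQuery-style
dom = pyquery.PyQuery(html)
and so on.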
Original comment by tokland
on 9 Apr 2011 at 9:29
Thanks Arnau. For this example, using spynner might be overkill, but in most
cases I need to log in to some JS-backed site to download pictures, music,
etc., and there spynner seems to be a must. I also notice that spynner is very good
at downloading large files. Could you take a quick look at getting the
download function to work again, or do you have any suggestions for combining
urllib2 with spynner for the download jobs? I have recently been studying
Gevent; I don't know if you have any ideas about integrating spynner into a
multi-threading model. Thanks
Original comment by jackhome...@sina.com
on 10 Apr 2011 at 5:40
If you have authenticated sites, I'd still use urllib2, but with cookies.
Admittedly it's more work, because you have to figure out how to authenticate and
handle cookies, but in the long run the script is more robust.
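A minimal sketch of the urllib2-with-cookies approach, assuming the site uses an ordinary form login that sets a session cookie. The function names, URLs and form field names are hypothetical placeholders, not taken from the original script. The imports fall back to the Python 3 module names so the same code runs on either version.

```python
try:  # Python 2, as used in this thread
    import urllib2 as request
    import cookielib as cookiejar
    from urllib import urlencode
except ImportError:  # Python 3 names for the same modules
    import urllib.request as request
    import http.cookiejar as cookiejar
    from urllib.parse import urlencode


def make_session():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = cookiejar.CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    return opener, jar


def login(opener, login_url, fields):
    """POST the login form; cookies set by the server end up in the jar."""
    data = urlencode(fields).encode("utf-8")
    return opener.open(login_url, data).read()


def download(opener, url, path, chunk_size=64 * 1024):
    """Stream url to path in chunks so large files are not held in memory."""
    response = opener.open(url)
    with open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)


# Hypothetical usage; the URLs and field names are placeholders:
# opener, jar = make_session()
# login(opener, "http://example.com/login", {"user": "me", "password": "secret"})
# download(opener, "http://example.com/files/big.zip", "big.zip")
```

Every request made through the same opener reuses the cookies collected at login, which is the part a plain urllib2.urlopen call does not do by itself.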
Original comment by tokland
on 10 Apr 2011 at 8:30
Hi tokland, I seem to be having a possibly related problem when using
Browser.download(). It does not seem to be passing the cookies properly.
Original comment by hari...@gmail.com
on 13 Jul 2011 at 2:57
Original issue reported on code.google.com by
jackhome...@sina.com
on 9 Apr 2011 at 7:41