linkextractor Search Results

354 results for linkextractor


scrapy/scrapy #163

Empty results from SgmlLinkExtractor

SgmlLinkExtractor returns no links for http://metaoptimize.com/qa/

turian updated 10 years ago
6
scrapy/scrapy #528

lxml-based link extractor with similar functionality to Sgm…

`SgmlLinkExtractor` can choke on some pages that lxml is fine with. - https://groups.google.com/forum/#!topic/scrapy-users/iA1VzcJYpJE Currently, `LxmlParserLinkExtractor` doesn't have some of `SgmlLi…

redapple updated 10 years ago
3
scrapy/scrapy #731

SGMLParseError when looking for links with SGMLLinkExtractor

I'm having an exception when extracting links for a site. It can be reproduced by: ``` $ scrapy shell 'http://www.cnea.gov.ar/' >>> from scrapy.contrib.linkextractors import sgml >>> e = sgml.SgmlLin…

dmoisset updated 10 years ago
4
scrapy/scrapy #755

SGMLParseError

I often have problems with the `SgmlLinkExtractor`. Let's try: ``` scrapy shell "http://www.dachser.com/de/de/" # in the shell from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor link_ext…

bijzz updated 10 years ago
1
scrapy/scrapy #813

CrawlSpider doesn't obey deny rule

Here is my spider ``` python from scrapy.contrib.linkextractors import LinkExtractor from scrapy.contrib.spiders import CrawlSpider, Rule from scrapy.selector import Selector from vitrinbot.items im…

muhasturk updated 10 years ago
3
medialab/hyphe #80

Bug in scrapy extract_links

With this pure HTML page: http://www.freetocharities.org.uk/zimconserve/

paulgirard updated 10 years ago
2
torce/crawler #11

Performance problem

With the current configuration, throughput is very low, yet CPU usage shoots up to 100% in very little time. Sometimes the crawler hangs, but CPU usage stays at 100%.

torce updated 10 years ago
1
scrapy/scrapy #652

SgmlLinkExtractor default value for 'attrs' argument should …

It is now a string, and `attrs_func` doesn't make sense if `attrs` is a string. See https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/linkextractors/sgml.py#L98

kmike updated 10 years ago
4
scrapy/scrapy #562

UnicodeEncodeError in SgmlLinkExtractor when using restrict_…

This issue has been addressed before by #285, but at the time none of the proposed alternative solutions made it into Scrapy. Even though the solution proposed in #285 was a good workaround, it wa…

rmax updated 10 years ago
14
scrapy/scrapy #199

Exception UnicodeDecodeError in linkextractors

``` File "/lib/python2.6/site-packages/Scrapy-0.16.2-py2.6.egg/scrapy/contrib/linkextractors/sgml.py" line 84, in handle_data self.current_link.text = self.current_link.text + data.strip() exceptions…

dzyao updated 11 years ago
3
