eventuallyc0nsistent / arachne

A flask API for running your scrapy spiders
http://arachne.readthedocs.org/en/latest/

CLOSESPIDER_PAGECOUNT Setting doesn't work for me #15

Open ghostku opened 6 years ago

ghostku commented 6 years ago

I added this to my settings.py, but the setting has no effect:

SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.dmoz',
        'spider': 'DmozSpider',
        'scrapy_settings': {
            'ITEM_PIPELINES': {
                'pipelines.AddTablePipeline': 500
            },
            'CLOSESPIDER_PAGECOUNT': 2
        }
    }
]

UPD: It looks like the other Scrapy settings don't work either, not even ITEM_PIPELINES.
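A possible workaround while this is unfixed, assuming you can edit the spider class itself: Scrapy 1.0+ reads per-spider overrides from a `custom_settings` class attribute, so duplicating the values there would sidestep however Arachne forwards `scrapy_settings`. A sketch, shown with a stand-in base class so it runs without Scrapy installed:

```python
# Workaround sketch: Scrapy 1.0+ applies a spider's `custom_settings`
# class attribute on top of the project settings. The base class below
# is a stand-in for scrapy.Spider, used only so this snippet is
# self-contained; in a real project you would subclass scrapy.Spider.
class Spider:  # stand-in for scrapy.Spider
    custom_settings = None

class DmozSpider(Spider):
    name = "dmoz"
    # Same values as the endpoint's scrapy_settings above.
    custom_settings = {
        "ITEM_PIPELINES": {"pipelines.AddTablePipeline": 500},
        "CLOSESPIDER_PAGECOUNT": 2,
    }
```

Whether this actually takes effect depends on how Arachne instantiates the crawler, so treat it as something to try rather than a confirmed fix.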

$ python app.py
2017-11-28 01:25:33 [scrapy.utils.log] INFO: Scrapy 1.4.0 started (bot: scrapybot)
2017-11-28 01:25:33 [scrapy.utils.log] INFO: Overridden settings: {}
2017-11-28 01:25:39 [py.warnings] WARNING: C:\Users\ghost\Google Диск\Active Projects\Contacts Parser Arachne\spiders\doska_orbita_co_il.py:3: ScrapyDeprecationWarning: Module `scrapy.linkextractor` is deprecated, use `scrapy.linkextractors` instead
  from scrapy.linkextractor import LinkExtractor

2017-11-28 01:25:39 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats']
2017-11-28 01:25:39 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-11-28 01:25:39 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-11-28 01:25:39 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-11-28 01:25:39 [scrapy.core.engine] INFO: Spider opened
2017-11-28 01:25:39 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
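The `Overridden settings: {}` line in the log suggests the per-endpoint `scrapy_settings` dict never reaches the crawler. For illustration, a minimal sketch of the merge step that appears to be missing, using plain dicts in place of Arachne's internals (the `build_crawler_settings` helper is hypothetical, not Arachne's actual API):

```python
# Illustrative sketch: merge an endpoint's scrapy_settings over the
# project-wide defaults before handing them to the crawler. All helper
# names here are assumptions, not Arachne's real internals.

SPIDER_SETTINGS = [
    {
        "endpoint": "dmoz",
        "location": "spiders.dmoz",
        "spider": "DmozSpider",
        "scrapy_settings": {
            "ITEM_PIPELINES": {"pipelines.AddTablePipeline": 500},
            "CLOSESPIDER_PAGECOUNT": 2,
        },
    }
]

def build_crawler_settings(endpoint, base_settings=None):
    """Return base settings overridden by the endpoint's scrapy_settings."""
    merged = dict(base_settings or {})
    for entry in SPIDER_SETTINGS:
        if entry["endpoint"] == endpoint:
            merged.update(entry.get("scrapy_settings", {}))
    return merged
```

If a merge like this were happening, Scrapy's startup log would list the overridden keys instead of printing `Overridden settings: {}` and an empty pipelines list.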
eventuallyc0nsistent commented 6 years ago

Thanks @ghostku for opening an issue. I will take a look at it this week. I haven't updated this project in a while, so I haven't tested it against Scrapy 1.0+.