octohedron / pikatest

Pika test for RabbitMQ integration with scrapy

Have you fixed the issue of the parse function not being executed? #1

Open Rockyzsu opened 5 years ago

Rockyzsu commented 5 years ago

I saw you reported this problem here: https://github.com/scrapy/scrapy/issues/3477

Have you fixed it? I ran into the same issue when trying to use RabbitMQ + Scrapy.

octohedron commented 5 years ago

Hello, I apologize for the late reply. I wasn't watching this repo; it was just an example.

Hopefully you found a solution by now, but just in case: what we ended up doing was using a buffer in Redis, i.e. using Redis as shared application state where we store a queue of URLs to be crawled.

It's quite simple: you need to implement something like a small wrapper that pushes URLs onto the Redis queue and pops them off again.

Then, in the spider, you can fetch from Redis and send to RabbitMQ when processing items if you wish.
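
For anyone landing here later, the Redis buffer described above could look roughly like the following. This is only a minimal sketch, not the code we actually run: the `UrlBuffer` class, the `crawl:urls` key, and the `InMemoryClient` stand-in are all hypothetical names made up for illustration. It assumes a client object exposing `lpush`, `rpop`, and `llen`, which redis-py's `redis.Redis` provides.

```python
from typing import Dict, List, Optional


class UrlBuffer:
    """FIFO queue of URLs stored in a Redis list (hypothetical helper)."""

    def __init__(self, client, key: str = "crawl:urls") -> None:
        # `client` is anything with lpush/rpop/llen, e.g. redis.Redis().
        # The key name is an assumption chosen for this example.
        self.client = client
        self.key = key

    def push(self, url: str) -> None:
        # New URLs go on the left end of the list...
        self.client.lpush(self.key, url)

    def pop(self) -> Optional[str]:
        # ...and are taken from the right end, so the oldest URL comes first.
        raw = self.client.rpop(self.key)
        if raw is None:
            return None  # queue is empty
        # redis-py returns bytes unless decode_responses=True was set.
        return raw.decode() if isinstance(raw, bytes) else raw

    def __len__(self) -> int:
        return self.client.llen(self.key)


class InMemoryClient:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[bytes]] = {}

    def lpush(self, name: str, *values: str) -> int:
        items = self._lists.setdefault(name, [])
        for value in values:
            items.insert(0, value.encode())
        return len(items)

    def rpop(self, name: str) -> Optional[bytes]:
        items = self._lists.get(name)
        return items.pop() if items else None

    def llen(self, name: str) -> int:
        return len(self._lists.get(name, []))


if __name__ == "__main__":
    buffer = UrlBuffer(InMemoryClient())
    buffer.push("https://example.com/a")
    buffer.push("https://example.com/b")
    print(len(buffer), buffer.pop(), buffer.pop(), buffer.pop())
```

With a real server you would pass `redis.Redis()` instead of `InMemoryClient()`. A spider could then call `pop()` to get the next URL to request, and whatever consumes RabbitMQ could call `push()`, which keeps the blocking queue client out of Scrapy's event loop.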

Although it does what we need, it did take some developer time to implement. It would have been easier if Scrapy had a simpler way to integrate with message queues, but there are other advantages to our approach with Redis.