Open wetneb opened 8 years ago
Seed loaders are Scrapy spider middlewares, so the same rules apply to them as to any other Scrapy middleware. To help you, I need to know your Frontera cluster setup: backends, message bus and run mode.
Thanks a lot for your reply! I'm using the distributed setup with ZeroMQ, and the default run mode. I can see that the meta parameters I introduce in the seeder are still available when the requests arrive in the DB and strategy workers.
What is the status of the converters here: https://github.com/scrapinghub/frontera/blob/master/frontera/contrib/scrapy/converters.py ? Are they involved in the conversion from the frontier request to the Scrapy one? If so, when does that happen?
@wetneb What backend do you use? With the HBase backend meta isn't persisted, but with the SQLA backend it is. Converters are used in spider processes, and conversion happens every time a request is read from Frontera and every time a response is returned to it.
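To make the round trip described above concrete, here is a minimal sketch that does not import Frontera or Scrapy. `FrontierRequest`, `SpiderRequest`, `to_frontier` and `from_frontier` are hypothetical stand-ins written for illustration, and the assumption that spider-side meta travels under a single 'scrapy_meta' key is taken from the discussion in this thread, not from the converter source.

```python
from dataclasses import dataclass, field


@dataclass
class FrontierRequest:
    """Hypothetical stand-in for a frontier-side request."""
    url: str
    meta: dict = field(default_factory=dict)


@dataclass
class SpiderRequest:
    """Hypothetical stand-in for a spider-side (Scrapy-like) request."""
    url: str
    meta: dict = field(default_factory=dict)


def to_frontier(spider_request):
    # Assumption: spider-side meta is nested under one key of the frontier meta.
    return FrontierRequest(
        url=spider_request.url,
        meta={"scrapy_meta": dict(spider_request.meta)},
    )


def from_frontier(frontier_request):
    # Unpack the nested meta again; if the key was lost on the way,
    # only the back-reference to the frontier request is left.
    meta = dict(frontier_request.meta.get("scrapy_meta", {}))
    meta["frontier_request"] = frontier_request
    return SpiderRequest(url=frontier_request.url, meta=meta)


original = SpiderRequest("http://example.com", {"depth_hint": 2})
restored = from_frontier(to_frontier(original))

# Simulates a request whose meta was not persisted or transferred.
lost = from_frontier(FrontierRequest("http://example.com"))
```

In this sketch `restored.meta` still contains `depth_hint`, while `lost.meta` contains only `frontier_request`, which is the symptom reported in this issue when meta is dropped somewhere between the seeder and the spider.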
@sibiryakov Thanks! I'm using frontera.contrib.backends.sqlalchemy.Distributed as a backend, so meta is indeed persisted there. I suspect meta disappears during the conversion process in the spider. I will try to debug that.
Changing the backend to 'frontera.contrib.backends.sqlalchemy.SQLAlchemyBackend' did indeed solve the issue. However, I needed to keep the Distributed backend for the strategy worker. Is that normal? And what is the rationale behind keeping meta in one backend but not the other? Thanks a lot anyway!
@wetneb Oh, that's great that you found it. See https://github.com/scrapinghub/frontera/blob/master/frontera/worker/strategies/__init__.py#L90 : meta is not transferred there for historical reasons, but it would make sense to do so. PRs are always welcome.
Excellent, I'll try to do that then. Thanks a lot!
Hi, I do not understand how to set meta parameters in a frontier Request generated from a seeder. It seems that there are two kinds of meta parameters: frontier ones and Scrapy ones. I would like to set Scrapy meta parameters so that my Scrapy middlewares get to see them. It seems that they have to be set as meta['scrapy_meta'] = my_scrapy_meta, but when the request arrives in my middleware, these parameters disappear (only the 'frontier_request' argument remains). Any idea where this comes from? Should I translate my middleware to a Frontier middleware (that would work on frontier Requests)? Thanks a lot!
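For reference, the seeding pattern described in this question can be sketched roughly as follows. `SeedRequest` and `build_seeds` are hypothetical names used only for illustration and are not part of Frontera; the only detail taken from the question is that the Scrapy-side values are nested under meta['scrapy_meta'].

```python
from dataclasses import dataclass, field


@dataclass
class SeedRequest:
    """Hypothetical stand-in for a frontier Request produced by a seeder."""
    url: str
    meta: dict = field(default_factory=dict)


def build_seeds(urls, scrapy_meta):
    # Give each seed its own copy of the Scrapy-side meta, so that later
    # changes to one request do not leak into the others.
    for url in urls:
        yield SeedRequest(url=url, meta={"scrapy_meta": dict(scrapy_meta)})


seeds = list(
    build_seeds(
        ["http://example.com/a", "http://example.com/b"],
        {"source": "seed-file"},
    )
)
```

Whether these values reach a Scrapy middleware then depends on every component between the seeder and the spider carrying meta along unchanged.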