Caching system

kmike commented 3 years ago

This is for discussion.

It'd be nice to have a caching system for the output of providers; this could speed up re-running a spider after changes to page objects or callbacks.

It seems this is the place where caching could happen: https://github.com/scrapinghub/scrapy-poet/blob/f7ad036f62699513e10d5749910042a5268153fc/scrapy_poet/injection.py#L152

The main issue is how to compute the cache key, as kwargs might have different semantics for different providers. One option is to have an interface that lets each provider tune how its output is cached. Another option is to do nothing here and handle caching at the provider level, or at lower levels.
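To make the first option more concrete, here is a rough sketch of what a per-provider cache-key hook could look like. Everything in it is hypothetical: `Provider`, `cache_key`, `provide`, `HtmlProvider`, `cached_provide` and the `priority` kwarg are made-up names for illustration, not scrapy-poet's API. The cache is a plain dict and the provider returns a fake string.

```python
import hashlib
import json
from typing import Any, Dict


class Provider:
    """Hypothetical base class; not scrapy-poet's actual provider interface."""

    name = "base"

    def cache_key(self, url: str, kwargs: Dict[str, Any]) -> str:
        # Default: every kwarg is assumed to affect the output.
        payload = json.dumps(
            {"provider": self.name, "url": url, "kwargs": kwargs},
            sort_keys=True,
            default=str,
        )
        return hashlib.sha1(payload.encode("utf-8")).hexdigest()

    def provide(self, url: str, kwargs: Dict[str, Any]) -> Any:
        raise NotImplementedError


class HtmlProvider(Provider):
    """Hypothetical provider whose output does not depend on `priority`."""

    name = "html"

    def __init__(self) -> None:
        self.calls = 0  # counts real (uncached) calls

    def cache_key(self, url: str, kwargs: Dict[str, Any]) -> str:
        # The provider knows which kwargs are irrelevant for its output.
        relevant = {k: v for k, v in kwargs.items() if k != "priority"}
        return super().cache_key(url, relevant)

    def provide(self, url: str, kwargs: Dict[str, Any]) -> str:
        self.calls += 1
        return f"<html>{url}</html>"  # stand-in for a real download


def cached_provide(
    provider: Provider, cache: Dict[str, Any], url: str, kwargs: Dict[str, Any]
) -> Any:
    """Return the cached output if present, otherwise call the provider."""
    key = provider.cache_key(url, kwargs)
    if key not in cache:
        cache[key] = provider.provide(url, kwargs)
    return cache[key]


provider = HtmlProvider()
cache: Dict[str, Any] = {}
cached_provide(provider, cache, "https://example.com", {"priority": 1})
cached_provide(provider, cache, "https://example.com", {"priority": 5})  # cache hit
```

With a hook like this, the injection code would only need to call `cache_key` and would not have to understand what each provider's kwargs mean. The second option amounts to dropping `cached_provide` and letting each provider keep its own cache inside `provide`.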

kmike commented 3 years ago

Example of cache implemented in a provider: https://github.com/scrapinghub/scrapy-autoextract/pull/24

BurnzZ commented 2 years ago

The implementation of this feature is almost finished in https://github.com/scrapinghub/scrapy-poet/pull/55.

kmike commented 2 years ago

Fixed by https://github.com/scrapinghub/scrapy-poet/pull/55.

scrapinghub / scrapy-poet

Caching system #50