Possibility of storing more request info to hubstorage.

Currently, request info is stored to hubstorage by sh_scrapy.extension.HubstorageMiddleware.
As a spider middleware, it sees only the responses that are passed to a spider.

Thus it misses responses that are:

Consumed by a downloader middleware (e.g. RobotsTxtMiddleware, RetryMiddleware, MetaRefreshMiddleware, or RedirectMiddleware).
Consumed by an item pipeline (e.g. ImagesPipeline).
Otherwise never passed through the spider middlewares.

Is there any specific reason why we need to exclude these responses?

It's possible to use a signal handler for scrapy.signals.response_downloaded to record more requests. With this approach we would probably still need a spider middleware to set the "_hsparent" field.
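A rough sketch of what such a handler could look like, just to make the idea concrete. The class name ResponseDownloadedLogger, the write_request callable, and the entry keys are made up for illustration and are not the existing sh_scrapy API; the sketch also assumes "download_latency" and "_hsparent" are present in request.meta by the time the signal fires.

```python
import logging

logger = logging.getLogger(__name__)


class ResponseDownloadedLogger:
    """Hypothetical extension that records every downloaded response."""

    def __init__(self, write_request):
        # write_request is a stand-in for whatever actually stores
        # a request entry in hubstorage (assumption, not the real writer).
        self.write_request = write_request

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the recording logic below can be exercised
        # without Scrapy installed.
        from scrapy import signals

        # Placeholder sink: just log the entry instead of storing it.
        ext = cls(write_request=lambda entry: logger.info("request: %r", entry))
        crawler.signals.connect(
            ext.response_downloaded, signal=signals.response_downloaded
        )
        return ext

    def response_downloaded(self, response, request, spider):
        # Handler for scrapy.signals.response_downloaded.
        self.write_request({
            "url": response.url,
            "status": response.status,
            "method": request.method,
            "rs": len(response.body),  # response size in bytes
            # download_latency is in seconds; convert to milliseconds.
            "duration": request.meta.get("download_latency", 0) * 1000,
            # Still has to be set elsewhere, e.g. by a spider middleware.
            "parent": request.meta.get("_hsparent"),
        })
```

It would be enabled through the EXTENSIONS setting like any other extension. Requests created inside downloader middlewares or pipelines would have no "_hsparent" unless something sets it, so "parent" would be None for those.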

scrapinghub / scrapinghub-entrypoint-scrapy

Possibility of storing more request info to hubstorage. #41