elixir-crawly / crawly

Crawly, a high-level web crawling & scraping framework for Elixir.
https://hexdocs.pm/crawly
Apache License 2.0
982 stars 115 forks

Feature request: make request storage configurable and pluggable #145

Open tanguilp opened 3 years ago

tanguilp commented 3 years ago

As far as I understand, requests are necessarily stored in a GenServer's state (Crawly.RequestStorageWorker), and it is not possible to plug in a custom storage module.

This has some disadvantages:

It would be nice if this module were configurable. That would make backends such as Mnesia or SQL databases possible, which in turn would enable distributed crawling.
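
To make the request concrete, here is a minimal sketch of what a pluggable storage contract could look like. Everything in it is an assumption for illustration: the `PluggableStorage` modules, the callback names, and the `:pluggable_storage_demo` config key are hypothetical and are not part of Crawly's API.

```elixir
defmodule PluggableStorage.Backend do
  @moduledoc """
  Hypothetical contract a request storage backend would implement.
  Not part of Crawly; all names here are illustrative.
  """

  @callback init(spider :: atom()) :: {:ok, state :: term()}
  @callback store(state :: term(), request :: term()) :: {:ok, state :: term()}
  @callback pop(state :: term()) :: {:ok, request :: term() | nil, state :: term()}
end

defmodule PluggableStorage.InMemory do
  @moduledoc "Default backend: a FIFO queue held in the calling process."
  @behaviour PluggableStorage.Backend

  @impl true
  def init(_spider), do: {:ok, :queue.new()}

  @impl true
  def store(queue, request), do: {:ok, :queue.in(request, queue)}

  @impl true
  def pop(queue) do
    case :queue.out(queue) do
      {{:value, request}, rest} -> {:ok, request, rest}
      # An empty queue yields nil instead of raising.
      {:empty, rest} -> {:ok, nil, rest}
    end
  end
end

defmodule PluggableStorage do
  @moduledoc "Resolves the backend module from application config."

  # Falls back to the in-memory backend when nothing is configured.
  def backend do
    Application.get_env(:pluggable_storage_demo, :backend, PluggableStorage.InMemory)
  end
end

# Usage: the storage worker would only ever talk to the resolved backend.
backend = PluggableStorage.backend()
{:ok, state} = backend.init(:my_spider)
{:ok, state} = backend.store(state, %{url: "https://example.com/a"})
{:ok, _request, _state} = backend.pop(state)
```

With a contract along these lines, the GenServer would keep only the backend's opaque state, which for a Mnesia or SQL backend could be a table name or a connection handle rather than the requests themselves.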

oltarasenko commented 3 years ago

Yes, that's absolutely true. We had plans for both the Request and Item storages to be able to share and recover state. However, it turned out that there was no demand for it, at least not yet, so this part of the work has been put on hold for now.