elixir-crawly / crawly

Crawly, a high-level web crawling & scraping framework for Elixir.
https://hexdocs.pm/crawly
Apache License 2.0
965 stars, 114 forks

Simplify manager init function #161

Closed. oltarasenko closed this 3 years ago

oltarasenko commented 3 years ago

Hey @Ziinc I have simplified the handle_continue function a bit (e.g. combined operations for requests and URLs). Hopefully, that looks ok to you.

oltarasenko commented 3 years ago

Another problem I am thinking of: if we have a huge list of links, the store operations may block the requests storage, so it will be almost impossible for workers to do a pop :(. Maybe we should add a small delay to the async operation, just in case, so the workers still get a chance to do something?
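
A minimal sketch of that idea, not the code from this PR: instead of storing the whole list in one go, the process stores it in small batches and schedules the next batch with a short delay, so other messages (such as a worker's pop) can be served in between. The module name `BatchedStoreSketch`, the `store_fun` callback, and the batch size and delay values are all assumptions made for illustration; in Crawly the callback would stand in for whatever call writes to the requests storage.

```elixir
defmodule BatchedStoreSketch do
  # Hypothetical module: illustrates batched storing with a delay.
  use GenServer

  # Assumed values; tune to the size of the link list and worker count.
  @batch_size 100
  @delay_ms 10

  # `store_fun` is a hypothetical one-argument callback that stores a batch.
  def start_link(requests, store_fun) do
    GenServer.start_link(__MODULE__, {requests, store_fun})
  end

  @impl true
  def init({requests, store_fun}) do
    # Return quickly from init and do the storing in handle_continue.
    {:ok, %{store_fun: store_fun}, {:continue, {:store, requests}}}
  end

  @impl true
  def handle_continue({:store, requests}, state), do: store_batch(requests, state)

  @impl true
  def handle_info({:store, requests}, state), do: store_batch(requests, state)

  defp store_batch([], state), do: {:noreply, state}

  defp store_batch(requests, state) do
    {batch, rest} = Enum.split(requests, @batch_size)
    state.store_fun.(batch)

    # Schedule the remainder after a short pause instead of looping,
    # which leaves gaps between store calls for other callers.
    if rest != [], do: Process.send_after(self(), {:store, rest}, @delay_ms)

    {:noreply, state}
  end
end
```

The trade-off is that a very large list takes longer to be fully stored, since each batch waits for the delay before the next one is sent.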

oltarasenko commented 3 years ago

@Ziinc I have addressed the issues you pointed out (mainly, moved all requests back to handle_continue).