elixir-crawly / crawly

Crawly, a high-level web crawling & scraping framework for Elixir.
https://hexdocs.pm/crawly
Apache License 2.0

Fix issue in the spider generator #267

Closed Rukomoynikov closed 1 year ago

Rukomoynikov commented 1 year ago

Hello all.

I made this small PR to address the issue in the spider generator. From the documentation, I found that to generate a template for a spider, I should use:

mix crawly.gen.spider --filepath ./lib/crawly_example/books_to_scrape.ex --spidername BooksToScrape

But if I change the order of the parameters, it does not work. The problem is that OptionParser returns the parsed arguments as a keyword list, in the order they were given on the command line, e.g. [filepath: filepath, spidername: spidername]. The function defp generate_spider(filepath: filepath, spidername: spidername) pattern-matches on that exact order, so the arguments must be passed with filepath first. If they are passed in another order, such as spidername first, the match fails. The proposed solution is to convert the keyword list into a Map.

It could also be solved without converting to a Map, by adding another function clause:
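
To illustrate the ordering problem, here is a minimal sketch using only the standard library. The anonymous functions `match_kw` and `match_map` are hypothetical stand-ins for the generator's private function, not Crawly's actual code, and the switch definitions are assumed for the example.

```elixir
# Assumed switch definitions for this sketch.
switches = [filepath: :string, spidername: :string]

# OptionParser keeps the switches in the order they appear in argv.
{opts_a, _, _} =
  OptionParser.parse(["--filepath", "a.ex", "--spidername", "A"], strict: switches)

{opts_b, _, _} =
  OptionParser.parse(["--spidername", "A", "--filepath", "a.ex"], strict: switches)

# A keyword-list pattern matches only when the keys arrive in exactly this order.
match_kw = fn
  [filepath: _, spidername: _] -> :ok
  _ -> :no_match
end

# A map pattern does not depend on key order.
match_map = fn
  %{filepath: _, spidername: _} -> :ok
  _ -> :no_match
end

IO.inspect(match_kw.(opts_a))           # filepath first: matches
IO.inspect(match_kw.(opts_b))           # spidername first: falls through
IO.inspect(match_map.(Map.new(opts_b))) # converted to a map: matches
```

Converting with `Map.new/1` before matching is what makes the argument order irrelevant.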

defp generate_spider(spidername: spidername, filepath: filepath) do
  generate_spider(filepath: filepath, spidername: spidername)
end
# What if more arguments are added in the future?

Changes in the PR

  1. Fixed the issue with the generator
  2. Added error handling for when the spider file can't be saved
  3. Added tests for the spider generator
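
As a rough idea of what item 2 could look like, the sketch below wraps `File.write/2` and turns a failure into a readable message. The module `SpiderFileSketch` and its `save/2` function are hypothetical names for illustration; the PR's actual implementation may differ.

```elixir
defmodule SpiderFileSketch do
  # Hypothetical helper, not Crawly's actual code.
  # Returns {:ok, path} on success or {:error, message} when the write fails.
  def save(filepath, content) do
    case File.write(filepath, content) do
      :ok ->
        {:ok, filepath}

      {:error, reason} ->
        # :file.format_error/1 turns a POSIX reason such as :enoent into text.
        {:error, "Could not write #{filepath}: #{:file.format_error(reason)}"}
    end
  end
end
```

Returning a tagged tuple instead of raising lets the Mix task decide how to report the failure to the user.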

Rukomoynikov commented 1 year ago

Hi @oltarasenko

Could you please, when you have time, look at the PR? I can adjust it if you feel it doesn't fulfil its goal or remove it if the issue is not crucial.

oltarasenko commented 1 year ago

Hey @Rukomoynikov,

Thanks for contributing, and sorry for the slightly late reply; I have just returned from vacation. I will look at the PR soon.