elixir-crawly / crawly

Crawly, a high-level web crawling & scraping framework for Elixir.
https://hexdocs.pm/crawly
Apache License 2.0
988 stars 116 forks

Error: Could not load spiders. #271

Closed (3even closed 7 months ago)

3even commented 1 year ago

I followed the quickstart directions in the README; these are the results.

iex -S mix run -e "Crawly.Engine.start_spider(BooksToScrape)"

Results:

Erlang/OTP 26 [erts-14.0.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [jit:ns]

[info] Opening/checking dynamic spiders storage

[debug] Using the following folder to load extra spiders: ./spiders

[error] Could not load spiders: %MatchError{term: {:error, :enoent}}

System:

Elixir  1.14.5-otp-26
Erlang  26.0.1
Ubuntu  22.04.2 LTS

caanmert commented 1 year ago

I have the same issue. Do you have any update on this?

oltarasenko commented 1 year ago

Hey @3even and @caanmert, the messages above show that Crawly tried to load YML spiders from the default folder (which, as I recall, is ./spiders) but did not find that folder, so it just logged a message about it.
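The `%MatchError{term: {:error, :enoent}}` in the log is what Elixir raises when a `{:ok, _}` pattern is matched against the result of listing a directory that does not exist. Below is a minimal sketch of that behaviour. `SpiderLoadSketch`, its functions, and the folder name are hypothetical names made up for illustration; this is not Crawly's actual code, only a guess at the general shape of the failure.

```elixir
defmodule SpiderLoadSketch do
  # Hypothetical module, for illustration only; not Crawly's implementation.

  # Lists the files in `dir`, asserting success with a pattern match.
  # If `dir` does not exist, File.ls/1 returns {:error, :enoent} and the
  # match below raises a MatchError carrying that term.
  def list_spider_files(dir) do
    {:ok, files} = File.ls(dir)
    files
  end

  # Wraps the call so that a missing folder is reported instead of crashing
  # the caller. The message text mirrors the log line quoted in this thread.
  def try_load(dir) do
    {:ok, list_spider_files(dir)}
  rescue
    error in MatchError ->
      IO.puts("No spiders found to auto-load: #{inspect(error)}")
      {:error, error}
  end
end

# Assumes this folder does not exist in the current working directory.
SpiderLoadSketch.try_load("./no_such_folder_for_this_sketch")
```

With a missing folder, `try_load/1` returns an `{:error, %MatchError{}}` tuple and the rest of the program carries on, which fits the point above that the message alone does not stop ordinary spiders from running.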

It should not prevent you from doing any work you plan to do, unless you are trying to use YML spiders as described here: https://www.erlang-solutions.com/blog/effortlessly-extract-data-from-websites-with-crawly-yml/

However, I can see that these messages confuse people, so I will hide them when the dynamic spiders folder is not set explicitly.

Sorry for any inconvenience.

dogweather commented 1 year ago

I've been meaning to submit a PR for it: https://github.com/elixir-crawly/crawly/discussions/270

oltarasenko commented 1 year ago

@dogweather Looks like it's not enough. Error messages (even debug ones) probably make people think that Crawly is broken and no spiders will run :(. So I think it's better to hide them completely unless enabled.

oltarasenko commented 7 months ago

I think the debug message we have now looks quite OK:

12:57:29.967 [info] No spiders found to auto-load: %MatchError{term: {:error, :enoent}}