spider-rs / spider

A web crawler and scraper for Rust
https://spider.cloud
MIT License
1.16k stars, 100 forks

Panic with non-ASCII string #216

Closed by ronanM 1 month ago

ronanM commented 1 month ago

Slicing a string by byte index (a_value[start..end]) panics with "not a char boundary" when the index falls inside a multi-byte UTF-8 character.

spider/spider/src/page.rs:207:58:
byte index 1 is not a char boundary; it is inside 'ק' (bytes 0..2) of `קניית-דירה`
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::str::slice_error_fail_rt
   3: core::str::slice_error_fail
   4: spider::website::Website::crawl_concurrent_raw::{{closure}}::{{closure}}::{{closure}}
   5: spider::website::run_task::{{closure}}
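For anyone hitting the same panic, here is a minimal sketch of the failure mode and two ways to avoid it using only the standard library. It is not the code from page.rs and not necessarily how spider fixed it; the helper names `safe_slice` and `snap_to_char_boundary` are hypothetical, made up for this illustration.

```rust
/// Hypothetical helper: returns the slice only if both indices fall on
/// char boundaries, instead of panicking like `&s[start..end]` does.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

/// Hypothetical helper: moves a byte index down to the nearest char boundary.
fn snap_to_char_boundary(s: &str, mut idx: usize) -> usize {
    if idx >= s.len() {
        return s.len();
    }
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

fn main() {
    let a_value = "קניית-דירה";

    // `&a_value[1..]` would panic here: byte 1 is inside 'ק' (bytes 0..2).
    assert_eq!(safe_slice(a_value, 1, a_value.len()), None);

    // Slicing on a real boundary works as usual.
    assert_eq!(safe_slice(a_value, 0, 2), Some("ק"));

    // Snapping the index first makes direct slicing safe again.
    let start = snap_to_char_boundary(a_value, 1);
    assert_eq!(start, 0);
    assert_eq!(&a_value[start..2], "ק");
}
```

`str::get` is the safer default when the indices come from byte-oriented logic (for example, offsets computed on ASCII assumptions); snapping to a boundary is only appropriate when truncating slightly earlier is acceptable.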

j-mendez commented 1 month ago

Hi @ronanM, please upgrade to 2.8.14.