spider-rs / spider

A web crawler and scraper for Rust
https://spider.cloud
MIT License

Support COOKIE during the crawl [ENHANCEMENT] #186

Closed · Zabrane closed 5 months ago

Zabrane commented 5 months ago

Hi @j-mendez

I'm using different tools (NodeJS, Go) to preload my blog website. They are all slow, and none compares to spider in terms of speed.

However, one thing is still missing from spider: the ability to support the Cookie HTTP header during the crawl. Is there any plan for this?

Thanks

j-mendez commented 5 months ago

> Hi @j-mendez
>
> I'm using different tools (NodeJS, Go) to preload my blog website. They are all slow, and none compares to spider in terms of speed.
>
> However, one thing is still missing from spider: the ability to support the Cookie HTTP header during the crawl. Is there any plan for this?
>
> Thanks

Hi, setting the cookie can be done with the configuration option `cookie_str` or with `website.with_cookies`.
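A minimal sketch of how that might look, using the `with_cookies` builder named above (assumes `spider` and `tokio` as dependencies; the cookie value is illustrative, and exact signatures may differ between spider versions):

```rust
use spider::website::Website;

#[tokio::main]
async fn main() {
    let mut website = Website::new("https://spider.cloud");

    // Builder form mentioned above; the cookie value is illustrative.
    // The same header can also be set via the configuration option `cookie_str`.
    website.with_cookies("SESSION=example123");

    website.crawl().await;
}
```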

Zabrane commented 5 months ago

@j-mendez This wasn't my point. A cookie is sent by the server to the client (spider). The cookie is then sent back to the server by the client (spider) every time a request is made.
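For clarity, the round trip I mean looks like this (hostnames, paths, and cookie values are illustrative):

```http
GET /index.html HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Set-Cookie: session=abc123

GET /page2.html HTTP/1.1
Host: example.com
Cookie: session=abc123
```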

Is this already implemented?

j-mendez commented 5 months ago

> @j-mendez This wasn't my point. A cookie is sent by the server to the client (spider).
>
> The cookie is then sent back to the server by the client (spider) every time a request is made.
>
> Is this already implemented?

Yes, this is handled by the `cookies` feature flag, which is enabled by default.
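For reference, since `cookies` is part of the default feature set, a plain dependency line keeps it on; it only needs to be listed explicitly if default features are disabled (the version shown is illustrative):

```toml
[dependencies]
# Default features include "cookies", so this is enough:
spider = "1"

# Only needed when opting out of default features:
# spider = { version = "1", default-features = false, features = ["cookies"] }
```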

Zabrane commented 5 months ago

> Yes, this is handled by the `cookies` feature flag, which is enabled by default.

@j-mendez So it's enabled by default, nothing to turn on. Very good. Please let me know when the `--depth` option is fixed so I can test it extensively.