Open Revertron opened 2 hours ago
Hi, what version of spider are you using? The select runs both tasks concurrently and independently of each other. There was a bug in an older version. Make sure to use v2.13.76 or above (currently v2.13.79).
Tested it locally and the delay works across all features. The issue you were facing was most likely caused by the example redirecting to example.com. Updated the default target URL to fix this.
I've tried the latest version, from master, and I didn't use the URLs from the examples.
Why did you close the issue?
From the docs of tokio:
The tokio::select! macro allows waiting on multiple async computations and returns when a single computation completes.
Did you try a big delay, like 1000?
Yes, tested this with a delay of one second. The select is cancel safe because it uses a join_next from a JoinSet.
Ran it again to make sure. The first time it only found 2 links; the second time it respected the delay. Going to see what happened the first time. Thanks for the issue. Will take a look in the morning. The delay disables the concurrency, so this should be straightforward to fix.
https://github.com/user-attachments/assets/9d14bd92-4ef4-46e5-bd0f-66abde69f622
It seems that website.crawl().await is not working at all when using delays like 1000 (one second). Maybe this is also the cause of #225. The culprit is the select!() on this line: https://github.com/spider-rs/spider/blob/main/spider/src/website.rs#L2227. The stream is throttled, so the select!() is always selecting other tasks and never fetches any links.