Closed by ldko 2 months ago
This is actually fixed in main / 1.2.0-beta.0, but we can backport the fix to the upcoming 1.1.4 release. It looks like using the undici
library directly fixes the issue (perhaps it is a newer implementation than the one bundled with the Node version in use?)
Fixed in 1.1.4
I have recently run some crawls that have errored out with the message: "TypeError [ERR_INVALID_STATE]: Invalid state: Controller is already closed". This is being triggered when I try to crawl certain seeds.
The error occurred when crawling with any of the following seeds (all of them respond with redirects; if the crawl is seeded with the target location of the redirect instead, it succeeds):
Reproduce with:
docker run -v $PWD/crawls:/crawls/ -it webrecorder/browsertrix-crawler crawl --url "https://www.bacb.com/wp-content/uploads/2022/01/Ethics-Code-for-Behavior-Analysts-230119-a.pdf" --scopeType page --generateWACZ --text --collection test
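For anyone stuck on an affected version, the workaround noted above (seeding with the redirect's target rather than the redirecting URL) could be scripted roughly as below. This is only a sketch: `redirectTarget` and `resolveSeed` are hypothetical helper names, not part of browsertrix-crawler. It assumes Node 18+ with the global `fetch`, and that the server answers `HEAD` requests with the same redirect it gives for `GET`. It follows a single hop only.

```typescript
// Hypothetical helpers for illustration; not part of browsertrix-crawler.

// Given the status and Location header of a response, return the absolute
// redirect target, or null if the response is not a redirect.
function redirectTarget(
  seedUrl: string,
  status: number,
  location: string | null,
): string | null {
  const isRedirect = [301, 302, 303, 307, 308].includes(status);
  if (!isRedirect || !location) {
    return null;
  }
  // Location may be relative, so resolve it against the seed URL.
  return new URL(location, seedUrl).toString();
}

// Ask the server for headers only, without following the redirect, and
// return the URL that should be used as the crawl seed instead.
async function resolveSeed(seedUrl: string): Promise<string> {
  const res = await fetch(seedUrl, { method: "HEAD", redirect: "manual" });
  const target = redirectTarget(seedUrl, res.status, res.headers.get("location"));
  return target ?? seedUrl;
}

// Usage (performs a network request, so it is left commented out):
// resolveSeed("https://example.com/old.pdf").then((url) => console.log(url));
```

The printed URL can then be passed to `--url` in place of the original seed.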
Logs and error output display as: