Open prestonvanloon opened 1 month ago
https://github.com/webrecorder/browsertrix/issues/1137 might be a solution, if that feature request was implemented.
After a bit of reverse engineering, I found an undocumented s3 field, access_endpoint_url. With this, I was able to set the endpoint_url to http://$IP:$PORT so that DNS does not need to be resolved for uploading WACZ files. Of course, this is incompatible with replays since it is not HTTPS and it's not feasible to obtain an SSL cert for a non-public $IP:$PORT. Then I found access_endpoint_url, which I was able to set with the domain name https://domain:port/bucket, and this is a sufficient workaround for me.
I think there should be more than 1 attempt to upload the WACZ and if an upload of WACZ ultimately fails, then abort the rest of the crawl since the crawl data is lost.
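The retry behavior requested here could look roughly like the following. This is only a minimal sketch of retry with exponential backoff, not the crawler's actual upload code: `upload_with_retry`, the `upload` callable, and the default attempt count and delay are all hypothetical names and values chosen for illustration.

```python
import time


def upload_with_retry(upload, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call upload() until it succeeds or max_attempts is reached.

    `upload` is a hypothetical zero-argument callable that raises OSError
    (for example on a DNS or connection failure) when the upload fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return upload()
        except OSError:
            if attempt == max_attempts:
                # Threshold met: surface the failure so the caller can
                # abort the rest of the crawl instead of losing more data.
                raise
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would catch the final OSError and stop the crawl at that point, which matches the "abort if the upload ultimately fails" part of the request.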
Yes, the access_endpoint_url is designed for something like this. It would be odd that the minio instance is not being found, while the crawler is able to run.
Re: dns issue, I'd be surprised if it's anything related to resource exhaustion - the upload happens when the browser is already shut down generally. Can the crawler find the DNS when it starts running? You can exec in the crawler and see if it can reach the minio node. Probably what we should do is check that the upload endpoint is available when starting the crawl, and fail immediately if it is not - we'll probably add this (in the crawler repo).
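A startup check along these lines might be sketched as follows. This is an assumption-laden illustration, not what the crawler does today: `host_and_port` and `endpoint_reachable` are hypothetical helpers, and the check only confirms that the hostname resolves and the port accepts a TCP connection, not that the bucket or credentials are valid.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(endpoint_url):
    """Extract (host, port) from a URL, defaulting the port by scheme."""
    parts = urlsplit(endpoint_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def endpoint_reachable(endpoint_url, timeout=5.0):
    """Return True if the endpoint's host resolves and accepts a TCP connection."""
    host, port = host_and_port(endpoint_url)
    if host is None:
        return False
    try:
        # create_connection performs the DNS lookup and the TCP connect;
        # DNS errors (socket.gaierror) and timeouts are both OSError subclasses.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running something like `endpoint_reachable("https://domain:port/bucket")` (with a real host and port) from inside the crawler pod before the crawl starts would distinguish a DNS problem present from the beginning from one that only appears at upload time.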
I believe the crawler pod should be retrying a few times, so it should be retrying automatically - likely the DNS issue is not resolved, so it'll keep failing.
Browsertrix Version
v1.11.7-7a61568
What did you expect to happen? What happened instead?
I am having some DNS issues, probably from resource exhaustion. (Also filed #2094 to allow cpu_limits on crawler)
When I see this error, the entire crawl is lost and that is frustrating when the crawl has run for 24 hours. I wish that the WACZ upload was attempted multiple times until the upload eventually completes or some threshold is met.
Reproduction instructions
Not sure. I'm using kind 0.24.0. The cluster config is standard, just opens the nodeport. I'm using an external minio s3 instance. The minio s3 instance has to be behind HTTPS for replays to work, so I cannot provide the IP address.
Screenshots / Video
No response
Environment
No response
Additional details
I've tried every workaround that I could imagine.