orkhanahmadov opened 10 months ago
@orkhanahmadov From which version did you upgrade? From v1.13.1? Or from something prior?
In this compare view, you can see that between v1.13.1 and v2.0.0 basically nothing changed, besides the new spatie/browsershot:v4.0 requirement and the change to the config file.
I haven't had time yet to upgrade my own apps to the latest version, so I can't speak from experience. Weird that our test suite in the package, which runs on AWS, is green. 🤔
@stefanzweifel ok, turns out it is not related to v2, I guess.
Recently, when the timeout happened again, we noticed a CloudWatch log with the following contents:
Is this useful or related?
Thanks for the update @orkhanahmadov. Seems related to this bit of cleanup code: https://github.com/stefanzweifel/sidecar-browsershot/blob/081c3918ad3634146bc0ab2c565d828dd143b518/resources/lambda/browsershot.js#L81
Will create a PR with a fix soonish.
@orkhanahmadov Would you be able to share some code snippets for this issue? I'm not able to replicate your timeout issue on my machine or in my production apps.
I assume you don't run any JavaScript in your Blade views? I know this isn't an ideal solution, but have you tried increasing the timeout of your function? Does the error still occur?
Maybe this is related to the underlying Puppeteer version. Will work on upgrading the underlying layer to the latest Puppeteer version.
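If you want to experiment with the timeout, it lives in the published Sidecar config. A minimal sketch, assuming the default key names (verify against your own config/sidecar.php, as they may differ between versions):

```php
// config/sidecar.php — assumed default keys; check your published config.
return [
    // ...

    // Hard limit for a single Lambda invocation, in seconds.
    'timeout' => env('SIDECAR_TIMEOUT', 300),

    // More memory also means proportionally more CPU on Lambda,
    // which can shorten Chromium's start-up time.
    'memory' => env('SIDECAR_MEMORY', 1024),
];
```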
@stefanzweifel This is a weird issue that happens completely randomly; we have no clue what exactly causes it or how to reproduce it. When this timeout happens and we try generating the same PDF one more time, everything works... The last time the timeout happened, the only clue we got was that CloudWatch log.
Initially the timeout was 30 seconds. We tried increasing it to 300 seconds, but it still didn't help. When this timeout happens, the Lambda gets "stuck"; no amount of timeout helps.
What we did as a workaround: retry the render whenever LambdaExecutionException happens:

```php
use Hammerstone\Sidecar\Exceptions\LambdaExecutionException;

private int $retry = 0;

public function render(): string
{
    try {
        return $this->browser
            ->setHtml($this->html)
            ->format($this->format)
            ->margins(...$this->margins)
            ->pdf();
    } catch (LambdaExecutionException $exception) {
        if ($this->retry < 4) { // 5 times in total, including the first attempt
            $this->retry++;

            return $this->render();
        }

        throw $exception;
    }
}
```
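We might also try an iterative variant with a short pause between attempts, in case back-to-back retries keep hitting the same problem. A sketch of that idea (hypothetical and untested, reusing the same properties as above):

```php
use Hammerstone\Sidecar\Exceptions\LambdaExecutionException;

public function render(): string
{
    $maxAttempts = 5;

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $this->browser
                ->setHtml($this->html)
                ->format($this->format)
                ->margins(...$this->margins)
                ->pdf();
        } catch (LambdaExecutionException $exception) {
            if ($attempt === $maxAttempts) {
                throw $exception;
            }

            sleep($attempt); // simple linear backoff before the next attempt
        }
    }
}
```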
I suspect it might be related to the underlying library... The CloudWatch log says the deprecation warning is related to fs.rmdir, not fs.rmdirSync, which is what the package is using. Maybe we can try node --trace-deprecation on the layer. There are some reported and closed issues on the puppeteer repository: https://github.com/puppeteer/puppeteer/issues?q=is%3Aissue+rmdir
I also found similar reported issues related to puppeteer-extra, but I believe that library is not being used here.
@orkhanahmadov Thanks! Lowering the timeout makes total sense here. Better to fail fast than wait forever and incur unnecessary cost.
Will do some research. 🤓
As you might have seen, my attempt (#112) at updating our internal code to no longer use fs.rmdir was not successful. Have to figure out what the root issue is.
In the meantime, I've updated the underlying layer to use the latest puppeteer-core version and updated this package to also use an updated Chromium version: https://github.com/stefanzweifel/sidecar-browsershot/releases/tag/v2.1.0
Can't guarantee that this will solve the issues.
Hello @orkhanahmadov, did you find a solution to this problem?
This is happening to me too.
With the same site, it sometimes times out after 300s or fails with net::ERR_TUNNEL_CONNECTION_FAILED.
This happens with all sites: if I take a screenshot several times, many attempts fail with this problem.
Note: I set it to always try 5 times, and it often failed all 5 times.
My setup:
- "wnx/sidecar-browsershot": "^2.3"
- sidecar-browsershot-layer: 2
- chrome-aws-lambda: 42
- AWS Lambda region: us-east-1 (N. Virginia)
We never had this issue with v1, but since upgrading to v2, the following exception gets thrown randomly when trying to generate a PDF:
I saw another reported issue in #100, but that one seems to be related to a protocol timeout; this one is related to Lambda's 30-second timeout. Also unlike #100, we don't try to generate huge PDFs. It is a single-page PDF, and it doesn't always fail; it's completely random. It fails once, then succeeds when we try again.
Any clues? Anything in v2 that could make the Lambda take substantially longer to execute?