Open mccahan opened 2 years ago
Thanks for the detailed report @mccahan! It looks like the error likely stems from getting the browser frames (which includes iframes) only once, instead of per time iteration on the virtual timeline. I think moving var browserFrames = getBrowserFrames(page.mainFrame());
to inside the marker while loop should solve the underlying issue.
Please let me know if that fixes the issue. Overall, the pull request seems like a good addition regardless (like checking whether a frames is detached, and logging particular URLs of embedded frames), but if the above fixes the issue, I'd prefer to omit the error suppression, since I don't know which other cases can throw that error.
In the longer term, it's probably a good idea to allow for the user to create a custom error handler that can suppress or rethrow errors if need be.
Following up on this, getting the browser frames every capture doesn't seem to work. I'll look into this further.
Appreciate the quick review!
Yeah, confirming that it still returns 2 browser frames (the main frame + the iframe) for 0ms, 16.6ms, and 33.3ms even though the iframe gets removed at 20ms. However, if I do var browserFrames = getBrowserFrames(page.mainFrame()).reverse();
to reverse them (so the main browser frame should be last), it works.
Going forward, it:
goToTimeAndAnimateForCapture
for the main page, which includes removing the iframegoToTimeAndAnimateForCapture
on the iframe, which was just removed, and therefore has an invalid execution contextReversing the frame list and running them in the opposite order here overrides/animates the iframe first and then runs the JS in the main frame that removes the iframe and detaches it, so no error. I'm not sure whether just .reverse()
ing the frames list is a solid fix, though. It might be since a child frame shouldn't be able to remove itself from a parent frame but I'm not confident in that. It could also theoretically cause confusion if people are using window.postMessage
at all and the order switches.
Adding the getBrowserFrames()
call within that loop does solve the issue with detached frames (if I continue to catch the first exception).
Thanks for the speedy reply! I think you're pretty much correct in your understanding, and here's some additional information and breakdown of what I believe is happening:
The error itself seems to be thrown from the frame's ExecutionContext which arises from a response to an evaluate message sent to the Chrome DevTools Protocol.
When timesnap iterates through frames, for efficiency's sake, it doesn't currently wait for the response from puppeteer (and Chrome DevTools Protocol). Since Chrome DevTools Protocols doesn't seem to run immediately, commands to all frames are sent in order, all at once. Chromium then runs those commands in order (and not in parallel as I'd thought), first removing the iframe, and then trying to run the second command, and replying with an error message now that the iframe is removed. That error message is interpreted and then thrown as an error by Puppeteer.
As you note, reversing the iteration through the frames seems to solve the issue. After all, reversing the list would process child frames before their parents, and generally child frames do not remove/navigate their parents or siblings (though I think it's still possible via window.parent
). There may be some unintended consequences with postMessage
, but since parent frames shouldn't have an expectation to post before children frames and timesnap doesn't currently wait for messages when iterating through virtual time, I don't consider this to be that big of an issue. The downside is that there are probably some edge cases that aren't covered (mainly a same-origin frame removing/navigating a sibling or parent).
Alternatively, another fix for this could be to change the iteration to something awaitable (i.e. changing the forEach
iteration to a for
loop), checking to see if a frame is detached (or is part of the frames found in getBrowserFrames
) before evaluating, and then awaiting every evaluate. This seems to work from my testing. The downside to this approach is that it might be less efficient by having to wait for each iframe and main frame to finish evaluating, though I haven't run any in-depth benchmarks.
Catching and logging this error might be also fine, since it seems like it only happens if an execution context is destroyed, which seems to happen only on navigation/removal.
I'll mull over the possibilities.
Hi @mccahan, for the fix, I've decided to await each frame evaluation and check to see whether a frame is detached before evaluating, and is reflected in the commits today. It seems to solve the issue in the reproducible example. Could you try it out to see if you run into this error for your use case? You may need to update/reinstall dependencies if you get some new errors, since the more recent commits use the newest version of timeweb.
That solved it (elegantly!) for my sample page, which I thought was the least common denominator replicable case, but it turns out it didn't fix it for the production page I'm testing against. I'm running against commit 3d16a10, with a fresh rm -rf node_modules && npm i
.
I'm not sure why yet, when I have an img
tag from the same origin and then run html2canvas
against it, I get the same error.
To reproduce, I've got this file (canvas-image.html
):
<html>
<body>
<img src="test-image.png">
<script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js"></script>
<script>
html2canvas(document.body, { removeContainer: true }).then(canvas => {
document.body.appendChild(canvas)
})
</script>
</body>
</html>
Running:
wget https://via.placeholder.com/150 -O test-image.png
node cli.js canvas-image.html
Output looks like:
Capturing Frame 1 to /Volumes/Projects/timesnap/001.png...
Capturing Frame 2 to /Volumes/Projects/timesnap/002.png...
Capturing Frame 3 to /Volumes/Projects/timesnap/003.png...
[0206/152621.083397:INFO:CONSOLE(20)] "#1 50ms Document cloned, element located at 0,0 with size 800x600 using computed rendering", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
[0206/152621.083561:INFO:CONSOLE(20)] "#1 50ms Starting DOM parsing", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
[0206/152621.091805:INFO:CONSOLE(20)] "#1 50ms Added image file:///Volumes/Projects/timesnap/test-image.png", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
[0206/152621.092511:INFO:CONSOLE(20)] "#1 50ms Starting renderer for element at 0,0 with size 800x600", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
[0206/152621.092981:INFO:CONSOLE(20)] "#1 50ms Canvas renderer initialized (800x600) with scale 1", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
[0206/152621.098522:INFO:CONSOLE(20)] "#1 50ms Finished rendering", source: https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.4.1/html2canvas.min.js (20)
Error: Execution context was destroyed, most likely because of a navigation.
at rewriteError (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/ExecutionContext.js:167:15)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async ExecutionContext._evaluateInternal (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/ExecutionContext.js:120:56)
at async ExecutionContext.evaluate (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/ExecutionContext.js:48:12)
at async runInAllFrames (/Volumes/Projects/timesnap/lib/utils.js:49:7)
at async run (/Volumes/Projects/timesnap/index.js:253:9)
at async module.exports (/Volumes/Projects/timesnap/index.js:287:5)
-- ASYNC --
at ExecutionContext.<anonymous> (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/helper.js:111:15)
at DOMWorld.evaluate (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/DOMWorld.js:112:20)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
-- ASYNC --
at Frame.<anonymous> (/Volumes/Projects/timesnap/node_modules/puppeteer/lib/helper.js:111:15)
at runInAllFrames (/Volumes/Projects/timesnap/lib/utils.js:49:19)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async run (/Volumes/Projects/timesnap/index.js:253:9)
at async module.exports (/Volumes/Projects/timesnap/index.js:287:5)
node:internal/process/promises:246
triggerUncaughtException(err, true /* fromPromise */);
^
There's a configuration parameter in that HTML file, removeContainer
which I've left to true (which is the default) that removes the iframe after it's been used.
removeContainer: false
, no errorremoveContainer: false
and document.querySelector('iframe').remove()
inside the callback function (so it removes the iframe immediately after it completes, which I thought was my reproducible test case), errorremoveContainer: false
and setTimeout(function(){ document.querySelector('iframe').remove() }, 100);
inside the callback function so it doesn't remove the iframe right away, no errorI do know that html2canvas adds images to the container using a Promise so they load asynchronously, which is the only hunch I've got at the moment for why it might feel like it's getting the execution context rug pulled out from underneath it, but if I try/catch the exception it works as intended (correctly screenshots the image into a canvas and appends it to the document) on the next frame.
Sorry, I thought I'd boiled it down to the minimum replicable test case, since it worked with my version that just catches that exception and moves on.
Describe the bug I have a page that uses html2canvas to "screenshot" part of the page to use as a texture with ThreeJS for a particular effect. When html2canvas runs, it creates an iframe with a copy of the page, takes its screenshot, and then destroys the iframe. Running timecut and timesnap on a page that removes an iframe results in a fatal error as it attempts to execute the time overwriting on an invalid execution context:
On the first captured frame after the iframe is removed, Puppeteer throws the "Execution context was destroyed" exception message. If we catch that exception, a captured frame or two later Puppeteer will throw a different exception that the (browser) frame has been detached:
I do have my own version of a fix at https://github.com/mccahan/timesnap/pull/1/files if you'd like me to Pull Request it - I didn't want to assume where you might want to catch the exception.
Checking the exception message for the string "Execution context was destroyed" feels flimsy, but it mirrors the check that Puppeteer uses internally. I was unable to find a way to determine ahead of time whether the context was destroyed, the only way I could find to test for it is to try and catch an exception afterwards.
I also added in a check to skip processing on (browser) frames that are marked as detached. This negates the second fatal exception.
To Reproduce
remove-iframe.html
with the contents from the Attachments section belownode cli.js remove-iframe.html
Expected behavior Expected it to gracefully handle (intentional) removals of iframes, continuing to capture.
Attachments and Screenshots
Simplest sample recreation page:
Desktop (please complete the following information):
Additional context I hit the same issue in timecut, but it seems this is the appropriate repo to address it?