Closed timvandermeij closed 2 weeks ago
Good news: I have managed to find a way to reproduce the issue locally on Linux. Bad news: even with dumpio
set nothing is logged :(
To reproduce the issue:
[
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true},
{"id": "pr12564", "file": "pdfs/pr12564.pdf", "md5": "24a19949a2541b960363832cf141f2f2", "rounds": 1, "type": "eq", "annotations": true}
]
dumpio: true
in the Puppeteer options.npx gulp botbrowsertest
.So far I've seen at least one occurrence of the error in every run except for one, so it might need a few tries before you hit it, but then you'll see this:
TEST-PASS | eq test pr12564 | in firefox
TEST-UNEXPECTED-FAIL | test failed pr12564 | in chrome | page1 round 1 | render : Error: Requesting object that isn't resolved yet img_p0_1
Sadly there is no more debugging information from dumpio: true
to help us, but now that we can reproduce it somewhat consistently (probably as consistent as it'll get given the extremely intermittent nature) we can probably instrument PDF.js or Puppeteer with more logging.
I think we're getting closer now. I have checked out to commit 53dfb5a6baebe7ceff49f702366ae55963cd4f17 which is the one right before the DecompressionStream
commit, and having run the steps above 5 times (so 250 runs of the document) resulted in the error not appearing at all. However, if I then check out commit 9654ad570aa3a36e6a507f3e821aacb763d94b31, which is the DecompressionStream
commit, the error is triggered in the very first run already. This does seem to point at commit 9654ad570aa3a36e6a507f3e821aacb763d94b31 somehow causing this. I have checked that no exceptions are thrown by DecompressionStream
itself, so we don't get into https://github.com/mozilla/pdf.js/commit/9654ad570aa3a36e6a507f3e821aacb763d94b31#diff-41c1577a4242a139eb7b15bace590444248dd2a8b16b24a80d5b8168d7098cafR182, so I guess it must be something that surrounds this code...
Hopefully the above, in combination with the local reproducer, helps to debug this further since as this point I'm not really sure how to proceed here.
So the data returned by the DecompressionStream or by our own decompressor are the same.
Interestingly, when adding few console.log
everywhere with some array inside increased the likeliness to have a failure.
Finally in just doing this stupid thing:
diff --git a/src/core/evaluator.js b/src/core/evaluator.js
index 5e31d0d92..4d03a9193 100644
--- a/src/core/evaluator.js
+++ b/src/core/evaluator.js
@@ -857,6 +857,10 @@ class PartialEvaluator {
if (cacheGlobally) {
this.globalImageCache.addByteSize(imageRef, imgData.dataLen);
}
+
+ if (objId === "img_p0_15") {
+ await new Promise(r => setTimeout(r, 100));
+ }
return this._sendImgData(objId, imgData, cacheGlobally);
})
.catch(reason => {
I manage to reproduce the issue with either Chrome or Firefox.
Replace the test manifest with the following copy that simply runs the PDF file 50 times:
FYI: Note that there's an easy way to repeat a test any number of times, by setting "rounds": 50,
(or any other value) instead of copy-pasting the test-entry itself.
Finally in just doing this stupid thing:
That diff seems to reproduce even without commit 9654ad570aa3a36e6a507f3e821aacb763d94b31, so there's probably some older intermittent issue where we're not correctly awaiting dependencies in src/display/canvas.js
.
Yep, my explanation is the following (I'm not sure it's correct):
This way the dependency will be added only when the draw is done but the draw requires the dependency ! The decompression stream patch probably just changed the timing and revealed this bug.
The question is why it doesn't work, since this code should already ensure that we wait for the dependency to resolve: https://github.com/mozilla/pdf.js/blob/06800cd966111f7262d3334dfb68c3765bc92740/src/display/canvas.js#L969-L978
Because the main thread receives the dependency when it's occupied to draw.
Maybe we could just send the dependency before sending the paintImageXObject
. Wdyt ?
Because the main thread receives the dependency when it's occupied to draw.
Yes, but the existing code in src/display/canvas.js
, as linked above, should already be able to handle that case by waiting for the dependency before drawing starts.
Maybe we could just send the dependency before sending the
paintImageXObject
. Wdyt ?
Sorry, but that sounds like a hack and we really need to figure out and fix the "real" error here.
I don't think it's a hack. We send the draw instruction: https://github.com/mozilla/pdf.js/blob/06800cd966111f7262d3334dfb68c3765bc92740/src/core/evaluator.js#L798 and then we send the dependency few lines after: https://github.com/mozilla/pdf.js/blob/06800cd966111f7262d3334dfb68c3765bc92740/src/core/evaluator.js#L868
and in the main thread we're executing the draw instruction, but the dependency message is still waiting in worker message queue. From my pov, the dependency should be always sent before the drawing instruction in order to make sure it's here before using it.
All the message queue is drained before the execution of micro/macro tasks in the main thread. So if the messages with draw instruction and dependency are in the same chunk, whatever the order is, then they will be treated before drawing, consequently the dependency is there when the paintImage wants to get it. But if they aren't in the same chunk, then drawing is executed before the dependency message has been treated.
From my pov, the dependency should be always sent before the drawing instruction in order to make sure it's here before using it.
But then you block further worker-thread parsing on the image-data being completely decoded, and the point of the canvas.js
code is to not have to do that.
Besides, I now think that I know what the "real" culprit is and I've got a tentative patch...
Edit: Basically, this is my fault; sorry about that!
Thank you both for helping out with this! I like how all the puzzle pieces combined helped to solve the issue really quickly, since given how intermittent this was I was initially a bit afraid that this was going to be a time-consuming one to track down.
In PR #18257 we have seen the following intermittent failure in the reference tests:
It looks like this happened only once before, in PR #18167, which may or may not be where this problem started. We should try to reproduce the issue locally. One idea is to repeatedly open/close the file until it happens so we can hopefully extract some debugging information to find out what goes wrong here.