DanielArnould / react-pdf-highlighter-extended

📄 Set of modern React components for PDF highlighting
https://danielarnould.github.io/react-pdf-highlighter-extended/example-app/
MIT License
43 stars 16 forks source link

Fix onProgress race in Chunked PDF Loading #1

Closed orausch closed 6 months ago

orausch commented 7 months ago

Thanks for the great fork.

I ran into a race on pdf loading, the bad interleavings are something like:

{loaded: 0, total: 2471723}
{loaded: 65536, total: 2471723}
{loaded: 196608, total: 2471723}
{loaded: 458752, total: 2471723}
{loaded: 589824, total: 2471723}
{loaded: 917504, total: 2471723}
{loaded: 1179648, total: 2471723}
{loaded: 1310720, total: 2471723}
{loaded: 1572864, total: 2471723}
{loaded: 1703936, total: 2471723}
{loaded: 1900544, total: 2471723}
{loaded: 2031616, total: 2471723}
{loaded: 2228224, total: 2471723}
{loaded: 2293760, total: 2471723}
{loaded: 2359296, total: 2471723}
Done loading data {length: 2471723}
Setting to null
{loaded: 2490368, total: 2471723}
{loaded: 2490368, total: 2471723}

The Setting to null log line is right before this line: https://github.com/orausch/react-pdf-highlighter-extended/blob/main/src/components/PdfLoader.tsx#L85

The loaded printouts are just the onProgress hook.

Notice that the onProgress hook gets called again afterwards, which sets the UI back into the loading state, and freezes it there. Interestingly it's greater than the total, which I traced back to this line in the pdfjs worker: https://github.com/mozilla/pdf.js/blob/29faa38dd78d319ac99ab35aed7e659f3b070f4f/src/core/chunked_stream.js#L530

DanielArnould commented 6 months ago

This is really strange behaviour. I wasn't able to replicate it and some of the guard rails in ChunkedStream should have prevented this, but I don't mind adding in your change as a sanity check.

Only change I made was >= to >, so we can include 100% progress directly. I'm not aware of any duplicate interleavings, so hopefully this still works.

Thanks for the contribution!