huridocs / uwazi

Uwazi is a web-based, open-source solution for building and sharing document collections
http://www.uwazi.io
MIT License

[IX] - PDF sidepanel oversized documents #7455

Open Zasa-san opened 2 days ago

Zasa-san commented 2 days ago

@aphilop @konzz we seem to have a new problem related to the size of the PDF on the side panel, so we may need to promote this issue to high priority.

Now the document is rendered oversized:

[image: screenshot attached to the original comment]

Tested on FF and Chrome

Originally posted by @txau in https://github.com/huridocs/uwazi/issues/7393#issuecomment-2483650498

Zasa-san commented 2 days ago

@konzz The problem seems to be with app/react/V2/Components/PDFViewer/PDFPage.tsx, when rendering the pdf page:

    const pageViewer = new PDFJSViewer.PDFPageView({
      container: currentContainer,
      id: page,
      scale: 1.1,
      defaultViewport,
      annotationMode: 0,
      eventBus: new EventBus(),
    });

The scale has been set to 1.1. This is causing the bigger size on the pdf sidepanel, and the distortion of the text selection.

When rendering a PDF in the library, we still use the old pdfjs component, which renders the PDF with a scale of 1.

I think that the discrepancy in scales between the two components is breaking the selections and making the PDF bigger. I don't see an immediate reason why the PDF sidepanel's scale was set to 1.1. In my opinion the scale should be kept at 1, or at least always match between the PDF sidepanel and the entity view.
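To illustrate the mismatch described above, here is a minimal sketch. It assumes selection rectangles are stored relative to a render scale of 1; `PDF_RENDER_SCALE`, `SelectionRect` and `rescaleRect` are hypothetical names for this example and are not taken from the Uwazi codebase.

```typescript
// Hypothetical shared constant: both the sidepanel and the entity view
// would import it, so the two render scales cannot drift apart.
const PDF_RENDER_SCALE = 1;

// Hypothetical shape of a stored selection rectangle, in page pixels.
interface SelectionRect {
  top: number;
  left: number;
  width: number;
  height: number;
}

// Convert a rectangle captured at one render scale to another render scale.
function rescaleRect(rect: SelectionRect, fromScale: number, toScale: number): SelectionRect {
  const factor = toScale / fromScale;
  return {
    top: rect.top * factor,
    left: rect.left * factor,
    width: rect.width * factor,
    height: rect.height * factor,
  };
}

// A selection captured on a page rendered at scale 1...
const stored: SelectionRect = { top: 100, left: 50, width: 200, height: 20 };

// ...sits roughly 10% further down and right on a page rendered at 1.1,
// so drawing the stored rectangle unchanged would miss the text.
const drawnAtMismatch = rescaleRect(stored, PDF_RENDER_SCALE, 1.1);
```

Under that assumption, either both views share the same scale, or every stored rectangle has to go through a conversion like `rescaleRect` before it is drawn.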

The pdf sidepanel e2e test is done as part of cypress/e2e/settings/information-extraction.cy.ts. It's checking that the PDF renders, and the selections exist. But there is no image snapshot.

There are other elements of app/react/V2/Components/PDFViewer/PDFPage.tsx & app/react/V2/Components/PDFViewer/PDF.tsx that this e2e test does not cover, and likely would not be able to cover (like the pdfjs isEvalSupported security feature being turned off when instantiating the component). These components were made when we were trying to favor e2e over unit tests. I think that we should add some unit tests to these components to cover the more nuanced characteristics that are difficult or impossible to test with e2e.
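As a rough idea of what such a unit test could check, here is a framework-free sketch. It assumes the options passed to pdfjs are built by a small pure function; `PdfLoadOptions` and `buildPdfLoadOptions` are hypothetical names and do not reflect how PDF.tsx is actually structured. Only `isEvalSupported` is a real pdfjs loading parameter.

```typescript
// Hypothetical subset of the options handed to pdfjs when loading a document.
interface PdfLoadOptions {
  url: string;
  isEvalSupported: boolean;
}

// Hypothetical helper: centralizes the loading options so that the
// security-relevant flag can be asserted without rendering anything.
function buildPdfLoadOptions(url: string): PdfLoadOptions {
  return {
    url,
    // Kept off on purpose; a unit test should fail if this ever flips.
    isEvalSupported: false,
  };
}

// Framework-free check; in a Jest suite this would be an `expect(...)` call.
function checkEvalIsDisabled(): void {
  const options = buildPdfLoadOptions("/files/example.pdf");
  if (options.isEvalSupported !== false) {
    throw new Error("isEvalSupported must stay disabled");
  }
}

checkEvalIsDisabled();
```

Extracting the options like this is only one possible design; mocking the pdfjs module and asserting on the arguments it receives would cover the same behavior without changing the component.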

This issue and #7393 should be fixed by #7456

txau commented 2 days ago

@Zasa-san the best way to test it is to play with the browser zoom scale, since PDF rendering and scaling vary depending on the screen resolution.

Zasa-san commented 10 hours ago

After some testing, this seems to primarily affect users who have some kind of zoom factor set, either in the browser or in the OS display settings.

This is unrelated to https://github.com/huridocs/uwazi/issues/7393 and should not affect selections.

I'm attempting to better handle pdfjs's rendering of pages to solve this zoom problem.
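One possible direction, sketched under assumptions and not necessarily what the fix will do: derive the scale from the available panel width instead of using a fixed value. `fitToWidthScale` is a hypothetical helper; in practice the page width would come from the pdfjs viewport at scale 1 and the container width from the DOM.

```typescript
// Compute the render scale that makes a page exactly fill the container width.
// Falls back to a default when either measurement is missing or invalid.
function fitToWidthScale(
  containerWidth: number,
  pageWidthAtScale1: number,
  fallbackScale: number = 1
): number {
  const validContainer = Number.isFinite(containerWidth) && containerWidth > 0;
  const validPage = Number.isFinite(pageWidthAtScale1) && pageWidthAtScale1 > 0;
  if (!validContainer || !validPage) {
    return fallbackScale;
  }
  return containerWidth / pageWidthAtScale1;
}

// Example: a 612-unit-wide page in a 459px-wide panel is rendered at 0.75.
const sidepanelScale = fitToWidthScale(459, 612);

// Example: the panel has not been measured yet, so the fallback is used.
const unmeasuredScale = fitToWidthScale(0, 612);
```

With this approach the page cannot overflow the panel, because the scale is tied to the measured width rather than to a constant such as 1.1. Whether the measured width stays stable under browser and OS zoom would still need to be verified on the affected setups.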