intbot / ng2-pdfjs-viewer

An angular component for PDFJS and ViewerJS (Supports all versions of angular)
Apache License 2.0

Issue with Inaccurate Text Highlighting in PDF Search Using ng2-pdfjs-viewer - Version 13 #238

Open rajnish21a opened 1 year ago

rajnish21a commented 1 year ago

Configuration:

Web browser and its version: Google Chrome
Operating system and its version: Windows 10 and above
PDF.js version: latest
Is a browser extension: No
Application Platform: Angular 13 PWA

Issue Description:

I am currently using ng2-pdfjs-viewer version 13 within my application, and overall it has been working smoothly. However, I have encountered a specific issue when searching for text strings such as "A10" within PDFs generated using the E3 series.

The problem is that while search and highlighting generally work correctly for "A10", the viewer also highlights some additional, unintended instances, such as "A/10". This behavior is incorrect: it should highlight only "A10" and not variations like "A/10". Similar issues occur when there is a space within the text in the PDF, which causes the search to highlight unwanted portions of text.
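To make the expected behavior concrete, here is a minimal TypeScript sketch that is independent of PDF.js and ng2-pdfjs-viewer. The helper name `findExactMatches` and the sample string are hypothetical and operate on a plain string, not on a real PDF text layer; the sketch only illustrates that a literal search for "A10" should skip "A/10" and "A 10".

```typescript
// Hypothetical helper, not part of PDF.js or ng2-pdfjs-viewer.
// Returns the start offset of every literal occurrence of `query` in `text`.
function findExactMatches(text: string, query: string): number[] {
  const offsets: number[] = [];
  if (query.length === 0) {
    return offsets; // an empty query matches nothing in this sketch
  }
  let from = 0;
  while (true) {
    const idx = text.indexOf(query, from);
    if (idx === -1) {
      break;
    }
    offsets.push(idx);
    from = idx + 1; // advance by one so overlapping occurrences are not skipped
  }
  return offsets;
}

// Assumed sample text: only the first and last "A10" are literal matches.
const exactSample = "A10 A/10 A 10 (A10)";
console.log(findExactMatches(exactSample, "A10"));
```

Comparing offsets like these against what the viewer actually highlights may help narrow down whether the extra matches come from the search step or from how the text layer is extracted.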

It's worth noting that these issues are not present when using popular PDF readers like Adobe Acrobat. Upon further investigation, I realized that PDF.js, the PDF reader used by ng2-pdfjs-viewer, and other PDF readers interpret text layers differently, which appears to be the root cause of these inconsistencies.

I would greatly appreciate any assistance or guidance on how to address this issue, as it impacts the accuracy of text highlighting within PDFs generated by the E3 series. Unfortunately, I am unable to provide the original PDF for reference, but I am eager to work towards a solution to improve the text highlighting accuracy.

Also, with Whole Word search enabled, only the bare A10 is highlighted. (A10) in parentheses is also a whole word and is highlighted by other PDF readers, but not here.
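The whole-word expectation can be sketched the same way. This is a hypothetical illustration, not the PDF.js implementation: `findWholeWordMatches` is an assumed name, and it assumes that a word character is an ASCII letter or digit, so parentheses count as word boundaries.

```typescript
// Hypothetical helper, not part of PDF.js or ng2-pdfjs-viewer.
// A match counts as a whole word when the characters immediately before and
// after it are not ASCII letters or digits (assumption made for this sketch).
function findWholeWordMatches(text: string, query: string): number[] {
  const isWordChar = (ch: string): boolean => /[A-Za-z0-9]/.test(ch);
  const offsets: number[] = [];
  if (query.length === 0) {
    return offsets;
  }
  let from = 0;
  while (true) {
    const idx = text.indexOf(query, from);
    if (idx === -1) {
      break;
    }
    // charAt returns "" outside the string, which is not a word character.
    const before = text.charAt(idx - 1);
    const after = text.charAt(idx + query.length);
    if (!isWordChar(before) && !isWordChar(after)) {
      offsets.push(idx);
    }
    from = idx + 1;
  }
  return offsets;
}

// Assumed sample text: "A10" and "(A10)" qualify, "BA10" and "A100" do not.
const wholeWordSample = "A10 (A10) BA10 A100";
console.log(findWholeWordMatches(wholeWordSample, "A10"));
```

Under this definition, the occurrence inside the parentheses is reported alongside the bare one, which is the behavior described for other PDF readers.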

I am duly attaching a test pdf.

Thank you for your understanding and support in resolving this matter.

Attachment: Test_PDF.pdf

codehippie1 commented 4 months ago

@rajnish21a Hopefully this will be fixed after the 4.x upgrade. I will post a note here after the release.