aws-samples / amazon-textract-searchable-pdf

Generate searchable pdf documents from scanned documents with Amazon Textract

Issue for some PDFs #12

Open fullstact69 opened 9 months ago

fullstact69 commented 9 months ago

java.lang.IllegalArgumentException: U+2448 ('.notdef') is not available in the font Courier, encoding: WinAnsiEncoding
    at org.apache.pdfbox.pdmodel.font.PDType1Font.encode(PDType1Font.java:428)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:337)
    at org.apache.pdfbox.pdmodel.font.PDFont.getStringWidth(PDFont.java:368)
    at com.amazon.textract.pdf.PDFDocument.calculateFontSize(PDFDocument.java:60)
    at com.amazon.textract.pdf.PDFDocument.addPage(PDFDocument.java:114)
    at DemoPdfFromLocalPdf.run(DemoPdfFromLocalPdf.java:74)
    at Demo.main(Demo.java:15)

dgvozd commented 4 months ago

Hi @fullstact69, have you managed to resolve this issue? Try updating org.apache.pdfbox to the latest version, 3.0.2.
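
For context, the stack trace shows that the text returned for the page contains U+2448, a character the built-in Courier font cannot encode under WinAnsiEncoding, so `getStringWidth` throws. One possible workaround, independent of the PDFBox version, is to replace such characters before the text reaches the font. The sketch below is only an illustration and not code from this repository: the `WinAnsiSanitizer` class and its `sanitize` method are hypothetical names, and it assumes the JDK's `windows-1252` charset is a close enough stand-in for PDFBox's WinAnsiEncoding, which may not hold for every character.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of amazon-textract-searchable-pdf or PDFBox.
public class WinAnsiSanitizer {

    // Assumption: windows-1252 approximates the character set of WinAnsiEncoding.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Returns a copy of text in which every character that windows-1252
    // cannot encode is replaced by the given replacement character.
    public static String sanitize(String text, char replacement) {
        CharsetEncoder encoder = CP1252.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            i += Character.charCount(codePoint);
            // Characters outside the Basic Multilingual Plane are never in windows-1252.
            if (codePoint <= 0xFFFF && encoder.canEncode((char) codePoint)) {
                out.append((char) codePoint);
            } else {
                out.append(replacement);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+2448 is the character reported in the exception above.
        System.out.println(sanitize("Account \u2448 12345", '?'));
        // prints "Account ? 12345"
    }
}
```

If this approach fits, the sanitized string would be passed wherever the project currently measures or writes the recognized text (the trace points at `PDFDocument.calculateFontSize`). The trade-off is that replaced characters are no longer searchable in the output PDF; loading a Unicode-capable TrueType font instead would avoid that loss.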