bitextor / pdf-extract

PDF parser and converter to HTML
GNU General Public License v3.0

Exception always #19

Closed msdobrescu closed 4 years ago

msdobrescu commented 4 years ago

Hello,

Each time I run it, I get:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(Z)Lorg/apache/fontbox/ttf/CmapLookup;
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:145)
    at org.apache.pdfbox.pdmodel.font.PDCIDFontType2.<init>(PDCIDFontType2.java:62)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createDescendantFont(PDFontFactory.java:125)
    at org.apache.pdfbox.pdmodel.font.PDType0Font.<init>(PDType0Font.java:192)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:83)
    at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146)
    at org.fit.pdfdom.PDFBoxTree.processFontResources(PDFBoxTree.java:375)
    at org.fit.pdfdom.PDFBoxTree.updateFontTable(PDFBoxTree.java:361)
    at org.fit.pdfdom.PDFDomTree.updateFontTable(PDFDomTree.java:544)
    at org.fit.pdfdom.PDFBoxTree.processPage(PDFBoxTree.java:206)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
    at org.fit.pdfdom.PDFDomTree.createDOM(PDFDomTree.java:218)
    at pdfextract.PDFExtract.convertPdfToHtml(PDFExtract.java:558)
    at pdfextract.PDFExtract.Extract(PDFExtract.java:245)
    at Main.main(Main.java:69)

ivopisarovic commented 4 years ago

Same here

oholter commented 4 years ago

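It seems to be a dependency conflict: the PDFBox classes call a FontBox method that the fontbox.jar on the classpath does not provide. I got it running by replacing pdfbox.jar and fontbox.jar with the newest versions from the download page: https://pdfbox.apache.org/download.cgi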
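To confirm a conflict like this before swapping jars, one option is a small reflection check run with the same classpath as the tool. The sketch below is only an illustration: the class name ClasspathCheck and its helper methods are made up for this example. The two library class names and the getUnicodeCmapLookup(boolean) signature are taken from the stack trace above.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of pdf-extract, PDFBox, or FontBox.
public class ClasspathCheck {

    // Reports the jar or directory a class is loaded from, without initializing it.
    public static String locationOf(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes shipped with the JDK itself usually have no code source.
            return src == null ? "(JDK / bootstrap)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    // True if the class is loadable and exposes a public method with these parameter types.
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String fontClass = "org.apache.fontbox.ttf.TrueTypeFont";
        String pdfClass = "org.apache.pdfbox.pdmodel.font.PDCIDFontType2";

        System.out.println("FontBox loaded from: " + locationOf(fontClass));
        System.out.println("PDFBox loaded from:  " + locationOf(pdfClass));
        // The (Z) in the error's method descriptor means a single boolean parameter.
        System.out.println("getUnicodeCmapLookup(boolean) present: "
                + hasMethod(fontClass, "getUnicodeCmapLookup", boolean.class));
    }
}
```

If the last line reports false while both classes resolve, the fontbox.jar on the classpath is most likely older than what the pdfbox.jar was built against, which would match the NoSuchMethodError.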

dionwiggins commented 4 years ago

New version using Poppler makes the PDFBox version no longer relevant. Closing this one.