radkovo / Pdf2Dom

Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree can then be serialized to an HTML file or processed further. A command-line utility for converting PDF documents to HTML is included in the distribution package. Pdf2Dom can also be used as an independent Java library with a standard DOM interface for your DOM-based applications, or as an alternative parser for the CSSBox rendering engine to add PDF processing capability to CSSBox. Pdf2Dom is based on the Apache PDFBox™ library.
http://cssbox.sourceforge.net/pdf2dom/
GNU Lesser General Public License v3.0

org.mabb.fontverter.io.DataTypeSerializerException #25

Open 99937272 opened 6 years ago

99937272 commented 6 years ago

INFO: or call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider")
org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:47)
    at org.mabb.fontverter.opentype.TtfGlyph.parse(TtfGlyph.java:80)
    at org.mabb.fontverter.opentype.GlyphTable.readData(GlyphTable.java:74)
    at org.mabb.fontverter.opentype.OpenTypeParser.readTableDataEntries(OpenTypeParser.java:75)
    at org.mabb.fontverter.opentype.OpenTypeParser.parse(OpenTypeParser.java:47)
    at org.mabb.fontverter.opentype.OpenTypeParser.parse(OpenTypeParser.java:35)
    at org.mabb.fontverter.converter.PsType0ToOpenTypeConverter.getOtfFromDescendantFont(PsType0ToOpenTypeConverter.java:64)
    at org.mabb.fontverter.converter.PsType0ToOpenTypeConverter.convert(PsType0ToOpenTypeConverter.java:43)
    at org.mabb.fontverter.pdf.PdfFontExtractor.convertType0FontToOpenType(PdfFontExtractor.java:215)
    at org.fit.pdfdom.FontTable$Entry.loadType0TtfDescendantFont(FontTable.java:192)
    at org.fit.pdfdom.FontTable$Entry.getData(FontTable.java:145)
    at org.fit.pdfdom.FontTable$Entry.isEntryValid(FontTable.java:161)
    at org.fit.pdfdom.FontTable.addEntry(FontTable.java:48)
    at org.fit.pdfdom.PDFBoxTree.processFontResources(PDFBoxTree.java:385)
    at org.fit.pdfdom.PDFBoxTree.updateFontTable(PDFBoxTree.java:361)
    at org.fit.pdfdom.PDFDomTree.updateFontTable(PDFDomTree.java:544)
    at org.fit.pdfdom.PDFBoxTree.processPage(PDFBoxTree.java:206)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
    at org.fit.pdfdom.PDFDomTree.createDOM(PDFDomTree.java:218)
    at org.fit.pdfdom.PDFDomTree.writeText(PDFDomTree.java:194)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:54)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:31)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:22)
    at org.fit.pdfdom.TestBook.testBook(TestBook.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:71)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:45)
    ... 46 more
Caused by: org.mabb.fontverter.io.DataTypeSerializerException: int org.mabb.fontverter.opentype.TtfGlyph.instructionLength org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:65)
    ... 47 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.readSingleValue(DataTypeBindingDeserializer.java:105)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserializeProperty(DataTypeBindingDeserializer.java:92)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:63)
    ... 47 more
[main] WARN Error loading type 0 with ttf descendant font 'FDGONI+SimHei' Message: org.mabb.fontverter.io.DataTypeSerializerException: Error serializing property: java.lang.Long[] org.mabb.fontverter.opentype.GlyphLocationTable.longOffsets class java.io.IOException
org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:47)
    at org.mabb.fontverter.opentype.TtfGlyph.parse(TtfGlyph.java:80)
    at org.mabb.fontverter.opentype.GlyphTable.readData(GlyphTable.java:74)
    at org.mabb.fontverter.opentype.OpenTypeParser.readTableDataEntries(OpenTypeParser.java:75)
    at org.mabb.fontverter.opentype.OpenTypeParser.parse(OpenTypeParser.java:47)
    at org.mabb.fontverter.opentype.OpenTypeParser.parse(OpenTypeParser.java:35)
    at org.mabb.fontverter.converter.PsType0ToOpenTypeConverter.getOtfFromDescendantFont(PsType0ToOpenTypeConverter.java:64)
    at org.mabb.fontverter.converter.PsType0ToOpenTypeConverter.convert(PsType0ToOpenTypeConverter.java:43)
    at org.mabb.fontverter.pdf.PdfFontExtractor.convertType0FontToOpenType(PdfFontExtractor.java:215)
    at org.fit.pdfdom.FontTable$Entry.loadType0TtfDescendantFont(FontTable.java:192)
    at org.fit.pdfdom.FontTable$Entry.getData(FontTable.java:145)
    at org.fit.pdfdom.FontTable$Entry.isEntryValid(FontTable.java:161)
    at org.fit.pdfdom.FontTable.addEntry(FontTable.java:48)
    at org.fit.pdfdom.PDFBoxTree.processFontResources(PDFBoxTree.java:385)
    at org.fit.pdfdom.PDFBoxTree.updateFontTable(PDFBoxTree.java:361)
    at org.fit.pdfdom.PDFDomTree.updateFontTable(PDFDomTree.java:544)
    at org.fit.pdfdom.PDFBoxTree.processPage(PDFBoxTree.java:206)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
    at org.fit.pdfdom.PDFDomTree.createDOM(PDFDomTree.java:218)
    at org.fit.pdfdom.PDFDomTree.writeText(PDFDomTree.java:194)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:54)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:31)
    at org.fit.pdfdom.TestUtils.parseWithPdfDomTree(TestUtils.java:22)
    at org.fit.pdfdom.TestBook.testBook(TestBook.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: org.mabb.fontverter.io.DataTypeSerializerException: org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:71)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:45)
    ... 46 more
Caused by: org.mabb.fontverter.io.DataTypeSerializerException: int org.mabb.fontverter.opentype.TtfGlyph.instructionLength org.mabb.fontverter.opentype.TtfGlyph
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:65)
    ... 47 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.readSingleValue(DataTypeBindingDeserializer.java:105)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserializeProperty(DataTypeBindingDeserializer.java:92)
    at org.mabb.fontverter.io.DataTypeBindingDeserializer.deserialize(DataTypeBindingDeserializer.java:63)
    ... 47 more
[main] WARN Error loading type 0 with ttf descendant font 'FDGOPI+SimSun' Message: org.mabb.fontverter.io.DataTypeSerializerException: Error serializing property: java.lang.Long[] org.mabb.fontverter.opentype.GlyphLocationTable.longOffsets class java.io.IOException
May 24, 2018 9:12:15 AM org.apache.pdfbox.pdmodel.font.PDType0Font toUnicode
WARNING: No Unicode mapping for CID+0 (0) in font FDICLC+Dotum

aino-gautam commented 3 years ago

looks like a font file could be missing
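
For anyone reading the trace in the original report: the innermost "Caused by" is a java.io.EOFException thrown from DataInputStream.readUnsignedShort, meaning the glyph data ended before a 16-bit field could be read. The sketch below reproduces only that general mechanism with the JDK. It is not FontVerter or Pdf2Dom code, and the names RootCauseDemo, rootCause, and readInstructionLength are hypothetical, chosen just for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class RootCauseDemo {
    // Follow getCause() links until the innermost exception is reached.
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Hypothetical stand-in for a parser reading a 16-bit field from font data.
    // Any I/O failure is wrapped, much like the nested exceptions in the trace.
    public static int readInstructionLength(byte[] glyphData) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(glyphData))) {
            return in.readUnsignedShort(); // needs two bytes
        } catch (IOException e) {
            throw new RuntimeException("could not read instructionLength", e);
        }
    }

    public static void main(String[] args) {
        try {
            readInstructionLength(new byte[] {0x01}); // only one byte available
        } catch (RuntimeException e) {
            // prints "java.io.EOFException"
            System.out.println(rootCause(e).getClass().getName());
        }
    }
}
```

Walking the cause chain this way separates the wrapper exceptions from the underlying I/O failure. Whether the embedded font data in the reporter's PDF is truncated, subsetted, or simply not handled by the parser cannot be determined from the log alone.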