KevM / tikaondotnet

Use the Java Tika text extraction library on the .NET platform
http://kevm.github.io/tikaondotnet/
Apache License 2.0
195 stars, 73 forks

"Not implemented" error raised when rendering PDF to image #105

Open coader opened 7 years ago

coader commented 7 years ago

at com.sun.imageio.plugins.jpeg.JPEGImageReader.getImageMetadata(Int32 )
at org.apache.pdfbox.filter.DCTFilter.getNumChannels(ImageReader )
at org.apache.pdfbox.filter.DCTFilter.decode(InputStream , OutputStream , COSDictionary , Int32 )
at org.apache.pdfbox.cos.COSInputStream.create(List , COSDictionary , InputStream , ScratchFile )
at org.apache.pdfbox.cos.COSStream.createInputStream()
at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream()
at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject..ctor(PDStream stream, PDResources resources)
at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(COSBase base, PDResources resources)
at org.apache.pdfbox.pdmodel.PDResources.getXObject(COSName name)
at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(Operator operator, List operands)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(Operator operator, List operands)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDContentStream )
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDContentStream )
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDPage page)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(Graphics g, PDRectangle pageSize)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(Int32 pageIndex, Single scale, ImageType imageType)
at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(Int32 pageIndex, Single dpi, ImageType imageType)

So I want to delete all images before rendering, but I cannot find the method PDDocument.getDocumentCatalog().getAllPages();

There is only getPages, and I cannot find getAllImages on PDResources either.

How can I resolve this?

TechnikEmpire commented 7 years ago

This is a failure in IKVM, not Tika or this project. See my comments:

https://github.com/KevM/tikaondotnet/issues/94#issuecomment-296425723

IKVM omits many functions, including those required to render these images.
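
The top frame of the stack trace is JPEGImageReader.getImageMetadata, so one way to confirm that the gap is in the runtime and not in PDFBox is to make the same ImageIO call directly. The following is only a diagnostic sketch, not a fix: the class name JpegMetadataProbe and its helper methods are hypothetical, it uses only the standard javax.imageio API, and how to compile and run it under IKVM is assumed to be handled separately.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import javax.imageio.stream.MemoryCacheImageInputStream;

// Hypothetical probe class: not part of Tika, PDFBox, or tikaondotnet.
class JpegMetadataProbe {

    // Encodes a small blank RGB image as JPEG in memory, so no input file is needed.
    static byte[] tinyJpeg() {
        ImageIO.setUseCache(false); // keep everything in memory, no temp files
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(img, "jpg", out)) {
                throw new IOException("no JPEG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Makes the call that sits at the top of the reported stack trace
    // and reports the outcome as a string instead of throwing.
    static String probe(byte[] jpeg) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("jpeg");
        if (!readers.hasNext()) {
            return "no JPEG reader registered";
        }
        ImageReader reader = readers.next();
        try (ImageInputStream in =
                 new MemoryCacheImageInputStream(new ByteArrayInputStream(jpeg))) {
            reader.setInput(in);
            IIOMetadata meta = reader.getImageMetadata(0);
            return meta != null ? "ok" : "no metadata returned";
        } catch (Throwable t) {
            // Catch everything: a runtime with missing pieces may throw an Error
            // or a runtime-specific exception rather than an IOException.
            return "failed: " + t;
        } finally {
            reader.dispose();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(tinyJpeg()));
    }
}
```

If the probe reports a failure on the IKVM-based runtime but not on a regular JVM, that would be consistent with the diagnosis above that the missing piece is in IKVM's class library and not in PDFBox or Tika.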