We get the "subprocess" error below when running our code in a container. On our local machine, however, it works fine. We installed Tabula on the local machine a year ago. It also worked in the container until this week. The PDFs that fail are attached, and the package versions are listed below. Could the PDF files be the cause, even though the same files are processed correctly on the local machine with the same versions? Or is it the environment? We checked, and there have been no changes to the environment, permissions, etc.
PDFs:
IONIS Registartion document (002).pdf
test_Vinayak.pdf
[Uploading Annual_Report.pdf…]()
Package Versions:
(llms) dd00740409@ns3067540:~$ java -version
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
(llms) dd00740409@ns3067540:~$ python
Python 3.8.17 | packaged by conda-forge | (default, Jun 16 2023, 07:06:00)
[GCC 11.4.0] on linux
Error:
subprocess.CalledProcessError: Command '['java', '-Dfile.encoding=UTF8', '-jar', '/usr/local/lib/python3.8/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--pages', '9', '--stream', '--guess', '--format', 'JSON', 'Roa8dvYUVmHQLKhhvTiPL.pdf']' returned non-zero exit status 1.
Logs:
Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libjavajpeg.so: libjpeg.so.8: cannot open shared object file: No such file or directory
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1934)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1838)
    at java.lang.Runtime.loadLibrary0(Runtime.java:843)
    at java.lang.System.loadLibrary(System.java:1136)
    at com.sun.imageio.plugins.jpeg.JPEGImageReader$1.run(JPEGImageReader.java:92)
    at com.sun.imageio.plugins.jpeg.JPEGImageReader$1.run(JPEGImageReader.java:90)
    at java.security.AccessController.doPrivileged(Native Method)
    at com.sun.imageio.plugins.jpeg.JPEGImageReader.<clinit>(JPEGImageReader.java:89)
    at com.sun.imageio.plugins.jpeg.JPEGImageReaderSpi.createReaderInstance(JPEGImageReaderSpi.java:85)
    at javax.imageio.spi.ImageReaderSpi.createReaderInstance(ImageReaderSpi.java:320)
    at javax.imageio.ImageIO$ImageReaderIterator.next(ImageIO.java:529)
    at javax.imageio.ImageIO$ImageReaderIterator.next(ImageIO.java:513)
    at org.apache.pdfbox.filter.Filter.findImageReader(Filter.java:155)
    at org.apache.pdfbox.filter.DCTFilter.decode(DCTFilter.java:58)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:80)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:243)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream(PDImageXObject.java:791)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit(SampledImageReader.java:517)
    at org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage(SampledImageReader.java:226)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:481)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.getImage(PDImageXObject.java:462)
    at org.apache.pdfbox.rendering.PageDrawer.drawImage(PageDrawer.java:1110)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:67)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:933)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:514)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:277)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:347)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:268)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:254)
    at technology.tabula.Utils.pageConvertToImage(Utils.java:285)
    at technology.tabula.detectors.NurminenDetectionAlgorithm.detect(NurminenDetectionAlgorithm.java:101)
    at technology.tabula.CommandLineApp$TableExtractor.extractTablesBasic(CommandLineApp.java:421)
    at technology.tabula.CommandLineApp$TableExtractor.extractTables(CommandLineApp.java:408)
    at technology.tabula.CommandLineApp.extractFile(CommandLineApp.java:180)
    at technology.tabula.CommandLineApp.extractFileTables(CommandLineApp.java:124)
    at technology.tabula.CommandLineApp.extractTables(CommandLineApp.java:106)
    at technology.tabula.CommandLineApp.main(CommandLineApp.java:76)
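
The trace fails while the JVM loads `libjavajpeg.so`, which in turn needs `libjpeg.so.8`. To compare the container with the local machine, a small check like the following might help. It is only a sketch: the helper name `can_load_shared_library` is made up for illustration, and it assumes that Python's `ctypes` resolves shared libraries through the same system linker paths the JVM uses, which may not hold in every setup.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic linker can find and load `name`.

    Hypothetical helper for diagnosis only; it is not part of tabula-py.
    """
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library (or one of its dependencies) is not found.
        return False


if __name__ == "__main__":
    # Library name taken from the UnsatisfiedLinkError message above.
    lib = "libjpeg.so.8"
    status = "found" if can_load_shared_library(lib) else "MISSING"
    print(f"{lib}: {status}")
```

Running this in both environments should show whether `libjpeg.so.8` is visible to the dynamic linker in each one.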