MyMonsterCat / RapidOcr-Java

🔥🔥🔥 Java code for calling RapidOCR (based on PaddleOCR); works on Mac, Windows, and Linux and supports the latest PP-OCRv4
Apache License 2.0
212 stars 29 forks

Docker image test #7

Open nn200433 opened 8 months ago

nn200433 commented 8 months ago

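After the refactor I directly pulled in the jar that the maintainer published to Maven Central, and I finally got one of them to run successfully 😂

Docker image: nn200433/tika-server

Docker container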

root@6a1aad1d4bf6:/# cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

ONNX

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::               (v2.7.16)

2023-11-17 08:44:10.416 INFO  [main] [TRACE_ID:] [StartupInfoLogger.java:55] : Starting TikaApiApplication v0.1 using Java 1.8.0_342 on 6a1aad1d4bf6 with PID 6 (/home/app.jar started by root in /) 
2023-11-17 08:44:10.426 DEBUG [main] [TRACE_ID:] [StartupInfoLogger.java:56] : Running with Spring Boot v2.7.16, Spring v5.3.30 
2023-11-17 08:44:10.427 INFO  [main] [TRACE_ID:] [SpringApplication.java:631] : No active profile set, falling back to 1 default profile: "default" 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app.jar!/BOOT-INF/lib/log4j-slf4j-impl-2.20.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app.jar!/BOOT-INF/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-11-17 08:44:13.027 INFO  [main] [TRACE_ID:] [StartupInfoLogger.java:61] : Started TikaApiApplication in 3.382 seconds (JVM running for 4.885) 
2023-11-17 08:44:13.030 INFO  [main] [TRACE_ID:] [InitListener.java:28] : ---> 项目启动完毕, 准备初始化... 
2023-11-17 08:44:13.130 INFO  [main] [TRACE_ID:] [PaddleOCRParser.java:81] : ---> 当前 使用模型 ONNX 
2023-11-17 08:44:14.583 INFO  [main] [TRACE_ID:] [InitListener.java:31] : ---> 初始化完成.... 
2023-11-17 08:46:37.995 DEBUG [tika-1] [TRACE_ID:] [AttachmentServiceImpl.java:102] : ---> 上文文件路径:/tmp/tika/upload/20231117084637-7674962236962559383.docx 
2023-11-17 08:46:38.726 WARN  [tika-1] [TRACE_ID:] [ResourceUtils.java:99] : Couldn't get resource: docx4j.properties 
2023-11-17 08:46:38.726 WARN  [tika-1] [TRACE_ID:] [Docx4jProperties.java:24] : Couldn't find/read docx4j.properties; docx4j.properties not found via classloader. 
2023-11-17 08:46:43.684 WARN  [tika-1] [TRACE_ID:] [XmlUtils.java:214] : actual SAXParserFactory: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl 
2023-11-17 08:46:43.684 WARN  [tika-1] [TRACE_ID:] [XmlUtils.java:269] : actual DocumentBuilderFactory: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl 
2023-11-17 08:46:43.848 DEBUG [tika-1] [TRACE_ID:] [Docx4jUtil.java:120] : ---> fileName = /tmp/tika/docx4j-images/image1.png mimeType = image/png PartName = /word/media/image1.png 
2023-11-17 08:46:43.853 DEBUG [tika-1] [TRACE_ID:] [Docx4jUtil.java:120] : ---> fileName = /tmp/tika/docx4j-images/image2.png mimeType = image/png PartName = /word/media/image2.png 
OpenJDK 64-Bit Server VM warning: You have loaded library /tmpocrJava/onnx/libRapidOcr.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2023-11-17 08:46:45.970 DEBUG [tika-1] [TRACE_ID:] [AttachmentServiceImpl.java:106] : ---> Tika 解析完成,即将返回...... 

NCNN

  .   ____          _            __ _ _
 /\\ / ___'_ __ _ _(_)_ __  __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| |  ) ) ) )
  '  |____| .__|_| |_|_| |_\__, | / / / /
 =========|_|==============|___/=/_/_/_/
 :: Spring Boot ::               (v2.7.16)

2023-11-17 08:50:03.659 INFO  [main] [TRACE_ID:] [StartupInfoLogger.java:55] : Starting TikaApiApplication v0.1 using Java 1.8.0_342 on 9bd70532e16f with PID 6 (/home/app.jar started by root in /) 
2023-11-17 08:50:03.669 DEBUG [main] [TRACE_ID:] [StartupInfoLogger.java:56] : Running with Spring Boot v2.7.16, Spring v5.3.30 
2023-11-17 08:50:03.670 INFO  [main] [TRACE_ID:] [SpringApplication.java:631] : No active profile set, falling back to 1 default profile: "default" 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/app.jar!/BOOT-INF/lib/log4j-slf4j-impl-2.20.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/app.jar!/BOOT-INF/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2023-11-17 08:50:06.167 INFO  [main] [TRACE_ID:] [StartupInfoLogger.java:61] : Started TikaApiApplication in 3.386 seconds (JVM running for 4.925) 
2023-11-17 08:50:06.170 INFO  [main] [TRACE_ID:] [InitListener.java:28] : ---> 项目启动完毕, 准备初始化... 
2023-11-17 08:50:06.268 INFO  [main] [TRACE_ID:] [PaddleOCRParser.java:81] : ---> 当前 使用模型 NCNN 
2023-11-17 08:50:07.573 INFO  [main] [TRACE_ID:] [InitListener.java:31] : ---> 初始化完成.... 
2023-11-17 08:50:11.672 DEBUG [tika-1] [TRACE_ID:] [AttachmentServiceImpl.java:102] : ---> 上文文件路径:/tmp/tika/upload/20231117085011-204702443328821250.docx 
2023-11-17 08:50:12.434 WARN  [tika-1] [TRACE_ID:] [ResourceUtils.java:99] : Couldn't get resource: docx4j.properties 
2023-11-17 08:50:12.434 WARN  [tika-1] [TRACE_ID:] [Docx4jProperties.java:24] : Couldn't find/read docx4j.properties; docx4j.properties not found via classloader. 
2023-11-17 08:50:17.269 WARN  [tika-1] [TRACE_ID:] [XmlUtils.java:214] : actual SAXParserFactory: com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl 
2023-11-17 08:50:17.270 WARN  [tika-1] [TRACE_ID:] [XmlUtils.java:269] : actual DocumentBuilderFactory: com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl 
2023-11-17 08:50:17.427 DEBUG [tika-1] [TRACE_ID:] [Docx4jUtil.java:120] : ---> fileName = /tmp/tika/docx4j-images/image1.png mimeType = image/png PartName = /word/media/image1.png 
2023-11-17 08:50:17.432 DEBUG [tika-1] [TRACE_ID:] [Docx4jUtil.java:120] : ---> fileName = /tmp/tika/docx4j-images/image2.png mimeType = image/png PartName = /word/media/image2.png 
java: symbol lookup error: /tmpocrJava/ncnn/libRapidOcr.so: undefined symbol: _ZN2cv6imreadERKSsi
tika-server exited with code 127
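
A note on reading the NCNN failure above. In the Itanium C++ ABI, `_ZN2cv6imreadERKSsi` is the mangled name of `cv::imread(std::string const&, int)`, and the `Ss` abbreviation refers to `std::string` under the pre-C++11 libstdc++ ABI. One plausible reading (an assumption, not confirmed anywhere in this thread) is that the NCNN build of `libRapidOcr.so` expects an OpenCV built with the old ABI, and the container either provides no OpenCV or one that only exports the `__cxx11` variant. The sketch below is a hypothetical helper, not part of RapidOcr-Java: it extracts undefined symbol names from loader output and applies a rough heuristic for old-ABI `std::string` usage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; the class and method names are illustrative only. */
final class UndefinedSymbolCheck {

    // Matches the "undefined symbol: NAME" fragment that the dynamic loader prints.
    private static final Pattern UNDEFINED =
            Pattern.compile("undefined symbol:\\s*([A-Za-z0-9_.$@]+)");

    private UndefinedSymbolCheck() {
    }

    /** Returns every symbol name reported as undefined in the given loader output. */
    static List<String> undefinedSymbols(String loaderOutput) {
        List<String> symbols = new ArrayList<>();
        Matcher m = UNDEFINED.matcher(loaderOutput);
        while (m.find()) {
            symbols.add(m.group(1));
        }
        return symbols;
    }

    /**
     * Heuristic only: a mangled C++ name that contains "Ss" and no "__cxx11"
     * probably takes or returns an old-ABI std::string. A plain substring test
     * can give false positives for identifiers that happen to contain "Ss".
     */
    static boolean usesOldStdStringAbi(String mangled) {
        return mangled.startsWith("_Z")
                && mangled.contains("Ss")
                && !mangled.contains("__cxx11");
    }

    public static void main(String[] args) {
        String line = "java: symbol lookup error: /tmpocrJava/ncnn/libRapidOcr.so: "
                + "undefined symbol: _ZN2cv6imreadERKSsi";
        for (String symbol : undefinedSymbols(line)) {
            System.out.println(symbol + " oldStdStringAbi=" + usesOldStdStringAbi(symbol));
        }
    }
}
```

The same parser should also work on the output of `ldd -r` run against the library inside the container, which reports missing symbols without having to start the application.
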
MyMonsterCat commented 8 months ago

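The earlier failures to run still came down to the versions of gcc, glibc, and related libraries. If you want to run on CentOS 7, upgrading those dependencies is enough.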

nn200433 commented 8 months ago

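> The earlier failures to run still came down to the versions of gcc, glibc, and related libraries. If you want to run on CentOS 7, upgrading those dependencies is enough.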

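I'm not on CentOS anymore; this is a Docker container... For the dependency libraries I still need a tutorial, otherwise working it out on my own is too confusing. Looking forward to a tutorial and a container that runs perfectly 😂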

nn200433 commented 7 months ago

I wrapped this into a spring-boot version (code). Feel free to copy, paste, and modify it if you need it 😄

MyMonsterCat commented 7 months ago

> I wrapped this into a spring-boot version (code). Feel free to copy, paste, and modify it if you need it 😄

OK, I'll take a look when I have time.