keensoft / alfresco-simple-ocr

Simple OCR action for Alfresco

"Transformer Not Called" in logs when document is updated in a folder with Extract OCR rule. #62

Closed syedjunaidihussain closed 4 years ago

syedjunaidihussain commented 4 years ago

Hi, I am using Alfresco Community 6.0 (Dockerized). I deployed both JARs by copying them into the repo's and Share's WEB-INF/lib directories. After deployment the Extract OCR action was available, but when I upload a document to the folder where the rule is defined and then check the logs, OCR doesn't find any transformer.

Here are the logs

2019-12-06 10:26:24,788 INFO [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [http-nio-8080-exec-2] Starting 'Transformers' subsystem, ID: [Transformers, default]
2019-12-06 10:26:25,260 INFO [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [http-nio-8080-exec-2] Startup of 'Transformers' subsystem, ID: [Transformers, default] complete
2019-12-06 10:26:25,463 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 0 tiff png ccitt_1.tif 17.9 KB -- doclib -- ContentService.transform(...)
2019-12-06 10:26:25,465 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 0 workspace://SpacesStore/10c416f7-e6fe-49c8-9a0b-4eed04dd55b1
2019-12-06 10:26:25,465 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 0 **a) [100] ImageMagick<> 0 ms
2019-12-06 10:26:25,466 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 0.1 tiff png ccitt_1.tif 17.9 KB ImageMagick<>
2019-12-06 10:26:25,619 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 0 Finished in 158 ms

2019-12-06 10:26:50,121 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 1 tiff txt ccitt_1.tif 17.9 KB -- index -- SolrIndexer NO transformers
2019-12-06 10:26:50,123 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 1 workspace://SpacesStore/10c416f7-e6fe-49c8-9a0b-4eed04dd55b1
2019-12-06 10:26:50,124 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 1 Finished in 5 ms Transformer NOT called
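
In a TransformerDebug dump like the one above, the entries worth looking at first are those ending in `NO transformers` or `Transformer NOT called`. The following is a minimal sketch, not part of alfresco-simple-ocr, for pulling such entries out of a pasted log. It assumes every entry starts with a `yyyy-MM-dd HH:mm:ss,SSS` timestamp, as in the logs in this thread; the function names are hypothetical. A flagged entry is not necessarily an error, so treat the result as a list of places to inspect.

```python
import re

# Assumption: each log entry begins with a timestamp such as "2019-12-06 10:26:50,121".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")

# Marker strings taken verbatim from the logs quoted in this thread.
MARKERS = ("NO transformers", "Transformer NOT called")


def split_entries(text):
    """Split a log dump into entries, even if several entries share one line."""
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, ends)]


def flagged_entries(text):
    """Return the entries that contain one of the marker strings."""
    return [e for e in split_entries(text) if any(m in e for m in MARKERS)]
```

Calling `flagged_entries(open("alfresco.log").read())` would list the flagged entries; the file name `alfresco.log` is only an example.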

I am using pdfsandwich 0.1.6, ImageMagick 7.0.7-27 Q16 x86_64, and Tesseract 3.05.

syedjunaidihussain commented 4 years ago

I resolved the problem mentioned above. It was happening because I didn't have any language packages in Tesseract's "tessdata" directory; the other problem was that I was uploading a TIF document. Now I'm facing another problem: my PDF document is converted to text but is not uploaded to Alfresco.
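
The missing language packages described above can be checked for before uploading anything. This is a minimal sketch, assuming Tesseract language data is stored as `<lang>.traineddata` files directly inside the "tessdata" directory; the function name is hypothetical and the directory location depends on the installation.

```python
from pathlib import Path


def installed_tesseract_languages(tessdata_dir):
    """Return the sorted language codes that have a .traineddata file in tessdata_dir.

    An empty list means Tesseract has no language data to work with there
    (or the directory does not exist).
    """
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    # "eng.traineddata" -> "eng", "chi_sim.traineddata" -> "chi_sim"
    return sorted(p.stem for p in directory.glob("*.traineddata"))
```

For example, `installed_tesseract_languages("/usr/share/tesseract-ocr/tessdata")` could be run inside the container; that path is an assumption and should be replaced with the actual "tessdata" location.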

Here are the logs:

2019-12-10 12:22:59,036 INFO [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [http-nio-8080-exec-7] Starting 'Transformers' subsystem, ID: [Transformers, default]
2019-12-10 12:22:59,401 INFO [org.alfresco.repo.management.subsystems.ChildApplicationContextFactory] [http-nio-8080-exec-7] Startup of 'Transformers' subsystem, ID: [Transformers, default] complete
2019-12-10 12:22:59,407 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 store://2019/12/10/12/22/318d7a48-666d-4ee0-a0c2-c383a346210e.bin
2019-12-10 12:22:59,408 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 application/pdf image/png
2019-12-10 12:22:59,409 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 pdf png mydoc.pdf 86.9 KB -- doclib -- ContentService.getTransformer(...)
2019-12-10 12:22:59,409 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 workspace://SpacesStore/ba79a528-952c-4f02-8e2a-8ee5fb2f8ad2
2019-12-10 12:22:59,414 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 **a) [50] alfresco-pdf-renderer<> 0 ms
2019-12-10 12:22:59,418 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 b) [60] complex.PDF.Image<> 0 ms
2019-12-10 12:22:59,418 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 c) [100] ImageMagick<> 0 ms
2019-12-10 12:22:59,419 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-7] 0 Finished in 383 ms Transformer NOT called

2019-12-10 12:22:59,567 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 store://2019/12/10/12/22/318d7a48-666d-4ee0-a0c2-c383a346210e.bin
2019-12-10 12:22:59,571 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 application/pdf image/png
2019-12-10 12:22:59,572 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 pdf png mydoc.pdf 86.9 KB -- doclib -- ContentService.getTransformer(...)
2019-12-10 12:22:59,572 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 workspace://SpacesStore/ba79a528-952c-4f02-8e2a-8ee5fb2f8ad2
2019-12-10 12:22:59,573 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 **a) [50] alfresco-pdf-renderer<> 0 ms
2019-12-10 12:22:59,576 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 b) [60] complex.PDF.Image<> 0 ms
2019-12-10 12:22:59,577 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 c) [100] ImageMagick<> 0 ms
2019-12-10 12:22:59,578 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 1 Finished in 17 ms Transformer NOT called

2019-12-10 12:22:59,605 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 store://2019/12/10/12/22/318d7a48-666d-4ee0-a0c2-c383a346210e.bin
2019-12-10 12:22:59,612 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 application/pdf image/png
2019-12-10 12:22:59,613 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 pdf png mydoc.pdf 86.9 KB -- doclib -- ContentService.getTransformer(...)
2019-12-10 12:22:59,613 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 workspace://SpacesStore/ba79a528-952c-4f02-8e2a-8ee5fb2f8ad2
2019-12-10 12:22:59,613 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 **a) [50] alfresco-pdf-renderer<> 0 ms
2019-12-10 12:22:59,614 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 b) [60] complex.PDF.Image<> 0 ms
2019-12-10 12:22:59,614 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 c) [100] ImageMagick<> 0 ms
2019-12-10 12:22:59,615 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [defaultAsyncAction3] 2 Finished in 14 ms Transformer NOT called

2019-12-10 12:22:59,622 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 store://2019/12/10/12/22/318d7a48-666d-4ee0-a0c2-c383a346210e.bin
2019-12-10 12:22:59,623 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 application/pdf image/png
2019-12-10 12:22:59,624 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 pdf png mydoc.pdf 86.9 KB -- doclib -- ContentService.transform(...)
2019-12-10 12:22:59,624 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 workspace://SpacesStore/ba79a528-952c-4f02-8e2a-8ee5fb2f8ad2
2019-12-10 12:22:59,626 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 **a) [50] alfresco-pdf-renderer<> 0 ms
2019-12-10 12:22:59,627 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 b) [60] complex.PDF.Image<> 0 ms
2019-12-10 12:22:59,627 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 c) [100] ImageMagick<> 0 ms
2019-12-10 12:22:59,628 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3.1 store://2019/12/10/12/22/318d7a48-666d-4ee0-a0c2-c383a346210e.bin
2019-12-10 12:22:59,628 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3.1 application/pdf image/png
2019-12-10 12:22:59,628 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3.1 pdf png mydoc.pdf 86.9 KB alfresco-pdf-renderer<>
2019-12-10 12:23:00,022 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3.1 Finished in 394 ms
2019-12-10 12:23:00,038 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [pool-14-thread-1] 3 Finished in 402 ms

2019-12-10 12:23:10,120 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 store://2019/12/10/11/53/acada5c4-4efa-4170-97f5-e29c3bbe1619.bin
2019-12-10 12:23:10,121 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 application/pdf text/plain
2019-12-10 12:23:10,121 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 pdf txt domicile.pdf 230.5 KB -- index -- SolrIndexer
2019-12-10 12:23:10,122 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 archive://SpacesStore/eb590976-4611-402a-b033-ad634c338fd1
2019-12-10 12:23:10,122 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 **a) [50] PdfBox < 25 MB 0 ms
2019-12-10 12:23:10,122 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 b) [120] TikaAuto < 25 MB 0 ms
2019-12-10 12:23:10,124 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4.1 store://2019/12/10/11/53/acada5c4-4efa-4170-97f5-e29c3bbe1619.bin
2019-12-10 12:23:10,124 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4.1 application/pdf text/plain
2019-12-10 12:23:10,124 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4.1 pdf txt domicile.pdf 230.5 KB PdfBox
2019-12-10 12:23:10,231 TRACE [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4.1 Finished in 108 ms
2019-12-10 12:23:10,236 DEBUG [org.alfresco.repo.content.transform.TransformerDebug] [http-nio-8080-exec-4] 4 Finished in 117 ms