sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software for processing and analyzing digital evidence, often seized at crime scenes by law enforcement or collected in corporate investigations by private examiners.

Support for tesseract 5.4 #2241

Closed by aberenguel 3 weeks ago

aberenguel commented 3 weeks ago

I upgraded Tesseract on my machine (detected version: 5.4.0-rc2-17-g3469).

When running the tests with Maven, some of them failed (log output level set to ERROR):

15:33:07.674 [Thread-581] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserTIFF     Estimating resolution as 522
15:33:10.313 [Thread-583] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserTIFF     Estimating resolution as 522
15:33:12.044 [Thread-585] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserPDF      Estimating resolution as 399
15:33:13.457 [Thread-587] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserPNG      Estimating resolution as 234
15:33:14.971 [Thread-589] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserPNG      Estimating resolution as 234
15:33:19.040 [Thread-591] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserPSD      Estimating resolution as 727
15:33:21.088 [Thread-593] ERROR iped.parsers.ocr.OCRParser - OCR msg from testOCRParserSVG      Estimating resolution as 871

...

Tests in error: 
  testOCRParserTIFF(iped.parsers.ocr.OCRParserTest): tesseract returned error code 136
  testOCRParserPDF(iped.parsers.ocr.OCRParserTest): tesseract returned error code 136
  testOCRParserPNG(iped.parsers.ocr.OCRParserTest): tesseract returned error code 136
  testOCRParserPSD(iped.parsers.ocr.OCRParserTest): tesseract returned error code 136
  testOCRParserSVG(iped.parsers.ocr.OCRParserTest): tesseract returned error code 136
aberenguel commented 3 weeks ago

I think it is a bug in Tesseract. Running the command directly in a shell gives:

Estimating resolution as 234
Floating point exception (core dumped)
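
The error code 136 reported by the tests is consistent with this crash. By common POSIX shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), and 136 - 128 = 8, which is SIGFPE (floating point exception) on Linux. The snippet below is only an illustrative sketch of that decoding; `describe_exit_code` is a hypothetical helper and is not part of IPED or Tesseract.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit status using the 128 + signal-number convention.

    Hypothetical helper for illustration only. It assumes the status comes
    from a shell or launcher that reports signal deaths as 128 + signum.
    """
    if code > 128:
        signum = code - 128
        try:
            # Map the signal number to its symbolic name on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"unknown signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    # The status reported in the failing tests above.
    print(describe_exit_code(136))
```

Read this way, the failing tests and the shell run point to the same crash inside the tesseract process rather than to an ordinary error return.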
aberenguel commented 3 weeks ago

I just reverted Tesseract to version 5.3.4, and it is working again.