Open franclinbarros opened 9 years ago
Tika, the library Autopsy uses to extract text from files before indexing them, recently added OCR support for images and PDFs, using Tesseract. Autopsy needs to update Tika, embed Tesseract, and add some configuration to tell Tika the Tesseract path.
This would be a great addition to Autopsy, especially since commercial tools do this and, as far as I know, no open-source project does so far. My company uses commercial software alongside Autopsy only because of the OCR functionality.
We should have an ingest module that does OCR using engines such as Tesseract.
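To make the idea concrete, here is a minimal sketch of the OCR step such a module could wrap. It is not Autopsy's or Tika's actual implementation: it just calls the Tesseract command-line tool directly (`tesseract <image> stdout -l <lang>`) from plain Java. The class name `TesseractOcrSketch`, the method names, and the timeout handling are all hypothetical, and it assumes a Tesseract build that accepts `stdout` as the output base.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Autopsy or Tika.
class TesseractOcrSketch {

    // Builds the command line: tesseract <image> stdout -l <language>
    static List<String> buildCommand(Path tesseractExe, Path image, String language) {
        List<String> cmd = new ArrayList<>();
        cmd.add(tesseractExe.toString());
        cmd.add(image.toString());
        cmd.add("stdout");   // ask Tesseract to print the recognized text
        cmd.add("-l");
        cmd.add(language);   // e.g. "eng"
        return cmd;
    }

    // Runs Tesseract on one image and returns the recognized text.
    static String extractText(Path tesseractExe, Path image, String language,
                              long timeoutSeconds)
            throws IOException, InterruptedException {
        // Capture stdout in a temp file so a hung process cannot block a read.
        Path out = Files.createTempFile("ocr-", ".txt");
        try {
            Process process = new ProcessBuilder(buildCommand(tesseractExe, image, language))
                    .redirectOutput(out.toFile())
                    .redirectError(ProcessBuilder.Redirect.INHERIT)
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                throw new IOException("Tesseract timed out on " + image);
            }
            if (process.exitValue() != 0) {
                throw new IOException("Tesseract failed on " + image
                        + " (exit code " + process.exitValue() + ")");
            }
            return new String(Files.readAllBytes(out), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(out);
        }
    }
}
```

An ingest module could call something like `extractText(Paths.get("/path/to/tesseract"), imagePath, "eng", 60)` for each image file and hand the returned text to the keyword-search indexer; the path and timeout shown here are placeholders. Going through Tika instead would avoid the separate process handling, at the cost of depending on a Tika version that ships the OCR parser.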