sepinf-inc / IPED

IPED Digital Forensic Tool. It is open source software for processing and analyzing digital evidence, often seized at crime scenes by law enforcement or collected in corporate investigations by private examiners.

Make Wav2Vec2 transcription compatible with new versions of huggingsound library #2021

Closed pitanga closed 7 months ago

pitanga commented 7 months ago

Team, the error below occurs in version 4.1.5 and does not occur in version 4.1.4.

2023-12-07 12:34:07 [INFO] [engine.core.Statistics] Java Command: iped.app.processing.Main -d C:\IPED\Audios\ -o D:\ANALISE\
2023-12-07 12:34:10 [INFO] [engine.core.Manager] Evidence 1: 'C:\IPED\Audios'
2023-12-07 12:34:10 [INFO] [engine.core.Manager] Creating index: D:\ANALISE\iped\index
2023-12-07 12:34:10 [INFO] [engine.lucene.ConfiguredFSDirectory] Using MMapDirectory to open index...
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting Tika
Warning: Nashorn engine is planned to be removed from a future JDK release
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting SkipCommitedTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting IgnoreHardLinkTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting TempFileTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting HashTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting SignatureTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting SetTypeTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting SetCategoryTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting RefineCategoryTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting HashDBLookupTask
2023-12-07 12:34:10 [INFO] [engine.task.HashDBLookupTask] NSRL product configurations loaded: 177
2023-12-07 12:34:10 [INFO] [engine.task.HashDBLookupTask] HashDB: C:\IPED\iped-hashes\iped-hashes.db
2023-12-07 12:34:10 [INFO] [engine.task.HashDBLookupTask] Exclude Known: true
2023-12-07 12:34:10 [INFO] [engine.task.HashDBLookupTask] Task enabled.
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting DuplicateTask
2023-12-07 12:34:10 [INFO] [engine.core.Worker] Starting AudioTranscriptTask
2023-12-07 12:34:32 [MSG] [task.transcript.Wav2Vec2TranscriptTask$1] 12/07/2023 12:34:32 - INFO - huggingsound.speech_recognition.model - Loading model...
2023-12-07 12:34:33 [INFO] [task.transcript.Wav2Vec2TranscriptTask] Number of CUDA devices detected: 0
2023-12-07 12:34:33 [INFO] [task.transcript.Wav2Vec2TranscriptTask] Number of CPU devices detected: 2
iped.exception.IPEDException: Error loading 'jonatasgrosman/wav2vec2-xls-r-1b-portuguese' transcription model.
    at iped.engine.task.transcript.Wav2Vec2TranscriptTask.startServer(Wav2Vec2TranscriptTask.java:146)
    at iped.engine.task.transcript.Wav2Vec2TranscriptTask.init(Wav2Vec2TranscriptTask.java:92)
    at iped.engine.task.transcript.AudioTranscriptTask.init(AudioTranscriptTask.java:27)
    at iped.engine.core.Worker.initTasks(Worker.java:128)
    at iped.engine.core.Worker.<init>(Worker.java:110)
    at iped.engine.core.Manager.initWorkers(Manager.java:543)
    at iped.engine.core.Manager.process(Manager.java:265)
    at iped.app.processing.Main.startManager(Main.java:178)
    at iped.app.processing.Main.execute(Main.java:240)
    at iped.app.processing.Main.main(Main.java:308)
2023-12-07 12:34:43 [ERROR] [app.processing.Main] Processing Error: Error loading 'jonatasgrosman/wav2vec2-xls-r-1b-portuguese' transcription model.

IPEDConfig.txt LocalConfig.txt AudioTranscriptConfig.txt

In addition, I ran:

.\python.exe get-pip.py
.\python.exe -m pip install huggingsound

If you need more information, let me know.
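
Because the pip command above installs whichever huggingsound release is current, it can help to record the version that was actually resolved when reporting a problem like this one. The snippet below is a minimal sketch for doing that with the same python.exe; the helper name `installed_version` is hypothetical, and listing `transformers` and `torch` alongside huggingsound is an assumption about what is typically installed with it.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package list is an assumption; adjust it to whatever the environment uses.
    for name in ("huggingsound", "transformers", "torch"):
        print(name, installed_version(name) or "not installed")
```

Saved under any file name, it can be run with the same interpreter used for the pip install, and the printed versions can be pasted into the report.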

lfcnassif commented 7 months ago

Please test the latest snapshot: https://github.com/sepinf-inc/IPED/suites/18790674772/artifacts/1095171448

In commit 4a0a4f866fa606277917d10b8bc973d1b74a1458 I fixed an issue caused by new messages that newer versions of the huggingsound library print to stdout, which were triggering an error like yours.
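
The commit is not reproduced here. As a generic illustration of this class of failure, the sketch below shows a parent process that reads a child's stdout and ignores unrelated lines until it sees an expected handshake marker, so extra messages from a library do not break startup. The marker strings `MODEL_LOADED` and `MODEL_LOAD_ERROR` and the function names are hypothetical and are not taken from IPED's code.

```python
import subprocess
import sys
from typing import Iterable, List

READY = "MODEL_LOADED"        # hypothetical handshake marker
FAILED = "MODEL_LOAD_ERROR"   # hypothetical failure marker


def wait_for_ready(lines: Iterable[str]) -> List[str]:
    """Consume lines until READY is seen; return the unrelated lines skipped.

    Raises RuntimeError if FAILED is seen or the stream ends first.
    """
    skipped = []
    for raw in lines:
        line = raw.strip()
        if line == READY:
            return skipped
        if line == FAILED:
            raise RuntimeError("child reported a model loading error")
        # Anything else (banners, log messages) is tolerated and skipped.
        skipped.append(line)
    raise RuntimeError("child exited before signalling readiness")


def demo() -> List[str]:
    """Spawn a child that prints noise before the marker, then wait for it."""
    child_code = (
        "print('some new library banner')\n"
        "print('Loading model...')\n"
        f"print('{READY}')\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        return wait_for_ready(proc.stdout)
    finally:
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    print(demo())
```

A reader that instead expects the first line to be the marker would fail as soon as a dependency starts printing something new, which is the kind of breakage described above.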

lfcnassif commented 7 months ago

I'm closing this as fixed. If the provided snapshot aborts with the same error, please reopen this ticket.