sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Audios not retried and skipped if specific errors happen in remote transcription service #1942

Closed lfcnassif closed 10 months ago

lfcnassif commented 10 months ago

A user reported that the transcription service was not working last week and sent a processing log with the following errors:

2023-10-17 09:00:45 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: 'huggingsound' python lib not loaded correctly. Have you installed it?.
2023-10-17 09:01:13 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: 'huggingsound' python lib not loaded correctly. Have you installed it?.
2023-10-17 09:01:13 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:17 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:18 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:18 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:18 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:18 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..
2023-10-17 09:01:19 [WARN]  [task.transcript.RemoteWav2Vec2TranscriptTask]          Fail to transcribe on server: XXX  audio: XXX.opus error: Exception while transcribing: java.lang.RuntimeException: iped.exception.IPEDException: Error loading '/home/transcript/.cache/huggingface/hub/models--jonatasgrosman--wav2vec2-xls-r-1b-portuguese/snapshots/8926743abe7e95bb81b64305cb3c5fa85173f6b0' transcription model..

The errors above should never happen in the transcription service, because a node is added to the cluster only after the model has been loaded correctly on the GPU. But @hauck-jvsh reported that sometimes the nodes' Linux kernel (or GPU driver?) is updated automatically, the nodes are not restarted, and the service keeps running, returning the errors above. @hauck-jvsh, please try disabling automatic updates.

Anyway, if the errors above are returned, we could add some protection in the client code to remove the node from the cluster set and print a warning in the console, so clients could notify the service maintainers about problematic nodes. The affected audios should then be retried on another node instead of being skipped.
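
A rough sketch of what that client-side protection might look like. This is not the actual `RemoteWav2Vec2TranscriptTask` code: the class `TranscriptNodeGuard`, its methods and the node names are hypothetical, and matching on substrings of the error messages from the log above is only an assumption about how broken nodes could be detected.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper, not part of IPED: tracks the usable transcription
 * nodes and drops a node when it returns an error that points to a broken
 * server rather than to a bad audio file.
 */
class TranscriptNodeGuard {

    // Assumption: these substrings, copied from the log above, identify
    // errors caused by the node itself (python lib or model not loaded).
    private static final List<String> FATAL_NODE_ERRORS = List.of(
            "python lib not loaded correctly",
            "' transcription model");

    // Thread-safe set, since several worker threads may report errors.
    private final Set<String> activeNodes = ConcurrentHashMap.newKeySet();

    TranscriptNodeGuard(List<String> nodes) {
        activeNodes.addAll(nodes);
    }

    /** True if the message matches one of the known node-level failures. */
    static boolean isFatalNodeError(String errorMessage) {
        if (errorMessage == null) {
            return false;
        }
        for (String marker : FATAL_NODE_ERRORS) {
            if (errorMessage.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Handles an error returned by a node. Returns true if the audio should
     * be retried on another node, false if the error looks audio-specific.
     */
    boolean handleError(String node, String errorMessage) {
        if (!isFatalNodeError(errorMessage)) {
            return false;
        }
        // Warn only once, when the node is actually removed from the set.
        if (activeNodes.remove(node)) {
            System.err.println("[WARN] Transcription node " + node
                    + " removed from cluster set, please warn the service maintainers: "
                    + errorMessage);
        }
        return true;
    }

    /** Immutable snapshot of the nodes still considered usable. */
    Set<String> getActiveNodes() {
        return Set.copyOf(activeNodes);
    }
}
```

The caller would retry the audio on another node whenever `handleError` returns `true`, and would still need to decide what to do when `getActiveNodes()` becomes empty, for example waiting for nodes to come back or failing the task explicitly.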