elan-ev / vosk-cli

Apache License 2.0

Missing Dependencies #5

Closed gregorydlogan closed 2 years ago

gregorydlogan commented 2 years ago

In testing https://github.com/opencast/opencast/pull/3806, I installed vosk-cli per the current README. Functional testing with an Opencast workflow spat out this error:

2022-06-03 12:25:22,763 | ERROR | (AbstractJobProducer$JobRunner:343) - Error handling operation 'speechtotext':
org.opencastproject.speechtotext.api.SpeechToTextServiceException: Error while generating subtitle from http://localhost:8080/files/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:166) ~[?:?]                                     
        at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:313) [!/:?]                                                
        at org.opencastproject.job.api.AbstractJobProducer$JobRunner.call(AbstractJobProducer.java:272) [!/:?]                                                
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]                                                                                     
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]                                                              
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]                                                              
        at java.lang.Thread.run(Thread.java:829) [?:?]                                                                                                        
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
 Output:                                                                                                                                                      
Traceback (most recent call last):                                                                                                                            
  File "/home/greg/.local/bin/vosk-cli", line 5, in <module>                                                                                                  
    from scripts.transcribe import main                                                                                                                       
ModuleNotFoundError: No module named 'scripts'                                                                                                                

        at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:123) ~[?:?]                                          
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]                                     
        ... 6 more                                                                                                                                            
Caused by: org.opencastproject.speechtotext.api.SpeechToTextEngineException: Vosk exited abnormally with status 1 (command: [vosk-cli, -i, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/mediapackage/a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6/d5d24af2-d40c-4f39-b56f-91d81b5b9a0c/nonsegment_audio.mpg, -o, /home/greg/opencast/upstream/build/opencast-dist-allinone/data/opencast/workspace/collection/subtitles/tmp_1773_nonsegment_audio.vtt, -l, ara])
 Output:                                                                                                                                                      
Traceback (most recent call last):                                                                                                                            
  File "/home/greg/.local/bin/vosk-cli", line 5, in <module>                                                                                                  
    from scripts.transcribe import main                                                                                                                       
ModuleNotFoundError: No module named 'scripts'                                                                                                                

        at org.opencastproject.speechtotext.impl.engine.VoskEngine.generateSubtitlesFile(VoskEngine.java:115) ~[?:?]                                          
        at org.opencastproject.speechtotext.impl.SpeechToTextServiceImpl.process(SpeechToTextServiceImpl.java:156) ~[?:?]                                     
        ... 6 more                                                                                                                                            
2022-06-03 12:25:25,656 | ERROR | (WorkflowOperationWorker:140) - Workflow operation 'operation:'speechtotext, state:'FAILED'' failed                         
org.opencastproject.workflow.api.WorkflowOperationException: Speech-to-Text job for media package 'a378a5fe-9cfd-45e0-a9f0-69cdadbfbdb6' failed
        at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.createSubtitle(SpeechToTextWorkflowOperationHandler.java:181) ~[?:?]
        at org.opencastproject.workflow.handler.speechtotext.SpeechToTextWorkflowOperationHandler.start(SpeechToTextWorkflowOperationHandler.java:146) ~[?:?]
        at org.opencastproject.workflow.impl.WorkflowOperationWorker.start(WorkflowOperationWorker.java:212) ~[!/:?]
        at org.opencastproject.workflow.impl.WorkflowOperationWorker.execute(WorkflowOperationWorker.java:117) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl.runWorkflowOperation(WorkflowServiceImpl.java:719) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl.process(WorkflowServiceImpl.java:1736) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2097) [!/:?]
        at org.opencastproject.workflow.impl.WorkflowServiceImpl$JobRunner.call(WorkflowServiceImpl.java:2063) [!/:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]

Installing a package named scripts with pip install scripts does not resolve the issue.

lkiesow commented 2 years ago

How did you install vosk-cli?

gregorydlogan commented 2 years ago

pip install vosk, then pip install $deps, then pip install . in the clone.

lkiesow commented 2 years ago

I'm not quite sure what pip install . does. But scripts is the (arguably not that great) name of the vosk-cli package, which, it seems, is not properly installed. I would expect the installation process to be:

❯ pip install vosk webvtt-py setuptools
❯ python setup.py install

(and usually, I would run that in a virtual environment)
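
A quick way to see whether a given interpreter can find the package is to ask Python's import machinery directly. The following is only a minimal sketch: the helper name module_status is made up for illustration, and the module names are taken from the traceback and the install command above (webvtt is the import name of webvtt-py).

```python
import importlib.util
import sys


def module_status(name: str) -> str:
    """Report where `name` would be imported from.

    The module itself is not executed, but for a dotted name such as
    "scripts.transcribe" the parent package is imported by find_spec.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a module has no usable spec.
        spec = None
    if spec is None:
        return f"{name}: NOT importable by {sys.executable}"
    return f"{name}: found at {spec.origin}"


if __name__ == "__main__":
    # Module names assumed from the traceback and the dependency list above.
    for name in ("vosk", "webvtt", "scripts", "scripts.transcribe"):
        print(module_status(name))
```

Running this with the same interpreter that the vosk-cli launcher uses (the one named in the first line of the launcher script) should show whether scripts.transcribe is visible to it.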

gregorydlogan commented 2 years ago

pip install . is the first step in the readme :D

I skipped the venv because I wanted this set up for my normal user, but I can try and repro on Monday with a venv on another machine.

gregorydlogan commented 2 years ago

Interestingly, when I run the command from the logfile in a different window I get a correctly transcribed vtt file - no errors, no nothing. Since this works as expected when run by itself, I'm going to guess this is some kind of Opencast issue.
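
If the same command works in a terminal but fails when Opencast launches it, one thing worth comparing is the Python environment each process sees. This is only a sketch under that assumption, not a confirmed diagnosis; the function name python_environment is hypothetical, and how to get Opencast to run the script is left open here.

```python
import json
import os
import site
import sys


def python_environment() -> dict:
    """Collect the settings that decide where Python looks for modules."""
    watched = ("HOME", "PATH", "PYTHONPATH", "PYTHONHOME", "PYTHONNOUSERSITE")
    return {
        "executable": sys.executable,
        "cwd": os.getcwd(),
        # A per-user install (e.g. under ~/.local) lands in the user site directory.
        "user_site": site.getusersitepackages(),
        # ENABLE_USER_SITE may be True, False, or None; normalise to a bool.
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "sys_path": list(sys.path),
        "env": {key: os.environ.get(key) for key in watched},
    }


if __name__ == "__main__":
    json.dump(python_environment(), sys.stdout, indent=2)
    print()
```

Saving the output once from an interactive shell and once from the service's context and then diffing the two would show whether the interpreter, the user site directory, or sys.path differs between them.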