linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
GNU Affero General Public License v3.0

whisper_timestamped CLI blocks on a URL when run through the subprocess module #141

Closed dchapelet closed 1 year ago

dchapelet commented 1 year ago

Hi,

I have a new problem running the CLI from the subprocess module: whisper_timestamped gets stuck at the end (100%). Here is my code.

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'
result = subprocess.Popen(['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--detect_disfluencies', str(True), '--vad', str(True), '--verbose', str(False)], text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while True:
    line = result.stderr.readline()
    if not line:
        break
    print(line)

    current = line[:3].lstrip()
    if len(current) > 0 and current.isdigit():
        self.update_state(state='PROGRESS', meta={'current': int(current), 'total': 100})

self.update_state(state='COMPLETED', meta={'current': 100, 'total': 100})

[2023-11-18 13:46:46,354: WARNING/MainProcess] warnings.warn(
[2023-11-18 13:46:46,650: WARNING/MainProcess] Downloading: "https://github.com/snakers4/silero-vad/zipball/master" to /root/.cache/torch/hub/master.zip
[2023-11-18 13:47:02,256: WARNING/MainProcess] 0%| | 0/50969 [00:00<?, ?frames/s]
[2023-11-18 13:47:07,130: WARNING/MainProcess] 5%|▌ | 2800/50969 [00:04<01:09, 690.67frames/s]
[2023-11-18 13:47:11,983: WARNING/MainProcess] 11%|█ | 5700/50969 [00:08<01:11, 630.14frames/s]
[2023-11-18 13:47:16,589: WARNING/MainProcess] 17%|█▋ | 8500/50969 [00:13<01:10, 605.00frames/s]
[2023-11-18 13:47:21,913: WARNING/MainProcess] 22%|██▏ | 11200/50969 [00:18<01:06, 597.64frames/s]
[2023-11-18 13:47:26,937: WARNING/MainProcess] 27%|██▋ | 13900/50969 [00:23<01:05, 562.13frames/s]
[2023-11-18 13:47:31,617: WARNING/MainProcess] 33%|███▎ | 16700/50969 [00:28<01:01, 560.46frames/s]
[2023-11-18 13:47:36,446: WARNING/MainProcess] 38%|███▊ | 19600/50969 [00:33<00:54, 579.11frames/s]
[2023-11-18 13:47:42,008: WARNING/MainProcess] 44%|████▍ | 22500/50969 [00:38<00:48, 585.92frames/s]
[2023-11-18 13:47:46,888: WARNING/MainProcess] 50%|████▉ | 25400/50969 [00:43<00:45, 563.79frames/s]
[2023-11-18 13:47:52,120: WARNING/MainProcess] 56%|█████▌ | 28300/50969 [00:48<00:39, 572.98frames/s]
[2023-11-18 13:47:57,758: WARNING/MainProcess] 61%|██████ | 31100/50969 [00:53<00:35, 561.03frames/s]
[2023-11-18 13:48:02,851: WARNING/MainProcess] 67%|██████▋ | 34000/50969 [00:59<00:31, 545.81frames/s]
[2023-11-18 13:48:08,555: WARNING/MainProcess] 72%|███████▏ | 36900/50969 [01:04<00:25, 552.81frames/s]
[2023-11-18 13:48:13,449: WARNING/MainProcess] 78%|███████▊ | 39800/50969 [01:10<00:20, 538.51frames/s]
[2023-11-18 13:48:19,866: WARNING/MainProcess] 84%|████████▍ | 42700/50969 [01:15<00:14, 553.80frames/s]
[2023-11-18 13:48:24,537: WARNING/MainProcess] 90%|████████▉ | 45696/50969 [01:21<00:10, 523.69frames/s]
[2023-11-18 13:48:28,451: WARNING/MainProcess] 95%|█████████▍| 48384/50969 [01:26<00:04, 537.39frames/s]
[2023-11-18 13:48:28,453: WARNING/MainProcess] 100%|██████████| 50969/50969 [01:30<00:00, 567.01frames/s]
[2023-11-18 13:48:28,456: WARNING/MainProcess] 100%|██████████| 50969/50969 [01:30<00:00, 564.75frames/s]

Then whisper_timestamped stays blocked. Thank you for your help. Best regards, David.

Jeronymous commented 1 year ago

Hmm, relying on result.stderr.readline() to decide when to stop waiting looks awkward to me. I recommend using subprocess differently, with something like result.poll(), result.communicate() or result.wait().
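For instance, a minimal sketch with communicate(), which drains stdout and stderr until the process exits and so cannot deadlock on an unread pipe (the arguments are trimmed to the ones from the snippet above that don't depend on outside variables; not tested):

import subprocess

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'

# communicate() reads both pipes to completion and waits for the process,
# so neither pipe buffer can fill up and block the child.
result = subprocess.Popen(
    ['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--verbose', 'False'],
    text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = result.communicate()  # blocks until whisper_timestamped finishes
print('return code:', result.returncode)

The trade-off is that you only get the output once the process has finished, so line-by-line progress tracking is not possible with this exact approach.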

dchapelet commented 1 year ago

Hi,

I tested with poll(), but the whisper_timestamped process still gets blocked when the input is a URL. Here are 2 tests (a file and a URL):

result = subprocess.Popen(['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--detect_disfluencies', str(True), '--vad', str(True), '--verbose', str(False)], text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while result.poll() is None:
    line = result.stderr.readline()
    print(line)
    print('result.poll(): {}'.format(result.poll()))
    print('result.returncode: {}'.format(result.returncode))

if result.returncode == 0:
    print('The process completed successfully.')
else:
    print('The process returned an error with the code: ', result.returncode)

### Test with url = '/mnt/bob/sintel-2048-stereo [00.01.51.541-00.02.24.041].mp4' => it works !!!

openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:14,224: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:14,225: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:14,225: WARNING/MainProcess] warnings.warn(
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:14,225: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:14,225: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:15,068: WARNING/MainProcess] Downloading: "https://github.com/snakers4/silero-vad/zipball/master" to /root/.cache/torch/hub/master.zip
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:15,068: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:15,068: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:19,529: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:19,529: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,205: WARNING/MainProcess] 0%| | 0/2700 [00:00<?, ?frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,205: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,205: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,208: WARNING/MainProcess] 100%|██████████| 2700/2700 [00:02<00:00, 1009.27frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,208: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,208: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,210: WARNING/MainProcess] 100%|██████████| 2700/2700 [00:02<00:00, 1009.13frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,211: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,211: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,908: WARNING/MainProcess] result.poll(): 0
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,908: WARNING/MainProcess] result.returncode: 0
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,910: WARNING/MainProcess] The process completed successfully.
openai-whisper-api-celery_worker-1 | [2023-11-19 11:24:22,914: INFO/MainProcess] Task app.tasks.transcribe_v2[3c9b96b1-a043-4037-8a0b-c5d2d6a57560] succeeded in 31.483451699999932s: {'text': ' in a shed much more innocent blood. You\'re a fool for traveling alone so completely unprepared. You\'re lucky you\'re blessed to a flooring. Thank you. So, what brings you to the land of the gatekeepers? I\'m searching for someone. Someone very dear. A kindred spirit. A dragon.', 'segments': [{'start': 0.07, 'end': 2.47, 'text': ' in a shed much more innocent blood.', 'confidence': 0.59, 'words': [...]}, {'start': 6.21, 'end': 8.75, 'text': ' You\'re a fool for traveling alone so completely unprepared.', 'confidence': 0.83, 'words': [...]}, {'start': 9.87, 'end': 11.53, 'text': ' You\'re lucky you\'re blessed to a flooring.', 'confidence': 0.72, 'words': [...]}, {'start': 13.51, 'end': 14.07, 'text': ' Thank you.', 'confidence': 0.99, 'words': [...]}, {'start': 17.45, 'end': 19.77, 'text': ' So, what brings you to the land of the gatekeepers?', 'confidence': 0.88, 'words': [...]}, {'start': 23.53, 'end': 24.73, 'text': ' I\'m searching for someone.', 'confidence': 0.89, 'words': [...]}, {'start': , ...}]}

### Test with url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4' => it doesn't work because the process stays blocked

openai-whisper-api-celery_worker-1 | [2023-11-19 11:28:53,667: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:28:53,667: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:28:59,615: WARNING/MainProcess] 67%|██████▋ | 34000/50969 [01:16<00:39, 427.43frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:28:59,615: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:28:59,616: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:05,608: WARNING/MainProcess] 72%|███████▏ | 36900/50969 [01:22<00:31, 444.16frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:05,609: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:05,609: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:10,793: WARNING/MainProcess] 78%|███████▊ | 39800/50969 [01:28<00:24, 455.51frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:10,793: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:10,793: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:17,639: WARNING/MainProcess] 84%|████████▍ | 42700/50969 [01:33<00:17, 482.64frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:17,639: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:17,639: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:22,685: WARNING/MainProcess] 90%|████████▉ | 45696/50969 [01:40<00:11, 467.79frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:22,685: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:22,685: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,791: WARNING/MainProcess] 95%|█████████▍| 48384/50969 [01:45<00:05, 484.52frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,791: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,791: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,793: WARNING/MainProcess] 100%|██████████| 50969/50969 [01:49<00:00, 517.98frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,794: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,794: WARNING/MainProcess] result.returncode: None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,797: WARNING/MainProcess] 100%|██████████| 50969/50969 [01:49<00:00, 463.56frames/s]
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,797: WARNING/MainProcess] result.poll(): None
openai-whisper-api-celery_worker-1 | [2023-11-19 11:29:26,797: WARNING/MainProcess] result.returncode: None

Thank you very much for your help. Best regards, David

Jeronymous commented 1 year ago

What is hanging is this:

line = result.stderr.readline()

I don't understand the reason. Maybe have a look at https://stackoverflow.com/questions/12419198/python-subprocess-readlines-hangs
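A workaround pattern discussed in that kind of thread (a sketch, not tested with whisper_timestamped): keep reading the pipe you care about while a background thread drains the other one, so neither pipe buffer can fill up and block the child process.

import subprocess
import threading

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'

def drain(pipe):
    # Read and discard everything so this pipe's buffer can never fill up.
    for _ in pipe:
        pass

result = subprocess.Popen(
    ['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--verbose', 'False'],
    text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

threading.Thread(target=drain, args=(result.stdout,), daemon=True).start()

for line in result.stderr:  # the tqdm progress bars are written to stderr
    print(line, end='')

result.wait()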

dchapelet commented 1 year ago

Hi. To remove any doubt, I tested with whisper instead of whisper_timestamped, and whisper works with the URL below.

It works with the whisper module !!!

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'
result = subprocess.Popen(['whisper', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--verbose', str(False)], text=True)
result.wait()

if result.returncode == 0:
    print('The process completed successfully.')
else:
    print('The process returned an error with the code: ', result.returncode)

It doesn't work with the whisper_timestamped module

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'
result = subprocess.Popen(['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--verbose', str(False)], text=True)
result.wait()

if result.returncode == 0:
    print('The process completed successfully.')
else:
    print('The process returned an error with the code: ', result.returncode)

I suspect something is not closed properly

Thank you very much for your help. Best regards, David

dchapelet commented 1 year ago

Hi, thank you for your help. I hotfixed my code.

I was piping both stdout and stderr but only reading from stderr, so the process ended up blocked writing to the stdout pipe, which was never read (and p.poll() never saw it exit). To fix it, I redirected stderr to stdout and read everything from a single pipe.

BEFORE :

p = subprocess.Popen(['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--detect_disfluencies', str(True), '--vad', str(True), '--verbose', str(False)], text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while p.poll() is None:
    line = p.stderr.readline()

AFTER :

p = subprocess.Popen(['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--device', device, '--fp16', str(settings.GPU), '--no_speech_threshold', str(no_speech_threshold), '--detect_disfluencies', str(True), '--vad', str(True), '--verbose', str(False)], text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while p.poll() is None:
    line = p.stdout.readline()
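
For completeness, here is a sketch of that merged-pipe loop combined with the progress parsing from my first message (self.update_state is the Celery call; outside a Celery task you would replace it with whatever progress reporting you use; the arguments are trimmed to the self-contained ones):

import subprocess

url = 'http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/WhatCarCanYouGetForAGrand.mp4'

# stderr is merged into stdout, so a single readline() loop sees both the
# tqdm progress bars and the regular output, and no pipe is left unread.
p = subprocess.Popen(
    ['whisper_timestamped', url, '--model', 'tiny.en', '--language', 'en', '--verbose', 'False'],
    text=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

for line in p.stdout:
    current = line[:3].lstrip()
    if current.isdigit():
        print('progress:', int(current))  # e.g. self.update_state(...) in a Celery task

p.wait()
print('return code:', p.returncode)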

Now everything works perfectly. Best regards, David

Jeronymous commented 1 year ago

Perfect. Well done, and thanks for the feedback.