aarnphm / whispercpp

Pybind11 bindings for Whisper.cpp
Apache License 2.0

bug: Cannot change transcription language #20

Closed by mxpucci 1 year ago

mxpucci commented 1 year ago

Describe the bug

Even when the original audio is not English speech, the transcription is always translated into English. I tried to change the language property of the params with api.Params.language = 'it', but it had no effect.

To reproduce

import ffmpeg
import numpy as np
from whispercpp import Whisper
from whispercpp import api

try:
    y, _ = (
        ffmpeg.input("/Users/michelangelopucci/Downloads/untitled folder 2/output.wav", threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1)
        .run(
            cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
        )
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

api.Params.language = 'it'
w = Whisper.from_pretrained("large")
a = w.transcribe(arr)
print(a)

Expected behavior

No response

Environment

Python 3.9.6

aarnphm commented 1 year ago

You have to change it from the Whisper instance

w = Whisper.from_pretrained("tiny")
w.params.language = "it"
w.transcribe(arr)
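
A minimal sketch of why assigning on the class and assigning on the instance can differ. Params and Model below are hypothetical pure-Python stand-ins, not the real whispercpp classes. The actual bindings are implemented in C++ and may behave differently in detail, so this only illustrates the general Python rule: an attribute set on an instance shadows one set later on its class.

```python
# Hypothetical pure-Python stand-ins; these are NOT the real whispercpp classes.
class Params:
    def __init__(self):
        self.language = "en"  # per-instance default (assumed for illustration)


class Model:
    def __init__(self):
        self.params = Params()  # each model owns its own Params instance


m = Model()

# Assigning on the class only creates a class attribute; the instance
# attribute set in __init__ still shadows it.
Params.language = "it"
language_after_class_assignment = m.params.language

# Assigning through the instance the model actually uses takes effect.
m.params.language = "it"
language_after_instance_assignment = m.params.language

print(language_after_class_assignment, language_after_instance_assignment)
```

In this sketch the first read still returns the instance's own value, and only the second assignment changes what the model sees.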

mxpucci commented 1 year ago

I tried that as well, but then I get this error: whisper_lang_id: unknown language 'ӄ'. In fact, once the language property has been set, reading w.params.language raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 0: invalid continuation byte

aarnphm commented 1 year ago

I think there is a bug in how the params c_str is handled right now. Feel free to put up a PR to fix it. The Params object is defined in src/whispercpp/api_export.cc.
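
The UnicodeDecodeError reported earlier in the thread is what Python raises when it is handed bytes that are not valid UTF-8. That would be consistent with the binding returning a C string whose buffer no longer holds the original text, although the thread does not confirm the root cause. The byte values below are hypothetical and chosen only to reproduce the same error message.

```python
# Hypothetical byte values; the real contents of the buffer are unknown.
valid = "it".encode("utf-8")

# 0xce announces a two-byte UTF-8 sequence, but 0x20 is not a valid
# continuation byte, so decoding fails at position 0.
garbage = b"\xce\x20\x00"

decoded = valid.decode("utf-8")

try:
    garbage.decode("utf-8")
    error_reason = None
    error_start = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason
    error_start = exc.start

print(decoded, error_reason, error_start)
```

A properly encoded language code round-trips cleanly, while arbitrary bytes fail with the same "invalid continuation byte" reason seen in the report.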

lasseedfast commented 1 year ago

Wish I could but I cannot... Hope someone else can fix this soon!