xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0
4k stars 329 forks

Does Xinference support English-to-Chinese translation with openai/whisper-large-v3? #1295

Open Songjiadong opened 3 months ago

Songjiadong commented 3 months ago

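Does Xinference support English-to-Chinese translation with openai/whisper-large-v3? From what I can see, the underlying code only implements translation into English. Can the parameters be overridden, and if so, how? Thanks.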

    def translations(
        self,
        audio: bytes,
        prompt: Optional[str] = None,
        response_format: Optional[str] = "json",
        temperature: Optional[float] = 0,
    ):
        """
        Translates audio into English.

        Parameters
        ----------

        audio: bytes
            The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg,
            mpga, m4a, ogg, wav, or webm.
        prompt: Optional[str]
            An optional text to guide the model's style or continue a previous audio segment.
            The prompt should match the audio language.
        response_format: Optional[str], defaults to json
            The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.
        temperature: Optional[float], defaults to 0
            The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random,
            while lower values like 0.2 will make it more focused and deterministic.
            If set to 0, the model will use log probability to automatically increase the temperature
            until certain thresholds are hit.

        Returns
        -------
            The translated text.
        """
        url = f"{self._base_url}/v1/audio/translations"
        params = {
            "model": self._model_uid,
            "prompt": prompt,
            "response_format": response_format,
            "temperature": temperature,
        }
        files: List[Any] = []
        for key, value in params.items():
            files.append((key, (None, value)))
        files.append(("file", ("file", audio, "application/octet-stream")))
        response = requests.post(url, files=files, headers=self.auth_headers)
        if response.status_code != 200:
            raise RuntimeError(
                f"Failed to translate the audio, detail: {_get_error_string(response)}"
            )

        response_data = response.json()
        return response_data

qinxuye commented 3 months ago

You can give it a try. Whether it works depends on the capabilities of the model itself.

codingl2k1 commented 3 months ago

You can try the transcriptions endpoint, which lets you specify the language. OpenAI's translations endpoint only translates into English: https://platform.openai.com/docs/api-reference/audio/createTranslation
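
A minimal sketch of that suggestion, modeled on the `translations` method quoted above. It assumes the `/v1/audio/transcriptions` endpoint accepts a `language` form field the way OpenAI's transcription API does. The helper names `build_transcription_fields` and `transcribe` are hypothetical and not part of Xinference. Whether English audio actually comes back as Chinese text depends on the model, as noted above.

```python
from typing import Any, List, Optional, Tuple


def build_transcription_fields(
    model_uid: str,
    audio: bytes,
    language: Optional[str] = None,
    response_format: str = "json",
    temperature: float = 0,
) -> List[Tuple[str, Any]]:
    """Build multipart form fields for POST /v1/audio/transcriptions."""
    params = {
        "model": model_uid,
        # Assumption: the server reads a "language" field (e.g. "zh"),
        # mirroring OpenAI's transcription API.
        "language": language,
        "response_format": response_format,
        "temperature": temperature,
    }
    # Leave out unset values and send the rest as plain strings.
    fields: List[Tuple[str, Any]] = [
        (key, (None, str(value)))
        for key, value in params.items()
        if value is not None
    ]
    fields.append(("file", ("file", audio, "application/octet-stream")))
    return fields


def transcribe(
    base_url: str, model_uid: str, audio: bytes, language: Optional[str] = None
) -> Any:
    """Send the audio to the transcriptions endpoint and return the JSON body."""
    # Imported here so the field builder above works without requests installed.
    import requests

    url = f"{base_url}/v1/audio/transcriptions"
    fields = build_transcription_fields(model_uid, audio, language=language)
    response = requests.post(url, files=fields)
    response.raise_for_status()
    return response.json()
```

Usage might look like `transcribe("http://localhost:9997", "whisper-large-v3", open("speech.mp3", "rb").read(), language="zh")`, where the base URL, model UID, and file name are placeholders for your own deployment.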