Intent of accepting additional AudioFileName parameter for overloaded TranscribeAudio and TranslateAudio APIs which already accept an AudioStream input parameter

openai / openai-dotnet

The official .NET library for the OpenAI API

https://www.nuget.org/packages/OpenAI

MIT License

Intent of accepting additional AudioFileName parameter for overloaded TranscribeAudio and TranslateAudio APIs which already accept an AudioStream input parameter #71

Open cjkarande opened 1 week ago

cjkarande commented 1 week ago

Hello @trrwilson,

Is there a specific reason why the TranscribeAudio and TranslateAudio overloads that already accept an audio Stream also require an additional audioFilename parameter?

public virtual ClientResult TranscribeAudio(Stream audio, string audioFilename, AudioTranscriptionOptions options = null, CancellationToken cancellationToken = default(CancellationToken))

public virtual async Task<ClientResult> TranslateAudio(Stream audio, string audioFilename, AudioTranslationOptions options = null, CancellationToken cancellationToken = default(CancellationToken))

trrwilson commented 1 week ago

Hello, @cjkarande!

The transcription and translation APIs require this filename data as part of the multipart/form-data request content -- the filename is used by the service to infer the format of the input audio and decode it properly, as there's no other "hint" about whether the binary data should be interpreted as MP3, WAV, FLAC, etc. That's why a filename shows up in each of the overloads:

One that accepts a Stream and a filename (the filename is used exclusively for this "format hint" behavior)
One that accepts just a local filename (and opens and reads the Stream automatically)
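To make the "format hint" concrete, here is a minimal sketch of where a filename sits in a multipart/form-data body, built with the standard System.Net.Http types. It only illustrates the wire format; it is not taken from the library's source, and the filename "speech.mp3", the stand-in bytes, and the model value are assumptions chosen for the example.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Stand-in bytes only; a real request would carry actual audio data.
using var audio = new MemoryStream(new byte[] { 0x00, 0x01, 0x02 });

using var form = new MultipartFormDataContent();
var filePart = new StreamContent(audio);
filePart.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// The third argument is written into this part's Content-Disposition header.
// Its extension is the only format information that travels with the raw bytes.
form.Add(filePart, "file", "speech.mp3");
form.Add(new StringContent("whisper-1"), "model");

// Print the header so the filename's position in the request is visible.
Console.WriteLine(filePart.Headers.ContentDisposition);
```

The stream itself carries no name, so a caller who has only a Stream still has to supply something for that header.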

Is there another overload or input option you'd expect or like to see for these methods?

cjkarande commented 1 week ago

@trrwilson, for the overloads that accept an audio stream, an audioFormat parameter in place of audioFilename would be more appropriate. The audioFilename value could then be derived internally from audioFormat, which would avoid confusion for API users. What do you think?

The audioFormat parameter could be an enum of supported formats or a string, for example: enum AudioFormat { MP3, WAV, FLAC, ... }

public virtual ClientResult TranscribeAudio(Stream audio, AudioFormat audioFormat, AudioTranscriptionOptions options = null, CancellationToken cancellationToken = default(CancellationToken))
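Until such an overload exists, roughly the same effect can be had on the caller's side. The sketch below is hypothetical throughout: the AudioFormat enum, the ToPlaceholderFilename helper, and the "audio.*" placeholder names are all invented for illustration and are not part of the library. It assumes only what the thread states, namely that the service reads the format from the filename.

```csharp
using System;

// Derive a placeholder filename from a format value, then pass it wherever
// the existing Stream overloads expect audioFilename, for example:
//   client.TranscribeAudio(stream, AudioFormat.Wav.ToPlaceholderFilename(), options);
Console.WriteLine(AudioFormat.Wav.ToPlaceholderFilename());

// Hypothetical enum; only the three formats named in the thread are listed.
public enum AudioFormat { Mp3, Wav, Flac }

public static class AudioFormatExtensions
{
    // Maps each format to a dummy filename whose extension carries the hint.
    public static string ToPlaceholderFilename(this AudioFormat format) => format switch
    {
        AudioFormat.Mp3 => "audio.mp3",
        AudioFormat.Wav => "audio.wav",
        AudioFormat.Flac => "audio.flac",
        _ => throw new ArgumentOutOfRangeException(nameof(format), format, "Unsupported audio format."),
    };
}
```

A helper like this keeps call sites free of made-up filenames, at the cost of having to be updated by hand whenever the service accepts a new format.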

cjkarande commented 1 day ago

Hello @trrwilson, can we have an overload as proposed?