sashabaranov / go-openai

OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go
Apache License 2.0

Add support for word-level audio transcription timestamp granularity #733

Closed. agcom closed this 1 month ago

agcom commented 1 month ago

Describe the change

Add support for word-level audio transcription timestamp granularity.

Provide OpenAI documentation link

Describe your solution

Added the `AudioRequest.TimestampGranularities` and `AudioResponse.Words` fields.

Tests

Filled in the `AudioRequest.TimestampGranularities` field in the existing tests' audio requests.

Additional context

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 98.68%. Comparing base (774fc9d) to head (456ceba). Report is 10 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #733      +/-   ##
==========================================
+ Coverage   98.46%   98.68%   +0.22%
==========================================
  Files          24       24
  Lines        1364     1140     -224
==========================================
- Hits         1343     1125     -218
+ Misses         15        9       -6
  Partials        6        6
```


agcom commented 1 month ago

I would have proposed separating transcription and translation request/response types, but that would have been a breaking change.

sashabaranov commented 1 month ago

Thank you for working on this!

sashabaranov commented 1 month ago

@agcom the PR looks good to me in its current state!

> I would have proposed separating transcription and translation request/response types, but that would have been a breaking change.

We can discuss it in a separate PR. I'm also totally open to looking at a sketch of what this might look like 🙌🏻

agcom commented 1 month ago

@sashabaranov alright then, it's ready for review. I just wanted to test it on our development server before marking it for review (I did, and it works fine).

> We can discuss it in a separate PR. I'm also totally open to looking at a sketch of what this might look like 🙌🏻

May God bless me with more code refactoring tasks so I can work on this :smile:.