ecornell / ai-tools-ahk

AI Tools - AutoHotkey - Enable global hotkeys to run custom OpenAI prompts on text in any window.
MIT License

Add Audio Transcription Capability? #6

Open doxgt opened 6 months ago

doxgt commented 6 months ago

Greetings.

I have been able to use the OpenAI Python library to send audio recordings to OpenAI for transcription (https://platform.openai.com/docs/guides/speech-to-text/quickstart).

However, I am wondering about WinHTTP-based interactions with OpenAI, as you demonstrated in your utility. I'd always prefer AHK to tinkering with Python.

Do you happen to have any plans to add a module for uploading audio files for transcription?

If not, could you point me toward how to interface with the API for uploading audio files? I figured that I would be doing something along the lines of ComObject("WinHttp.WinHttpRequest.5.1").SetRequestHeader("Content-Type", "multipart/form-data").

Instead of "https://api.openai.com/v1/chat/completions", the speech API URL is "https://api.openai.com/v1/audio/transcriptions", and the model would be "whisper-1".

Then I am not sure where to go from there.

Many thanks in advance!
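
For reference, the step that usually comes after setting the header: a multipart/form-data request needs a boundary parameter in the Content-Type value, and the body has to be framed with that same boundary. Below is a minimal sketch of that framing, written in Python with only the standard library so the byte layout is easy to inspect. The function name `build_multipart`, the file name, and the placeholder API key are hypothetical; the URL and model come from the post above. Nothing is sent over the network here, and how these bytes would be handed to WinHttpRequest from AHK is not covered.

```python
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"  # from the post
MODEL = "whisper-1"  # from the post


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream", boundary=None):
    """Return (body_bytes, content_type_header) for a multipart/form-data request.

    Hypothetical helper for illustration; not part of any library.
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    # Plain text fields, e.g. model=whisper-1
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")
        lines.append(str(value).encode("utf-8"))
    # The file part carries a filename and its own Content-Type
    lines.append(delimiter)
    lines.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'.encode("utf-8")
    )
    lines.append(f"Content-Type: {content_type}".encode("ascii"))
    lines.append(b"")
    lines.append(file_bytes)
    # Closing delimiter has two trailing dashes
    lines.append(delimiter + b"--")
    lines.append(b"")
    body = b"\r\n".join(lines)
    return body, f"multipart/form-data; boundary={boundary}"


def build_request(audio_bytes, filename, api_key):
    """Assemble (but do not send) the POST request."""
    body, content_type = build_multipart(
        {"model": MODEL}, "file", filename, audio_bytes
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": content_type,  # includes the boundary
            "Authorization": f"Bearer {api_key}",
        },
    )


if __name__ == "__main__":
    # Placeholder bytes and key; a real call would read an audio file.
    req = build_request(b"\x00\x01", "audio.mp3", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.get_header("Content-type"))
```

The main point is that "multipart/form-data" on its own is not a complete Content-Type value: the receiver uses the boundary string to split the body into the model field and the file part.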

doxgt commented 6 months ago

I figured out how to use cURL to send audio files. No further actions needed here. Thanks for taking a look if you did.
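
The actual cURL command was not posted. As a rough sketch of what such a call might look like, the snippet below assembles a curl argument list using form-upload (`-F`) fields, which make curl generate the multipart body and boundary itself. It is wrapped in Python only to keep the argument list easy to check; the function names, the audio path, and the `OPENAI_API_KEY` environment variable are assumptions, not details from this thread.

```python
import os
import subprocess

API_URL = "https://api.openai.com/v1/audio/transcriptions"  # from the thread


def curl_transcribe_args(audio_path, api_key, model="whisper-1"):
    """Build the argv for a curl upload (hypothetical helper)."""
    return [
        "curl", "--silent", "--show-error",
        API_URL,
        "-H", f"Authorization: Bearer {api_key}",
        # -F sends multipart/form-data; '@' tells curl to read the file
        "-F", f"file=@{audio_path}",
        "-F", f"model={model}",
    ]


def transcribe(audio_path):
    """Run curl and return the raw response text.

    Assumes curl is on PATH and the key is in OPENAI_API_KEY.
    """
    api_key = os.environ["OPENAI_API_KEY"]
    result = subprocess.run(
        curl_transcribe_args(audio_path, api_key),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

The same argument list could in principle be launched from an AHK script by running curl as an external process, which avoids building the multipart body by hand.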