Robitx / gp.nvim

Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]
MIT License

Sox issues with MP3 format and recording missing sound #69

Closed mikebwilliams closed 9 months ago

mikebwilliams commented 10 months ago

Hi, awesome plugin, thanks for making it.

I've had a couple of issues with sox that I wanted to note. On Ubuntu, sox did not have MP3 format support installed by default, so it may be worth testing for MP3 support in the health check.
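A minimal sketch of what such a check could look like, assuming that `sox -h` prints an `AUDIO FILE FORMATS:` line listing the formats this sox build can handle. The helper name `sox_supports_mp3` is hypothetical and not part of gp.nvim; it only inspects text that the caller has already captured.

```lua
-- Hypothetical helper (not part of gp.nvim): reports whether the text printed
-- by `sox -h` lists mp3 among the supported audio file formats.
-- Assumption: the help output contains a line of the form
-- "AUDIO FILE FORMATS: 8svx aif ... wav".
local function sox_supports_mp3(help_text)
    -- grab the rest of the line that follows the "AUDIO FILE FORMATS:" label
    local formats = help_text:match("AUDIO FILE FORMATS:([^\n]*)")
    if not formats then
        return false
    end
    -- compare whole words so that a longer name containing "mp3" does not count
    for fmt in formats:gmatch("%S+") do
        if fmt == "mp3" then
            return true
        end
    end
    return false
end
```

Inside Neovim the help text could probably be captured with something like `vim.fn.system({ "sox", "-h" })` and passed to this function, with a warning shown when it returns `false`.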

Next, I have had a terrible time getting sox to not cut off the beginning and end of my recorded audio. It doesn't appear to be anything to do with your code.

I switched to arecord and I don't miss audio anymore.

    M._H.process(nil, "arecord", {
        --[[ "-q", ]]
        -- single channel
        "-c",
        "1",
        "-f",
        "S16_LE",
        "-r",
        "48000",
        -- output file
        M.config.whisper_dir .. "/rec.wav",
    }, function(code, signal, _, _)
        close()

        if code and code ~= 0 and code ~= 1 then
            M.error("arecord exited with code and signal: " .. code .. " " .. signal)
            return
        end

        if not continue then
            return
        end

        vim.schedule(function()
            transcribe()
        end)
    end)

Robitx commented 10 months ago

@mikebwilliams Hey, thanks for the report and for exploring the solution! The latest version uses arecord when it is available, and the health check now tests whether sox has MP3 support (I've also mentioned libsox-fmt-mp3 in the README).

I liked SoX because it should be able to do everything that was needed across platforms, but you're the second person who has reported these cutoffs.

When I find some time, I also have to check the behavior on Mac and switch to https://ffmpeg.org/ffmpeg-devices.html#avfoundation if necessary.