GenieFramework / StippleUI.jl

StippleUI is a library of reactive UI elements for Stipple.jl.

Example: Speech-to-text / mediaRecorder in browser #131

Status: Open. svilupp opened this issue 4 months ago

svilupp commented 4 months ago

Thanks to Helmut for suggesting using an uploader!

Desired behavior:

Below is my hacky example. I've used the OpenAI Whisper API, but you could swap in a local model such as Whisper.jl (see the Whisper transcriber demo). The trick is to use a hidden uploader, which does the heavy lifting of getting the recorded audio to the server.

MWE

module App
using GenieFramework
using GenieFramework.JSON3
using GenieFramework.Stipple.HTTP
using GenieFramework.Stipple.ModelStorage.Sessions # for @init
using Base64
@genietools

@appname Recorder

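# Send an audio file to the OpenAI transcription endpoint and return the text.
function openai_whisper(file)
    url = "https://api.openai.com/v1/audio/transcriptions"
    headers = ["Authorization" => "Bearer $(ENV["OPENAI_API_KEY"])"]
    # Open the file in a do-block so the handle is closed once the request finishes
    response = open(file) do io
        form = HTTP.Forms.Form(Dict("file" => io, "model" => "whisper-1"))
        HTTP.post(url, headers, form)
    end
    transcription = JSON3.read(response.body)["text"]
    return transcription
end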

@app begin
    @in input = ""
    @in audio_chunks = []
    @in mediaRecorder = nothing
    @in is_recording = false
    @onchange isready begin
        @info "I am alive!"
    end
    @event uploaded begin
        @info "File uploaded!!"
        @info params(:payload)["event"]
        notify(__model__, "File uploaded!")
    end
    @onchange fileuploads begin
        if !isempty(fileuploads)
            @info "File was uploaded: " fileuploads["path"]
            try
                # Rename the temp file with a .wav extension before sending it for transcription
                fn_new = fileuploads["path"] * ".wav"
                mv(fileuploads["path"], fn_new; force = true)
                input = openai_whisper(fn_new)
                rm(fn_new; force = true)
            catch e
                @error "Error processing file: $e"
                notify(__model__, "Error processing file: $(fileuploads["name"])")
            end
            # Clear the upload record so the next recording triggers this handler again
            fileuploads = Dict{AbstractString, AbstractString}()
        end
    end
end

function ui()
    [
        h3("Speech-to-text API"),
        textfield(:input, label = "Input", v__model = :input),
        btn(@click("toggleRecording"),
            label = R"is_recording ? 'Stop' : 'Record'",
            color = R"is_recording ? 'negative' : 'primary'"
        ),
        uploader(multiple = false,
            maxfiles = 10,
            autoupload = true,
            hideuploadbtn = true,
            label = "Upload",
            nothumbnails = true,
            ref = "uploader",
            style = "display: none; visibility: hidden;",
            @on("uploaded", :uploaded)
        )
    ]
end

@methods begin
    raw"""
    async toggleRecording() {
        if (!this.is_recording) {
          this.startRecording()
        } else {
          this.stopRecording()
        }
    },
    async startRecording() {
      navigator.mediaDevices.getUserMedia({ audio: true })
        .then(stream => {
          this.is_recording = true;
          this.mediaRecorder = new MediaRecorder(stream);
          // Register the handlers before starting so no event is missed
          this.mediaRecorder.ondataavailable = event => {
            this.audio_chunks.push(event.data);
          };
          this.mediaRecorder.onstop = () => {
            // 'audio/wav' is only a label here; the bytes stay in whatever format MediaRecorder produced
            const audioBlob = new Blob(this.audio_chunks, { type: 'audio/wav' });
            this.is_recording = false;

            // upload via the hidden uploader
            const file = new File([audioBlob], 'test.wav');
            this.$refs.uploader.addFiles([file], 'test.wav');
            this.$refs.uploader.upload(); // Trigger the upload
            console.log("Uploaded WAV");
            this.$refs.uploader.reset();
            this.audio_chunks = [];
          };
          this.mediaRecorder.start();
        })
        .catch(error => console.error('Error accessing microphone:', error));
    },
    stopRecording() {
      if (this.mediaRecorder) {
        this.mediaRecorder.stop();
      } else {
        console.error('MediaRecorder is not initialized');
      }
    }
"""
end

@page("/", ui())

up(browser = false)

end

A few notes:

Questions:

EDIT: It doesn't work on mobile devices. The JS runs, so the problem must be somewhere in the uploader workflow (the upload handler never fires on the Julia side).
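
One thing that may be worth ruling out here (an unconfirmed guess, not a diagnosis): MediaRecorder does not normally produce WAV data, and the container it does produce varies by browser, so the 'audio/wav' label in the MWE is only a label. The sketch below uses the standard MediaRecorder.isTypeSupported check to pick a format and derive a matching file extension. The names pickMimeType, extensionFor and CANDIDATE_TYPES, as well as the order of the candidate list, are hypothetical and not part of the original example.

```javascript
// Candidate container formats in order of preference (an assumption; adjust as needed).
const CANDIDATE_TYPES = [
  'audio/webm;codecs=opus',
  'audio/webm',
  'audio/mp4',
  'audio/ogg;codecs=opus'
];

// Returns the first candidate the recorder reports as supported, or '' so that
// the browser falls back to its own default format.
// `recorderClass` defaults to the global MediaRecorder; it is a parameter so the
// function can also be exercised outside a browser with a stub.
function pickMimeType(recorderClass = globalThis.MediaRecorder, candidates = CANDIDATE_TYPES) {
  if (!recorderClass || typeof recorderClass.isTypeSupported !== 'function') {
    return '';
  }
  return candidates.find(type => recorderClass.isTypeSupported(type)) || '';
}

// Maps a MIME type (with or without a codecs parameter) to a file extension,
// so the uploaded file name matches the actual container.
function extensionFor(mimeType) {
  const base = mimeType.split(';')[0].trim();
  const table = {
    'audio/webm': 'webm',
    'audio/mp4': 'mp4',
    'audio/ogg': 'ogg',
    'audio/wav': 'wav'
  };
  return table[base] || 'bin';
}
```

In the MWE this could be wired in by passing `{ mimeType }` as the second argument to `new MediaRecorder(stream, ...)` when the result is non-empty, using the same type for the Blob, and naming the file with `extensionFor(mimeType)`. The Julia handler would then have to append that extension instead of a fixed ".wav".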

PGimenez commented 4 months ago

This is great, @svilupp! I tried it, but I think the recording didn't work, even though my system says it's recording audio. I'll try another browser (I'm on Chrome).

I've created a repo for this demo at https://github.com/BuiltWithGenie/SpeechToText and credited you.

svilupp commented 4 months ago

Interesting. It worked fine for me on two laptops with Chrome, but not on a phone (maybe it's the same issue?).

At which point does it fail? If you don’t see “Uploaded WAV” in the browser console, the problem is in the client-side JS. If you don’t see the logs in the Julia REPL, it’s on the Stipple/server side. The latter is what fails for me on mobile.
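
To make that check easier on a phone, where the browser console is awkward to reach, a small checkpoint tracer along these lines might help. It is only a sketch: makeTracer, its methods, and the stage names in the usage note are hypothetical, and the log sink is pluggable so the messages can be sent somewhere visible instead of console.log.

```javascript
// Records which stages of the record-and-upload flow were reached, in order.
// `log` is any function taking a string; it defaults to console.log.
function makeTracer(log = console.log) {
  const reached = [];
  return {
    // Note that a stage was reached and report it through the sink.
    mark(stage) {
      reached.push(stage);
      log(`[recorder] reached: ${stage}`);
    },
    // The most recent stage, or null if nothing has been marked yet.
    last() {
      return reached.length > 0 ? reached[reached.length - 1] : null;
    },
    // A copy of every stage marked so far.
    all() {
      return reached.slice();
    }
  };
}
```

For example, calling `tracer.mark('microphone granted')` inside the getUserMedia callback, `tracer.mark('recorder stopped')` in onstop, and `tracer.mark('upload triggered')` after the uploader call would show the last client-side stage reached. If all three appear but nothing is logged in the Julia REPL, the failure is between the uploader and the server.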