zebreus / replicate-api

A TypeScript client library for the replicate.com API
https://github.com/Zebreus/replicate-api
MIT License

How to send audio files? #5

Closed pie6k closed 1 year ago

pie6k commented 1 year ago

Hey, I'm trying to find an example of sending an audio file into some model (Whisper).

Should I serialize it into a string (kinda tricky as files might be quite large), use form-data, etc?

The model I want to use - https://replicate.com/openai/whisper

zebreus commented 1 year ago

Hmm, I have not thought of that yet.

I will try to figure out a sane solution and add an example to the README.

zebreus commented 1 year ago

According to their API documentation, Replicate accepts files only as URLs. I added a loadFile function that loads a local file and encodes it as a base64 data URL. This only works for relatively small files (<100 MB); bigger files should probably be uploaded somewhere else and passed in by URL.
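
To illustrate what encoding a local file as a base64 data URL involves, here is a minimal sketch for Node.js. It is not the actual loadFile implementation: the names toDataUrl and fileToDataUrl and the MIME_TYPES table are made up for this illustration, and the MIME type guess from the file extension is an assumption.

```typescript
import { readFile } from "node:fs/promises"
import { extname } from "node:path"

// Hypothetical lookup table (assumption); extend it for the formats you need
const MIME_TYPES: Record<string, string> = {
  ".mp3": "audio/mpeg",
  ".wav": "audio/wav",
  ".flac": "audio/flac",
  ".ogg": "audio/ogg",
}

// Encode raw bytes as a base64 data URL of the form data:<mime>;base64,<payload>
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  const base64 = Buffer.from(bytes).toString("base64")
  return `data:${mimeType};base64,${base64}`
}

// Hypothetical stand-in for loadFile: read a local file and return it as a data URL
async function fileToDataUrl(path: string): Promise<string> {
  // Fall back to a generic binary type when the extension is unknown
  const mimeType = MIME_TYPES[extname(path).toLowerCase()] ?? "application/octet-stream"
  return toDataUrl(await readFile(path), mimeType)
}
```

Base64 encodes every 3 bytes as 4 characters, so the request payload is about a third larger than the file itself. That overhead is one reason this approach only suits smaller files.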

Transcribe audio with whisper

You can create a new prediction for the openai/whisper model and wait for the result with:

const prediction = await predict({
  model: "openai/whisper", // The model name
  input: {
    audio: await loadFile("./testaudio.mp3"), // Load local file as base64 data URL
    // audio: "https://raw.githubusercontent.com/zebreus/replicate-api/master/testaudio.mp3", // Load from a URL
    model: "base",
  }, // The model specific input
  token: "...", // You need a token from replicate.com
  poll: true, // Wait for the model to finish
})

console.log(prediction.output.transcription)
// Transcribed text

zebreus commented 1 year ago

Related issue for replicate-js nicholascelestin/replicate-js#33

zebreus commented 1 year ago

I just added another function, uploadFile, for uploading bigger files to Replicate before starting the prediction. It uses the file upload API endpoint of the replicate.com web interface. That endpoint is only meant to be used by their web interface, so it might stop working at some point.

Example for the uploadFile function

const prediction = await predict({
  model: "openai/whisper", // The model name
  input: {
    audio: await uploadFile("./testaudio.mp3"), // Upload file to replicate.com website.
    model: "base",
  }, // The model specific input
  token: "...", // You need a token from replicate.com
  poll: true, // Wait for the model to finish
})
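
With both functions available, the choice between them could be made from the file size. The following is only a sketch: chooseStrategy and chooseStrategyForFile are hypothetical helpers that are not part of the library, and reading the roughly 100 MB limit mentioned earlier in this thread as 100 * 1024 * 1024 bytes is an assumption.

```typescript
import { statSync } from "node:fs"

// Assumption: treat the "<100 MB" guideline from this thread as 100 MiB
const INLINE_LIMIT_BYTES = 100 * 1024 * 1024

type Strategy = "inline" | "upload"

// Hypothetical helper: pick a strategy from the file size in bytes
function chooseStrategy(sizeInBytes: number, limit: number = INLINE_LIMIT_BYTES): Strategy {
  return sizeInBytes < limit ? "inline" : "upload"
}

// Hypothetical helper: look up the size of a local file and pick a strategy
function chooseStrategyForFile(path: string): Strategy {
  return chooseStrategy(statSync(path).size)
}

// Possible usage with the functions from this thread:
// const audio = chooseStrategyForFile(path) === "inline"
//   ? await loadFile(path)
//   : await uploadFile(path)
```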