Algorithm-Arena / weekly-challenge-5-copy-pasta


Submission - Chinese Whisper AI #5

Open tobua opened 7 months ago

tobua commented 7 months ago

Ready with another challenge 😀

Live Page: chinese-whisper-ai.vercel.app
Code: tobua/chinese-whisper-ai
Videos: 𝕏 Post

Voice to Text

https://github.com/Algorithm-Arena/weekly-challenge-5-copy-pasta/assets/15127551/fcbae898-5109-43bb-931a-c0805e862164

Text to Image

https://github.com/Algorithm-Arena/weekly-challenge-5-copy-pasta/assets/15127551/9ad86eb3-8edd-4e84-9b32-00edbbc9c4b3

Description

When the page loads, a Text, Voice or Image input is shown at random. Once the user enters something or uploads a file, it can be converted into the next format. Conversion is handled by an Edge function using the OpenAI API. Several OpenAI models are used for these conversions: DALL·E 3, Whisper, GPT-4 Vision and TTS. This way it's possible to loop around indefinitely, letting the AI surprise you with its creations. At any step the user can download the current result or enter their own input. To make the time between the server-side conversions pass faster, a big animated custom SVG loader is rendered. In their respective forms, images can be uploaded or added via drag-and-drop. When microphone access is granted, it's possible to record speech. While full transcription happens on the server using Whisper, there is a live preview of what's currently being spoken if the SpeechRecognition API is available in the browser.
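To illustrate the loop described above, here is a minimal sketch of the conversion cycle as a lookup table. This is not taken from the actual repository: the names `Format`, `conversions`, `pickStartFormat` and `walkCycle` are hypothetical, and the order Voice → Text → Image → Voice is an assumption inferred from the two videos and the list of models.

```typescript
// Hypothetical sketch; names and conversion order are assumptions,
// not the submission's actual code.
type Format = "voice" | "text" | "image";

interface Conversion {
  next: Format; // format produced by this step
  models: string[]; // models from the description that could handle it
}

const conversions: Record<Format, Conversion> = {
  voice: { next: "text", models: ["Whisper"] },
  text: { next: "image", models: ["DALL·E 3"] },
  // Assumed: the image is described with GPT-4 Vision, then spoken with TTS.
  image: { next: "voice", models: ["GPT-4 Vision", "TTS"] },
};

const formats: Format[] = ["voice", "text", "image"];

// Pick the initial input type; `random` is injectable so it can be tested.
function pickStartFormat(random: () => number = Math.random): Format {
  const index = Math.floor(random() * formats.length);
  return formats[Math.min(index, formats.length - 1)];
}

// List the formats visited over `steps` conversions, starting point included.
function walkCycle(start: Format, steps: number): Format[] {
  const visited: Format[] = [start];
  let current = start;
  for (let i = 0; i < steps; i++) {
    current = conversions[current].next;
    visited.push(current);
  }
  return visited;
}
```

Under these assumptions, three conversions starting from a voice recording would lead back to voice, which is where the "Chinese whisper" effect would come from: each step can drift a little further from the original input.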

Screenshots

[Images: screenshot; a bear in a coffee shop eating spaghetti; Screenshot 2024-02-19 at 08 22 15]

vjeux commented 7 months ago

Thanks for submitting, do you mind recording a small video on how it is being used? I want to make sure that when people come back to it in a few months / years they can see how it looks even though the integrations will probably no longer work. Thanks!

tobua commented 7 months ago

@vjeux Thanks for the reminder. I was just filming them and trying to stay below 10 MB so I could attach the videos directly to this post. The videos are there now!

tobua commented 6 months ago

Please note that I've added a couple of small bug fixes and improvements today. Keep this in mind when making your judgement. Also, unfortunately the Edge functions on Vercel frequently reach their 10s timeout before the OpenAI response arrives. Locally, it works fine.