emcf / thepipe

Extract markdown and images from URLs, PDFs, docs, slides, and more, ready for multimodal LLMs. ⚡
https://thepi.pe
MIT License
814 stars, 61 forks

Audio transcript extraction #8

Closed emcf closed 2 months ago

emcf commented 2 months ago

Looking to support mp3 and wav files.

As of 2024, audio input is not a standard feature of commercial multimodal models. Because of this, I am also looking to transcribe audio to text, probably via Whisper.
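A rough sketch of what this could look like, not the actual implementation. It assumes the `openai-whisper` package (and ffmpeg) is installed for the Whisper backend. The names `extract_audio_text`, `whisper_transcriber`, and `AUDIO_EXTENSIONS` are hypothetical and not part of thepipe. The transcriber is passed in as a callable so the extension check can be exercised without loading a model.

```python
from pathlib import Path
from typing import Callable

# Formats named in this issue; hypothetical constant, not part of thepipe.
AUDIO_EXTENSIONS = {".mp3", ".wav"}


def whisper_transcriber(model_name: str = "base") -> Callable[[str], str]:
    """Build a transcriber backed by openai-whisper (assumed to be installed)."""
    import whisper  # imported lazily so the rest of the module works without it

    model = whisper.load_model(model_name)

    def transcribe(path: str) -> str:
        # model.transcribe returns a dict; the full transcript is under "text".
        return model.transcribe(path)["text"].strip()

    return transcribe


def extract_audio_text(path: str, transcribe: Callable[[str], str]) -> str:
    """Return a text transcript for an mp3/wav file using the given transcriber."""
    suffix = Path(path).suffix.lower()
    if suffix not in AUDIO_EXTENSIONS:
        raise ValueError(f"Unsupported audio format: {suffix or path}")
    return transcribe(path)
```

Usage would be something like `extract_audio_text("talk.mp3", whisper_transcriber())`, with the returned text then handled like any other extracted text.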

emcf commented 2 months ago

Fixed by #12