SystemSculpt / obsidian-systemsculpt-ai

Enhance your Obsidian App experience with AI-powered tools for note-taking, task management, and much, MUCH more.
MIT License
90 stars 13 forks

Transcribe Swiss German #79

Open Robpollard opened 6 days ago

Robpollard commented 6 days ago

Describe the enhancement

I have a peculiar problem: since Swiss German is not a single language but a myriad of dialects, we have been overlooked by many of the voice-to-text improvements. Of course I could speak German or English into the mic, but it's not the same if you have to think and speak in a different language, as you can imagine. A Swiss researcher made an absolutely great GitHub project called NoScribe (https://github.com/kaixxx/noScribe), and it's the only product I've found that transcribes Swiss German flawlessly. No really, it's crazy: I've seen voice files with 6 different people speaking different dialects, and NoScribe transcribes them (to German) perfectly. I'm dreaming of being able to incorporate this into Obsidian, so this would be my feature request: is it possible to find a model that works with Swiss German speech? I've tested it with the one you provide in SySc but honestly, it's not great 😅

wwwebweber commented 3 days ago

The reason for this is that NoScribe uses the whisper-large-v2 model, while the models on the OpenAI and Groq servers use whisper-large-v3. This newer model recognizes Swiss German much worse than v2. SystemSculpt does not run a model itself; it sends the audio to those servers, so your request actually goes to OpenAI or Groq.
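
To compare the two checkpoints yourself, here is a minimal sketch of running large-v2 locally. It assumes the open-source `openai-whisper` Python package and `ffmpeg` are installed; the function name `transcribe_to_german` and the constant `MODEL_NAME` are made up for illustration, and none of this is how SystemSculpt or NoScribe is actually implemented.

```python
from pathlib import Path

# Checkpoint named in the comment above; swap in "large-v3" to compare results.
MODEL_NAME = "large-v2"


def transcribe_to_german(audio_path: str, model_name: str = MODEL_NAME) -> str:
    """Transcribe an audio file locally and return the text as German."""
    if not Path(audio_path).is_file():
        raise FileNotFoundError(audio_path)

    # Imported lazily so this module loads even where openai-whisper is missing.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    # Whisper has no Swiss German language code; "de" asks for German output.
    result = model.transcribe(audio_path, language="de")
    return result["text"].strip()
```

Calling `transcribe_to_german("memo.m4a")` (a hypothetical file name) downloads the checkpoint on first use, which is several gigabytes, so a GPU or some patience is advisable.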

Robpollard commented 2 days ago

Appreciate the insight, thanks. Any idea why the newer models got worse with Swiss German? Doesn't make sense at first glance, does it?