djmango / obsidian-transcription

Obsidian plugin to create high-quality transcriptions from markdown linked audio files
https://swiftink.io
MIT License

Time-Stamp intervals are too short #45

Closed. BeverlyLaurelBLVD closed this issue 8 months ago

BeverlyLaurelBLVD commented 8 months ago

Could we get a feature to adjust the interval for the time-stamping?

When it's time-stamping every 2 seconds it gets very messy and hard to work with.

If we could have the option to have it time-stamp in Obsidian every 10, 20, 30 seconds or something custom, that would be amazing and would make note-taking much faster & easier.

I love the work you've done.

bscholer commented 8 months ago

This kind of thing would be straightforward for the Whisper ASR side of things using word-level timestamps introduced in #46 with some tweaks to segmentsToTimestampedString().
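A minimal sketch of what interval-based grouping over word-level timestamps could look like. This is not the plugin's actual `segmentsToTimestampedString()`; the `TimedWord` shape, the `wordsToTimestampedString` name, and the choice to bucket words by `floor(start / interval)` are all assumptions made for illustration.

```typescript
// Hypothetical word shape with times in seconds (not the plugin's real type).
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Format a number of seconds as mm:ss, dropping fractional seconds.
function formatTime(seconds: number): string {
  const total = Math.floor(seconds);
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return pad(Math.floor(total / 60)) + ":" + pad(total % 60);
}

// Group words into fixed-size buckets keyed by each word's start time and
// emit one "start - end: text" line per non-empty bucket.
// Assumption: buckets are aligned to multiples of intervalSeconds.
function wordsToTimestampedString(
  words: TimedWord[],
  intervalSeconds: number
): string {
  if (intervalSeconds <= 0) {
    throw new Error("intervalSeconds must be positive");
  }
  const lines: string[] = [];
  let currentBucket = -1;
  let current: TimedWord[] = [];

  // Write the words collected so far as a single line.
  const flush = (): void => {
    if (current.length === 0) return;
    const text = current.map((w) => w.word.trim()).join(" ");
    const first = current[0];
    const last = current[current.length - 1];
    lines.push(`${formatTime(first.start)} - ${formatTime(last.end)}: ${text}`);
    current = [];
  };

  for (const w of words) {
    const bucket = Math.floor(w.start / intervalSeconds);
    if (bucket !== currentBucket) {
      flush();
      currentBucket = bucket;
    }
    current.push(w);
  }
  flush();
  return lines.join("\n");
}
```

With this approach, stretches of silence produce no line at all, because only buckets that contain at least one word are emitted.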

To get this working for Swiftink, though, it would need to return word-level timestamps. How feasible is this, @djmango?

bscholer commented 8 months ago

Just implemented this for Whisper ASR in 035b20783e501a5ddb413cc82efb018ab965a228. I will cherry-pick and make a new PR once #46 is merged, as these changes are on top of #46 (I wasn't thinking when I started working on this project, oops!).

This implementation works with or without word timestamps from Swiftink, but if Swiftink could return word timestamps, it would yield a better result.
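One way to support both cases is to normalize the transcript into a flat list of timed units before grouping: individual words when a segment carries them, and the whole segment otherwise. The sketch below is only an illustration of that idea, not the code from the commit; the `Segment`, `Word`, and `Unit` shapes and the `segmentsToUnits` name are assumptions loosely modeled on Whisper-style JSON output.

```typescript
// Hypothetical shapes; field names are assumptions, not the plugin's types.
interface Word {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

interface Segment {
  text: string;
  start: number;
  end: number;
  words?: Word[]; // present only when the backend returns word timestamps
}

// Smallest piece of text that can be assigned to an interval.
interface Unit {
  text: string;
  start: number;
  end: number;
}

// Flatten segments into units: one unit per word when word timestamps
// exist, otherwise one unit for the entire segment.
function segmentsToUnits(segments: Segment[]): Unit[] {
  const units: Unit[] = [];
  for (const seg of segments) {
    if (seg.words && seg.words.length > 0) {
      for (const w of seg.words) {
        units.push({ text: w.word.trim(), start: w.start, end: w.end });
      }
    } else {
      // A segment without word timestamps cannot be split, so an interval
      // boundary can only fall before or after it, never inside it.
      units.push({ text: seg.text.trim(), start: seg.start, end: seg.end });
    }
  }
  return units;
}
```

Under this assumption, coarser input simply means coarser boundaries: a long segment without word timestamps stays on one line even if it spans more than one interval.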

Examples (using 10s interval)

Whisper ASR

00:00 - 00:10: Hey, so this is a voice recording that has some of me talking in it, but then it also has some silence right here I'll give it like 15 seconds or so.
00:24 - 00:30: All right. Yeah, we're back. You're not talking or now I am talking Now
00:30 - 00:39: I'm gonna do a little test of just doing some like other background noises and see what that does So here we go
00:52 - 00:53: You

Swiftink

00:00 - 00:07: Hey, so this is a voice recording that has some of me talking in it, but then it also has some silence right here
00:07 - 00:10: I'll give it like 15 seconds or so
00:24 - 00:28: All right, yeah, we're back you're not talking or now I am talking
00:30 - 00:36: Now I'm gonna do a little test of just doing some like other background noises and see what that does
00:37 - 00:39: So here we go

djmango commented 8 months ago

Fixed by @bscholer, and released, thanks!