Import subtitle file, add auto-bookmarks

LuteOrg / lute-v3

LUTE = Learning Using Texts: learn languages through reading.

https://luteorg.github.io/lute-manual/

MIT License


Import subtitle file, add auto-bookmarks #55

Open jzohrab opened 11 months ago

jzohrab commented 11 months ago

From Discord:

It'd be great if you could upload an .srt or other subtitle file with an audio file and have it convert it to a txt for reading, but also add bookmarks for the different pages (or even for all the sentences and next to them, there's a little button that, basically, says "skip to this sentence in the audio"). This would make Lute amazing to use with anything audio based alongside Whisper getting better and better.

Challenges I can see with this request:

- Bookmarks aren't associated with pages or sentences, so there would be a great many bookmarks in the audio timeline and no clear way to jump from a bookmark to the text.
- During parsing, the timestamp data isn't stored; it's just another token. This could potentially be worked around by having the base parser do a preliminary pass to get the timestamps, and then calling the actual parsers for each section between stamps. This is a big change from the current method, but is perhaps doable. I'm not sure of the payoff, but I'd need to work on a new language to fully understand the ins and outs.
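
The "preliminary pass" idea could look roughly like the sketch below: split the subtitle file into timed cues first, so that each cue's text can be handed to the regular language parser separately. This is only an illustration, assuming standard `.srt` timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`. The names `Cue` and `split_srt` are hypothetical and are not part of Lute's parser API.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:01:02,250 --> 00:01:04,000".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    """Hypothetical container: one subtitle entry with times in seconds."""
    start: float
    end: float
    text: str


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def split_srt(srt_text):
    """Preliminary pass: pull out timestamps, keep only the text per cue."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # SRT entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the subtitle text.
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues
```

Each `cue.text` could then go through the existing parser on its own, with `cue.start` kept alongside the result instead of being tokenized with the text.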

cblanken commented 4 months ago

I love this idea. I recently got into using whisper for subtitle generation. I think this feature would be super useful for working through audiobooks and podcasts.

I'll try to look into all the audio stuff sometime soon, but my gut says it would make a lot more sense to persist the timestamps in another table, so you could easily jump to anywhere in a longer audio file according to the associated subtitle.
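
As a rough illustration of the "another table" suggestion, here is a minimal SQLite sketch. The table `subtitle_cues` and its columns are invented for this example and do not exist in Lute's schema; only the idea of keying rows to a book comes from the thread.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical table: one row per subtitle cue, keyed to a book id.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS subtitle_cues (
            CueID    INTEGER PRIMARY KEY,
            CueBkID  INTEGER NOT NULL,
            CueOrder INTEGER NOT NULL,
            CueStart REAL NOT NULL,   -- seconds from the start of the audio
            CueText  TEXT NOT NULL
        )
        """
    )


def save_cues(conn, book_id, cues):
    """cues: iterable of (start_seconds, text) in playback order."""
    rows = [
        (book_id, order, start, text)
        for order, (start, text) in enumerate(cues)
    ]
    conn.executemany(
        "INSERT INTO subtitle_cues (CueBkID, CueOrder, CueStart, CueText) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()


def cue_at(conn, book_id, position):
    """Return (order, start, text) of the cue playing at `position`, or None."""
    return conn.execute(
        "SELECT CueOrder, CueStart, CueText FROM subtitle_cues "
        "WHERE CueBkID = ? AND CueStart <= ? "
        "ORDER BY CueStart DESC LIMIT 1",
        (book_id, position),
    ).fetchone()
```

With rows like these, "skip to this sentence in the audio" becomes a lookup of `CueStart` for a given cue, and the reverse direction is the `cue_at` query.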

jzohrab commented 3 months ago

I'm really not sure how to link up audio timestamps with arbitrary sentences. With Text database entries, it's possible, but sentences aren't created until the page is first read.

The current audio timestamps are stored as denormalized data in the books table, as BkAudioBookmarks TEXT NULL. These are just shown on the player as bookmark bars, with no reference to the texts entries in the book. There's no tie between the texts entries and the bookmarks, since the two aspects of text and sound/player are totally decoupled. Having them automatically be in sync -- e.g. so that turning a page automatically changes the player, or fast-forwarding changes your current page -- would be quite hard for the user to manage, I think!

But maybe you have a clever idea that makes sense for you. I've never needed this idea, and so haven't put the hours into thinking about it.
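
One possible way to sidestep automatic two-way sync is to record only a start time per page and let the user jump explicitly in either direction. The sketch below assumes cues are already available as `(start_seconds, text)` pairs and that pages are cut by a simple word limit; Lute's real pagination is not described in this thread, so `paginate`, `page_for_position`, and `max_words` are all hypothetical.

```python
from bisect import bisect_right


def paginate(cues, max_words=250):
    """Group (start_seconds, text) cues into pages of at most max_words words.

    Returns a list of (page_start_seconds, page_text). A single cue longer
    than max_words still gets a page of its own.
    """
    pages = []
    current, count, start = [], 0, None
    for cue_start, text in cues:
        words = len(text.split())
        if current and count + words > max_words:
            pages.append((start, " ".join(current)))
            current, count = [], 0
        if not current:
            start = cue_start  # first cue on the page sets its audio start
        current.append(text)
        count += words
    if current:
        pages.append((start, " ".join(current)))
    return pages


def page_for_position(pages, position):
    """Index of the page whose audio range contains `position` (seconds)."""
    starts = [page_start for page_start, _ in pages]
    return max(bisect_right(starts, position) - 1, 0)
```

Here `pages[n][0]` would be the one auto-bookmark for page `n`, and `page_for_position` answers "which page am I listening to?" only when the user asks, so the player and the reading pane stay decoupled.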