geekodour / wscribe-editor

web based editor for subtitles and transcripts
https://wscribe-editor.geekodour.org/
MIT License

UI gets slow with media files of ~10m duration #1

Open balta2ar opened 1 year ago

balta2ar commented 1 year ago

Hi! I'm happy to have discovered this project, as it is nearly what I was thinking of building myself. I'm learning a foreign language, and a tool that can convert audio to text and display the words while playing the audio is immensely useful for language learning.

The only problem is that the UI gets slow and unresponsive when I load a 10m long clip. Initially I tried loading a 1h long podcast, but the program didn't seem to handle it well, unfortunately, so I had to split it. I was thinking 10m is a reasonable size. Or is it?

P.S. Your public evergreen notes are awesome, thanks for sharing them!

geekodour commented 1 year ago

Hi! Thanks for the kind words and for letting me know about the issue. 10m or 1h should not be a problem: I've tested it on Linux with Mozilla Firefox 114.0.2, and it works fine in either case.

Can you tell me more about which browser you're running it on? I can possibly try replicating it and hopefully fix the issue for you :)

On the other hand, I discovered the following today which might be more helpful to you than wscribe! Check the following out. (I've also linked them in the readme, https://github.com/geekodour/wscribe-editor#related-projects)

geekodour commented 1 year ago

Also can you share the 10m media file itself?

balta2ar commented 1 year ago

I'm using Chrome 115.0.5790.110, but I must admit I have quite a number of tabs open... That said, the 1h file never finished loading at all.

I've put media files with transcriptions here, 1h and several 10m segments: https://disk.yandex.com/d/G6ZKhnAnMZbCoA

Thank you for the recommendation, I'll have a look! After I found wscribe, I also looked into subtitle editors on Linux, thinking maybe that's what I need. But the ones I tried (gaupol, aegisub, subtitleedit) were not very ergonomic and didn't seem to support word-level navigation/rewind. I even tried integrating Google Docs with VLC by monitoring the clipboard, checking whether a timestamp-looking word was currently selected, and then seeking to that timestamp in VLC, e.g.:

17m21s 29 kilometer i timen, det er jo litt vildt,
17m24s hvis du hadde rent skolten din,

but that approach also had its limitations.
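
For illustration, the timestamp-detection step of that workaround could look roughly like the sketch below. It is not the script that was actually used: the function names `parse_timestamp` and `leading_timestamp` are hypothetical, the `17m21s` format is inferred from the two sample lines above, and the clipboard monitoring and the VLC seek call are left out entirely.

```python
import re
from typing import Optional

# Matches timestamp-looking words such as "17m21s", "1h02m03s" or "45s".
# The format is an assumption based on the two sample lines above.
TIMESTAMP_RE = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_timestamp(word: str) -> Optional[int]:
    """Return the offset in seconds, or None if `word` is not a timestamp."""
    match = TIMESTAMP_RE.fullmatch(word.strip())
    # Every part of the pattern is optional, so an empty string also
    # matches; reject it by requiring at least one captured group.
    if match is None or not any(match.groups()):
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def leading_timestamp(line: str) -> Optional[int]:
    """Parse the first whitespace-separated word of a transcript line."""
    words = line.split(maxsplit=1)
    return parse_timestamp(words[0]) if words else None
```

The resulting number of seconds would then be handed to whatever seek mechanism the player exposes; ordinary words return `None` and are ignored.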

geekodour commented 1 year ago

Okay, I checked this, and it seems to be a problem specific to Chrome. In Firefox it works fine. I think it's mostly an issue related to loading/decoding the audio waveform. I'll look into this. Thanks for sending the details!

EngageWisdom commented 1 year ago

Another great option is Gentle: https://github.com/lowerquality/gentle

They have a full native GUI app for macOS as well: https://github.com/lowerquality/gentle/releases/tag/0.11.0

Here's a demo: https://lowerquality.com/gentle/