lowerquality / gentle

gentle forced aligner
https://lowerquality.com/gentle/
MIT License

Long files crash Gentle DMG #82

Open strob opened 8 years ago

strob commented 8 years ago

Reported on Twitter by @jarm:

struggling with Gentle using large files (window dies), should I chop them up?

same again with just audio (80MB; 2hrs47m). I tried with smaller clip (8.7MB;4m) and that works fine.

seems to hang when transcription finishes and page layout changes

I wasn't watching moments crashes happened, but both seem to be when transcription ends. this is w/ offline version btw.

strob commented 8 years ago

I would like to know:

./Applications/gentle.app/Contents/MacOS/gentle

do you see any errors?

Thanks!

natelawrence commented 8 years ago

A few notes from a primarily Windows user: My belief is that the filesize is the critical factor, rather than any specific duration of the media (as your RAM questions suggest).

When the hosted version at gentle-demo.lowerquality.com is up:

When I run the DMG local version on a family member's Mac (4GB iMac 20" Early 2009 running MacOS 10.11.5):

When I use the Docker container on Windows (4GB laptop):

I will report back when I get a chance to observe what RAM usage is like during upload, but I do not expect it to be very high.

The process of transcription, however (by which I assume we mean the step with the progress bar directly following ffmpeg's encoding), runs pitifully slowly on both of my local machines. On Windows, Docker's VirtualBox instance consumes only about 25% of my CPU but as much RAM as it can obtain, which gives me the impression that Gentle needs a fair bit of the audio loaded into RAM at a time to do any work.

strob commented 8 years ago

Thanks for this detailed report!

I'm not confident that the upload behaves sanely. The server is written in Twisted, which has known problems with large uploads. That is, an upload may use 2-3x the file's size in RAM. Would be great to try this.

As for the transcription, that's a "fixed cost" with RAM, but it's large. Running Gentle from the command line exposes a --ntranscriptionthreads argument, which defaults to two; each transcription thread could use 2-3 GB of RAM.

The audio file is not loaded into RAM during transcription.
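To put rough numbers on the two costs described above, here is a back-of-the-envelope sketch. It only encodes the figures quoted in this thread (2-3x the file size during upload, 2-3 GB per transcription thread, two threads by default); the function name `estimate_ram_bytes` is made up for illustration and is not part of Gentle.

```python
# Rough RAM estimate using only the figures quoted in this thread.
# `estimate_ram_bytes` is a hypothetical helper, not part of Gentle.

MB = 1024 ** 2
GB = 1024 ** 3


def estimate_ram_bytes(file_size_bytes, n_transcription_threads=2):
    """Return (low, high) byte estimates for the upload and transcription steps."""
    # Upload: the server may hold 2-3x the file's size in RAM.
    upload = (2 * file_size_bytes, 3 * file_size_bytes)
    # Transcription: a fixed cost of 2-3 GB per thread, independent of file size.
    transcription = (
        2 * GB * n_transcription_threads,
        3 * GB * n_transcription_threads,
    )
    return {"upload": upload, "transcription": transcription}


if __name__ == "__main__":
    est = estimate_ram_bytes(80 * MB)  # the 80MB file from the report
    lo, hi = est["upload"]
    print(f"upload: {lo / MB:.0f}-{hi / MB:.0f} MB")
    lo, hi = est["transcription"]
    print(f"transcription: {lo / GB:.1f}-{hi / GB:.1f} GB")
```

For the 80MB file this gives roughly 160-240 MB for the upload and 4-6 GB for transcription with the default two threads, so on a 4 GB machine the transcription step alone can exceed physical RAM.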

Using Gentle for full transcription on a machine with 4 GB of RAM is unlikely to ever be a great experience, but I would like for the uploading to work better than it currently does. That may mean using twisted-largerrequest, as suggested above, or it might mean implementing a chunked uploader.
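For what a chunked uploader might look like at its core, here is a minimal sketch of just the chunking and reassembly logic. It is not Gentle's code and involves no HTTP or Twisted; `iter_chunks` and `reassemble` are hypothetical names, and the 1 MiB chunk size is an arbitrary assumption. The point is that neither side ever needs more than one chunk in memory.

```python
import io

# Hypothetical sketch of chunked transfer; not Gentle's actual uploader.
CHUNK_SIZE = 1024 * 1024  # 1 MiB, an arbitrary assumption


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield (offset, data) pairs, reading at most chunk_size bytes at a time."""
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield offset, data
        offset += len(data)


def reassemble(chunks, out):
    """Write each chunk at its offset, so only one chunk is held in RAM at once."""
    for offset, data in chunks:
        out.seek(offset)
        out.write(data)


if __name__ == "__main__":
    source = io.BytesIO(b"a" * 2500)  # stands in for a large audio file
    target = io.BytesIO()             # stands in for the server-side file
    reassemble(iter_chunks(source, chunk_size=1000), target)
    print(target.getvalue() == source.getvalue())
```

In a real uploader each `(offset, data)` pair would travel in its own request and the server would write it straight to disk, which keeps memory use bounded by the chunk size rather than the file size.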