sachac / subed

subed is a subtitle editor for Emacs
177 stars 15 forks source link

[bug] Content of the waveform for the last subtitle is not properly centered #68

Open rodrigomorales1 opened 2 weeks ago

rodrigomorales1 commented 2 weeks ago

How I reproduced the bug

I created a sample audio file /tmp/subed/a.mp3 using the command shown below. The audio plays beeps at times 00:01, 00:03, 00:05 and 00:07 and 00:09. The screenshot shown below shows the waveform in Audacity version 2.4.2.

$ ffmpeg \
  -v error \
  -y \
  -f lavfi -i "sine=frequency=1000:duration=10" \
  -af "volume=0:enable='between(t,0,1)',\
volume=0:enable='between(t,1.2,3)',\
volume=0:enable='between(t,3.2,5)',\
volume=0:enable='between(t,5.2,7)',\
volume=0:enable='between(t,7.2,9)',\
volume=0:enable='between(t,9.2,10)'" \
  /tmp/subed/a.mp3

image

I inserted the content of the code block below into the file /tmp/subed/a.srt.

1
00:00:00,000 --> 00:00:02,000
This is subtitle no. 1

2
00:00:02,000 --> 00:00:04,000
This is subtitle no. 2

3
00:00:04,000 --> 00:00:06,000
This is subtitle no. 3

4
00:00:06,000 --> 00:00:08,000
This is subtitle no. 4

5
00:00:08,000 --> 00:00:10,000
This is subtitle no. 5

I cloned the subed git repository to make sure that I was using the latest changes in subed.

$ git clone --branch main --depth 1 https://github.com/sachac/subed /tmp/subed/subed-repo

I built GNU Emacs 29.3 from source. For convenience to reproduce the bug, I wrote the following command to start Emacs, visit the file /tmp/subed/a.srt and enable subed-waveform-minor-mode.

$ EMACSLOADPATH= EMACSNATIVELOADPATH= ./src/emacs \
             --quick \
             --no-splash \
             --eval "(add-to-list 'load-path \"/tmp/subed/subed-repo/subed/\")" \
             --eval "(require 'subed-srt)" \
             --eval "(add-hook 'subed-mode-hook 'subed-enable-sync-player-to-point)" \
             --eval "(add-hook 'subed-mode-hook 'subed-enable-loop-over-current-subtitle)" \
             --eval "(find-file \"/tmp/subed/a.srt\")" \
             --eval "(subed-waveform-minor-mode 1)"

When I executed the command, an Emacs window and a mpv window were opened and the waveform for the first subtitle (the subtitle the point was on) was shown. See screenshot below.

image

I moved the point to the last subtitle (i.e. This is subtitle no. 5).

What happened?

The waveform image which was inserted into the buffer showed the waveform for the beep further to the right than its actual position. See screenshot below.

image

The screenshot shown below shows the buffer after calling subed-waveform-toggle-show-all. Note that the waveform for the beep is centered for all subtitles except for the last one.

image

What should have happened?

The waveform for the beep in the last subtitle should be shown at the middle of the image since the beep is approximately played at 00:09 and the subtitle spans from 00:08 to 00:10.

System information

ELISP> (version)
"GNU Emacs 29.3 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.33, cairo version 1.16.0)
 of 2024-06-20"
$ ffmpeg -version                                                           
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil      56. 70.100 / 56. 70.100
libavcodec     58.134.100 / 58.134.100
libavformat    58. 76.100 / 58. 76.100
libavdevice    58. 13.100 / 58. 13.100
libavfilter     7.110.100 /  7.110.100
libswscale      5.  9.100 /  5.  9.100
libswresample   3.  9.100 /  3.  9.100
libpostproc    55.  9.100 / 55.  9.100
sachac commented 2 weeks ago

Thanks for a thorough investigation! I wonder what ffmpeg does to make the image and how to adjust the math if it's not quite what we expect... It might take me a while to get to this (limited computer time), so if you or anyone happen to be able to fix it, that would be amazing.

On Thu, Jun 20, 2024, 00:48 Rodrigo Morales @.***> wrote:

How I reproduced the bug

I created a sample audio file /tmp/subed/a.mp3 using the command shown below. The audio plays beeps at times 00:01, 00:03, 00:05 and 00:07 and 00:09. The screenshot shown below shows the waveform in Audacity version 2.4.2.

ffmpeg \ -v error \ -y \ -f lavfi -i "sine=frequency=1000:duration=10" \ -af "volume=0:enable='between(t,0,1)',\volume=0:enable='between(t,1.2,3)',\volume=0:enable='between(t,3.2,5)',\volume=0:enable='between(t,5.2,7)',\volume=0:enable='between(t,7.2,9)',\volume=0:enable='between(t,9.2,10)'" \ /tmp/subed/a.mp3

I inserted the content of the code block below into the file /tmp/subed/a.srt.

1 00:00:00,000 --> 00:00:02,000 This is the subtitle no. 1

2 00:00:02,000 --> 00:00:04,000 This is the subtitle no. 2

3 00:00:04,000 --> 00:00:06,000 This is the subtitle no. 3

4 00:00:06,000 --> 00:00:08,000 This is the subtitle no. 4

5 00:00:08,000 --> 00:00:10,000 This is the subtitle no. 5

I inserted the content of the code block below to ~/.config/emacs/init.el.

(use-package subed :hook ((subed-mode . subed-enable-sync-player-to-point) (subed-mode . subed-enable-loop-over-current-subtitle)))

I launched Emacs by executing the command shown below.

emacs --no-splash /tmp/subed/a.srt

I pressed M-x subed-waveform-minor-mode RET.

I moved the point to the last subtitle (i.e. subtitle no. 5)

What happened?

The waveform image which was inserted into the buffer showed the waveform for the beep further to the right than its actual position. See screenshot below.

What should have happened?

The waveform for the beep should be shown at the middle of the waveform image because the beep is approximately played at 00:09 and the subtitle spans from 00:08 to 00:10.

— Reply to this email directly, view it on GitHub https://github.com/sachac/subed/issues/68, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACD7EQZO5ELGMTUIRDSCH3ZIJNKBAVCNFSM6AAAAABJTFJF4GVHI2DSMVQWIX3LMV43ASLTON2WKOZSGM3DGNJQGA2DGNQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

rodrigomorales1 commented 2 weeks ago

I investigated a little bit more on this issue. Here's a note to myself or others that might be interested to fix this in the future.

The problem is that the parameter -to is passed a timestamp that is greater than the duration of the media file. ffmpeg handles this by stretching the waveforms for which audio does exist. Therefore, the stretching in the waveform for the last subtitle is greater than the other waveforms, but the bars in the waveform are placed to highlight the segment in the middle.

I found out this by adding some (message ...) statements in the functions subed-waveform--from-file and subed-waveform--update-bars.

$ cd /tmp/subed-repo && git diff
diff --git a/subed/subed-waveform.el b/subed/subed-waveform.el
index abb5a98..7daf670 100644
--- a/subed/subed-waveform.el
+++ b/subed/subed-waveform.el
@@ -269,6 +269,7 @@ and HEIGHT are dimensions in pixels."
             "-frames:v" "1"
             "-c:v" "png"
             "-f" "image2" "-"))))
+    (message "args: %s" args)
     (with-temp-buffer
       (apply 'call-process subed-waveform-ffmpeg-executable nil t nil args)
       (encode-coding-string (buffer-string) 'binary))))
@@ -373,6 +374,8 @@ If POSITION is nil, remove the bar."
                     stop
                     (overlay-get overlay 'waveform-start)
                     (overlay-get overlay 'waveform-stop))))
+    (message "start-pos: %s" start-pos)
+    (message "stop-pos: %s" stop-pos)
     (subed-waveform--move-bar :start start-pos overlay)
     (subed-waveform--move-bar :stop stop-pos overlay))
   (subed-waveform--update-current-bar subed-mpv-playback-position overlay))
#+end_example

The following code block contains the contents of the buffer *Messages* after opening the file /tmp/subed/a.srt (see my first message in this issue) and moving the point through all subtitles.

For information about GNU Emacs and the GNU system, type C-h C-a.
a.srt has auto save data; consider M-x recover-this-file
Looping over 00:00:00,000 - 00:00:02,000
Enabled looping over current subtitle
Enabled syncing playback position to point
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:04.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 0.00%
stop-pos: 50.00%
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:04.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 0.00%
stop-pos: 50.00%
Looping over 00:00:00,000 - 00:00:02,000
Waiting for connection...
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:06.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:02,000 - 00:00:04,000
args: (-accurate_seek -ss 00:00:02.000 -to 00:00:08.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:04,000 - 00:00:06,000
args: (-accurate_seek -ss 00:00:04.000 -to 00:00:10.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:06,000 - 00:00:08,000
args: (-accurate_seek -ss 00:00:06.000 -to 00:00:12.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:08,000 - 00:00:10,000

Note that in the information shown above Looping over 00:00:08,000 - 00:00:10,000, the waveform is created for the time 00:00:06.000-00:00:12.000 even though the media file /tmp/subed/a.jpg is 10 seconds long (see first message to learn how to create this file). Since ffmpeg tries to build a waveform from 00:00:06.000 to 00:00:12.000, but there are only waves between 00:00:06.000-00:00:10.000, the waves will stretch to fit the dimension showwavespic=s=648x46. Ideally, ffmpeg should set start-pos and stop-pos to set the bars accordingly when the waveforms have been stretched.

sachac commented 6 hours ago

Thinking out loud: we could use ffprobe to get the actual duration of the file, and then we can adjust the stop-ms to consider that duration, possibly adjusting the width of the image as well. We'll probably want to cache that duration so that we're not calculating it all the time, so that would be a good buffer-local variable.

rndusr commented 5 hours ago

Since I have the appropriate ffprobe command at hand, here it is:

ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 video.mkv

That should print the duration in seconds as a float.

sachac commented 2 hours ago

@rodrigomorales1 Would it be easy for you to test the https://github.com/sachac/subed/tree/waveform-long-cue branch, or would you like me to merge it into the main branch?

rodrigomorales1 commented 2 hours ago

@sachac I'll give that branch a try, write some tests and make a pull request on that branch.