Open rodrigomorales1 opened 2 weeks ago
Thanks for a thorough investigation! I wonder what ffmpeg does to make the image and how to adjust the math if it's not quite what we expect... It might take me a while to get to this (limited computer time), so if you or anyone happen to be able to fix it, that would be amazing.
On Thu, Jun 20, 2024, 00:48 Rodrigo Morales @.***> wrote:
How I reproduced the bug
I created a sample audio file /tmp/subed/a.mp3 using the command shown below. The audio plays beeps at times 00:01, 00:03, 00:05 and 00:07 and 00:09. The screenshot shown below shows the waveform in Audacity version 2.4.2.
ffmpeg \ -v error \ -y \ -f lavfi -i "sine=frequency=1000:duration=10" \ -af "volume=0:enable='between(t,0,1)',\volume=0:enable='between(t,1.2,3)',\volume=0:enable='between(t,3.2,5)',\volume=0:enable='between(t,5.2,7)',\volume=0:enable='between(t,7.2,9)',\volume=0:enable='between(t,9.2,10)'" \ /tmp/subed/a.mp3
I inserted the content of the code block below into the file /tmp/subed/a.srt.
1 00:00:00,000 --> 00:00:02,000 This is the subtitle no. 1
2 00:00:02,000 --> 00:00:04,000 This is the subtitle no. 2
3 00:00:04,000 --> 00:00:06,000 This is the subtitle no. 3
4 00:00:06,000 --> 00:00:08,000 This is the subtitle no. 4
5 00:00:08,000 --> 00:00:10,000 This is the subtitle no. 5
I inserted the content of the code block below to ~/.config/emacs/init.el.
(use-package subed :hook ((subed-mode . subed-enable-sync-player-to-point) (subed-mode . subed-enable-loop-over-current-subtitle)))
I launched Emacs by executing the command shown below.
emacs --no-splash /tmp/subed/a.srt
I pressed M-x subed-waveform-minor-mode RET.
I moved the point to the last subtitle (i.e. subtitle no. 5)
What happened?
The waveform image which was inserted into the buffer showed the waveform for the beep further to the right than its actual position. See screenshot below.
What should have happened?
The waveform for the beep should be shown at the middle of the waveform image because the beep is approximately played at 00:09 and the subtitle spans from 00:08 to 00:10.
— Reply to this email directly, view it on GitHub https://github.com/sachac/subed/issues/68, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACD7EQZO5ELGMTUIRDSCH3ZIJNKBAVCNFSM6AAAAABJTFJF4GVHI2DSMVQWIX3LMV43ASLTON2WKOZSGM3DGNJQGA2DGNQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>
I investigated a little bit more on this issue. Here's a note to myself or others that might be interested to fix this in the future.
The problem is that the parameter -to
is passed a timestamp that is greater than the duration of the media file. ffmpeg
handles this by stretching the waveforms for which audio does exist. Therefore, the stretching in the waveform for the last subtitle is greater than the other waveforms, but the bars in the waveform are placed to highlight the segment in the middle.
I found out this by adding some (message ...)
statements in the functions subed-waveform--from-file
and subed-waveform--update-bars
.
$ cd /tmp/subed-repo && git diff
diff --git a/subed/subed-waveform.el b/subed/subed-waveform.el
index abb5a98..7daf670 100644
--- a/subed/subed-waveform.el
+++ b/subed/subed-waveform.el
@@ -269,6 +269,7 @@ and HEIGHT are dimensions in pixels."
"-frames:v" "1"
"-c:v" "png"
"-f" "image2" "-"))))
+ (message "args: %s" args)
(with-temp-buffer
(apply 'call-process subed-waveform-ffmpeg-executable nil t nil args)
(encode-coding-string (buffer-string) 'binary))))
@@ -373,6 +374,8 @@ If POSITION is nil, remove the bar."
stop
(overlay-get overlay 'waveform-start)
(overlay-get overlay 'waveform-stop))))
+ (message "start-pos: %s" start-pos)
+ (message "stop-pos: %s" stop-pos)
(subed-waveform--move-bar :start start-pos overlay)
(subed-waveform--move-bar :stop stop-pos overlay))
(subed-waveform--update-current-bar subed-mpv-playback-position overlay))
#+end_example
The following code block contains the contents of the buffer *Messages*
after opening the file /tmp/subed/a.srt
(see my first message in this issue) and moving the point through all subtitles.
For information about GNU Emacs and the GNU system, type C-h C-a.
a.srt has auto save data; consider M-x recover-this-file
Looping over 00:00:00,000 - 00:00:02,000
Enabled looping over current subtitle
Enabled syncing playback position to point
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:04.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 0.00%
stop-pos: 50.00%
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:04.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 0.00%
stop-pos: 50.00%
Looping over 00:00:00,000 - 00:00:02,000
Waiting for connection...
args: (-accurate_seek -ss 00:00:00.000 -to 00:00:06.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:02,000 - 00:00:04,000
args: (-accurate_seek -ss 00:00:02.000 -to 00:00:08.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:04,000 - 00:00:06,000
args: (-accurate_seek -ss 00:00:04.000 -to 00:00:10.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:06,000 - 00:00:08,000
args: (-accurate_seek -ss 00:00:06.000 -to 00:00:12.000 -i /tmp/subed/a.mp3 -loglevel 0 -filter_complex volume=2.0,showwavespic=s=648x46:colors=gray -frames:v 1 -c:v png -f image2 -)
start-pos: 33.33%
stop-pos: 66.67%
Looping over 00:00:08,000 - 00:00:10,000
Note that in the information shown above Looping over 00:00:08,000 - 00:00:10,000
, the waveform is created for the time 00:00:06.000-00:00:12.000 even though the media file /tmp/subed/a.jpg
is 10 seconds long (see first message to learn how to create this file). Since ffmpeg
tries to build a waveform from 00:00:06.000 to 00:00:12.000, but there are only waves between 00:00:06.000-00:00:10.000, the waves will stretch to fit the dimension showwavespic=s=648x46
. Ideally, ffmpeg
should set start-pos
and stop-pos
to set the bars accordingly when the waveforms have been stretched.
Thinking out loud: we could use ffprobe to get the actual duration of the file, and then we can adjust the stop-ms to consider that duration, possibly adjusting the width of the image as well. We'll probably want to cache that duration so that we're not calculating it all the time, so that would be a good buffer-local variable.
Since I have the appropriate ffprobe command at hand, here it is:
ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 video.mkv
That should print the duration in seconds as a float.
@rodrigomorales1 Would it be easy for you to test the https://github.com/sachac/subed/tree/waveform-long-cue branch, or would you like me to merge it into the main branch?
@sachac I'll give that branch a try, write some tests and make a pull request on that branch.
How I reproduced the bug
I created a sample audio file
/tmp/subed/a.mp3
using the command shown below. The audio plays beeps at times 00:01, 00:03, 00:05 and 00:07 and 00:09. The screenshot shown below shows the waveform in Audacity version 2.4.2.I inserted the content of the code block below into the file
/tmp/subed/a.srt
.I cloned the
subed
git repository to make sure that I was using the latest changes in subed.I built GNU Emacs 29.3 from source. For convenience to reproduce the bug, I wrote the following command to start Emacs, visit the file
/tmp/subed/a.srt
and enablesubed-waveform-minor-mode
.When I executed the command, an Emacs window and a
mpv
window were opened and the waveform for the first subtitle (the subtitle the point was on) was shown. See screenshot below.I moved the point to the last subtitle (i.e.
This is subtitle no. 5
).What happened?
The waveform image which was inserted into the buffer showed the waveform for the beep further to the right than its actual position. See screenshot below.
The screenshot shown below shows the buffer after calling
subed-waveform-toggle-show-all
. Note that the waveform for the beep is centered for all subtitles except for the last one.What should have happened?
The waveform for the beep in the last subtitle should be shown at the middle of the image since the beep is approximately played at 00:09 and the subtitle spans from 00:08 to 00:10.
System information