mifi / lossless-cut

The swiss army knife of lossless video/audio editing
https://losslesscut.app/
GNU General Public License v2.0

Audio Waveforms Missing in DaVinci Resolve #1651

Open Checkmarks opened 1 year ago

Checkmarks commented 1 year ago

Operating System

Windows 11

Steps to reproduce

When I trim a clip in LosslessCut and then import that clip into DaVinci Resolve, the audio waveforms sometimes do not show up from the start of the clip. The audio can still be heard; only the waveform display is missing.

In my troubleshooting, the audio waveforms only show up fully in DaVinci Resolve when the segment's start time, in seconds, is divisible by 8. The end time does not affect the result; only the start time does. If the start time is not divisible by 8, DaVinci Resolve does not show the audio waveforms until the 20-second mark of the imported clip. This means that if the clip is 20 seconds or shorter and its start time is not divisible by 8, no audio waveforms are shown at all.
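
The pattern described above can be written down as a small predicate. This is only a model of the reported observation, not of how DaVinci Resolve works internally; the function and constant names are made up for this sketch.

```python
from typing import Optional

BLOCK_SECONDS = 8         # reported: start times on this grid show full waveforms
LATE_ONSET_SECONDS = 20   # reported: otherwise waveforms only appear from here on


def waveform_onset(start_seconds: float, clip_duration: float) -> Optional[float]:
    """Return the clip-relative time at which waveforms are expected to appear.

    Returns None when, according to the report, no waveform would be shown at all.
    """
    if start_seconds % BLOCK_SECONDS == 0:
        return 0.0  # start is a multiple of 8: waveforms from the beginning
    if clip_duration > LATE_ONSET_SECONDS:
        return float(LATE_ONSET_SECONDS)  # waveforms appear only after 20 s
    return None  # clip is 20 s or shorter: nothing to show
```

For the two exports in the logs below, `waveform_onset(520, 37.467)` gives `0.0` and `waveform_onset(518, 39.467)` gives `20.0`.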

Expected behavior

Audio waveforms should show up in DaVinci Resolve regardless of the segment's start time.

Actual behavior

Audio waveforms only show up fully in DaVinci Resolve if the segment's start time is divisible by 8 seconds. If it is not, the audio waveforms only show up after the 20-second mark.

Share log

Start Segment Divisible by 8 seconds:

index-2c6483e9.js:307 outSegTemplateOrDefault ${FILENAME}-${CUT_FROM}-${CUT_TO}${SEG_SUFFIX}${EXT}
index-2c6483e9.js:176 customTagsByFile {}
index-2c6483e9.js:176 customTagsByStreamId {}
index-2c6483e9.js:176 Cutting from 520 to end
index-2c6483e9.js:169 ffmpeg -hide_banner -ss '520.00000' -i 'C:\Captures\ExampleClip.mkv' -avoid_negative_ts make_zero -map '0:0' '-c:0' copy -map '0:1' '-c:1' copy -map '0:2' '-c:2' copy -map '0:3' '-c:3' copy -map '0:4' '-c:4' copy -map '0:5' '-c:5' copy -map_metadata 0 -movflags '+faststart' -default_mode infer_no_subs -ignore_unknown -f matroska -y 'C:\Captures\ExampleClip-00.08.40.000-00.09.17.467.mkv'
index-2c6483e9.js:169 STDERR:
index-2c6483e9.js:169 [libaom-av1 @ 000001ee5c3102c0] 3.5.0-242-g6ed0c7a32
Guessed Channel Layout for Input Stream #0.1 : stereo
Guessed Channel Layout for Input Stream #0.2 : stereo
Guessed Channel Layout for Input Stream #0.3 : stereo
Guessed Channel Layout for Input Stream #0.4 : stereo
Guessed Channel Layout for Input Stream #0.5 : stereo
Input #0, matroska,webm, from 'C:\Captures\ExampleClip.mkv':
  Metadata:
    ENCODER         : Lavf60.3.100
  Duration: 00:09:17.47, start: 0.000000, bitrate: 61582 kb/s
  Stream #0:0: Video: av1 (Main), yuv420p(tv, bt709), 2560x1440 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 1k tbn
    Metadata:
      DURATION        : 00:09:17.467000000
  Stream #0:1: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : All (Exluding Desktop Audio)
      DURATION        : 00:09:17.461000000
  Stream #0:2: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Game Audio
      DURATION        : 00:09:17.440000000
  Stream #0:3: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Microphone Input Audio
      DURATION        : 00:09:17.440000000
  Stream #0:4: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Voice Chat Audio
      DURATION        : 00:09:17.440000000
  Stream #0:5: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Desktop Audio
      DURATION        : 00:09:17.440000000
Output #0, matroska, to 'C:\Captures\ExampleClip-00.08.40.000-00.09.17.467.mkv':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #0:0: Video: av1 (Main) (AV01 / 0x31305641), yuv420p(tv, bt709), 2560x1440 [SAR 1:1 DAR 16:9], q=2-31, 60 fps, 60 tbr, 1k tbn
    Metadata:
      DURATION        : 00:09:17.467000000
  Stream #0:1: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s (default)
    Metadata:
      title           : All (Exluding Desktop Audio)
      DURATION        : 00:09:17.461000000
  Stream #0:2: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Game Audio
      DURATION        : 00:09:17.440000000
  Stream #0:3: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Microphone Input Audio
      DURATION        : 00:09:17.440000000
  Stream #0:4: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Voice Chat Audio
      DURATION        : 00:09:17.440000000
  Stream #0:5: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Desktop Audio
      DURATION        : 00:09:17.440000000
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
  Stream #0:3 -> #0:3 (copy)
  Stream #0:4 -> #0:4 (copy)
  Stream #0:5 -> #0:5 (copy)
Press [q] to stop, [?] for help
frame=    1 fps=0.0 q=-1.0 size=       1kB time=00:00:00.00 bitrate=10120.0kbits/s speed=N/A    
frame= 2248 fps=0.0 q=-1.0 Lsize=  180725kB time=00:00:37.46 bitrate=39521.2kbits/s speed= 166x    
video:110438kB audio:70208kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.044240%

Start Segment Not Divisible by 8 seconds:

index-2c6483e9.js:307 outSegTemplateOrDefault ${FILENAME}-${CUT_FROM}-${CUT_TO}${SEG_SUFFIX}${EXT}
index-2c6483e9.js:176 customTagsByFile {}
index-2c6483e9.js:176 customTagsByStreamId {}
index-2c6483e9.js:176 Cutting from 518 to end
index-2c6483e9.js:169 ffmpeg -hide_banner -ss '518.00000' -i 'C:\Captures\ExampleClip.mkv' -avoid_negative_ts make_zero -map '0:0' '-c:0' copy -map '0:1' '-c:1' copy -map '0:2' '-c:2' copy -map '0:3' '-c:3' copy -map '0:4' '-c:4' copy -map '0:5' '-c:5' copy -map_metadata 0 -movflags '+faststart' -default_mode infer_no_subs -ignore_unknown -f matroska -y 'C:\Captures\ExampleClip-00.08.38.000-00.09.17.467.mkv'
index-2c6483e9.js:169 STDERR:
index-2c6483e9.js:169 [libaom-av1 @ 000001f55ecd02c0] 3.5.0-242-g6ed0c7a32
Guessed Channel Layout for Input Stream #0.1 : stereo
Guessed Channel Layout for Input Stream #0.2 : stereo
Guessed Channel Layout for Input Stream #0.3 : stereo
Guessed Channel Layout for Input Stream #0.4 : stereo
Guessed Channel Layout for Input Stream #0.5 : stereo
Input #0, matroska,webm, from 'C:\Captures\ExampleClip.mkv':
  Metadata:
    ENCODER         : Lavf60.3.100
  Duration: 00:09:17.47, start: 0.000000, bitrate: 61582 kb/s
  Stream #0:0: Video: av1 (Main), yuv420p(tv, bt709), 2560x1440 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 1k tbn
    Metadata:
      DURATION        : 00:09:17.467000000
  Stream #0:1: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : All (Exluding Desktop Audio)
      DURATION        : 00:09:17.461000000
  Stream #0:2: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Game Audio
      DURATION        : 00:09:17.440000000
  Stream #0:3: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Microphone Input Audio
      DURATION        : 00:09:17.440000000
  Stream #0:4: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Voice Chat Audio
      DURATION        : 00:09:17.440000000
  Stream #0:5: Audio: pcm_f32le, 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Desktop Audio
      DURATION        : 00:09:17.440000000
Output #0, matroska, to 'C:\Captures\ExampleClip-00.08.38.000-00.09.17.467.mkv':
  Metadata:
    encoder         : Lavf59.27.100
  Stream #0:0: Video: av1 (Main) (AV01 / 0x31305641), yuv420p(tv, bt709), 2560x1440 [SAR 1:1 DAR 16:9], q=2-31, 60 fps, 60 tbr, 1k tbn
    Metadata:
      DURATION        : 00:09:17.467000000
  Stream #0:1: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s (default)
    Metadata:
      title           : All (Exluding Desktop Audio)
      DURATION        : 00:09:17.461000000
  Stream #0:2: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Game Audio
      DURATION        : 00:09:17.440000000
  Stream #0:3: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Microphone Input Audio
      DURATION        : 00:09:17.440000000
  Stream #0:4: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Voice Chat Audio
      DURATION        : 00:09:17.440000000
  Stream #0:5: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 48000 Hz, stereo, flt, 3072 kb/s
    Metadata:
      title           : Desktop Audio
      DURATION        : 00:09:17.440000000
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
  Stream #0:3 -> #0:3 (copy)
  Stream #0:4 -> #0:4 (copy)
  Stream #0:5 -> #0:5 (copy)
Press [q] to stop, [?] for help
frame=    1 fps=0.0 q=-1.0 size=       1kB time=00:00:00.00 bitrate=10120.0kbits/s speed=N/A    
frame= 2368 fps=0.0 q=-1.0 Lsize=  190486kB time=00:00:39.46 bitrate=39544.3kbits/s speed= 165x    
video:116474kB audio:73928kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.044164%
mifi commented 1 year ago

did you try these steps? https://github.com/mifi/lossless-cut/blob/master/issues.md#the-exported-video-has-a-problem does it make any difference?

Checkmarks commented 1 year ago

> did you try these steps? https://github.com/mifi/lossless-cut/blob/master/issues.md#the-exported-video-has-a-problem does it make any difference?

Yes, absolutely, I tried them and it does not make a difference. It's a really interesting issue, and it's one that I can work around by making sure that the start segment is divisible by 8 seconds, but it's still so bizarre and inconvenient.

mifi commented 1 year ago

> making sure that the start segment is divisible by 8 seconds

do you mean "start" time of every segment is exactly divisible by 8 seconds? e.g.:

  • segment1 start = 8
  • segment2 start = 16
  • segment3 start = 24
  • etc?

does your video by any chance have keyframes exactly every 8 seconds?

Checkmarks commented 1 year ago

> > making sure that the start segment is divisible by 8 seconds
>
> do you mean "start" time of every segment is exactly divisible by 8 seconds? e.g.:
>
>   • segment1 start = 8
>   • segment2 start = 16
>   • segment3 start = 24
>   • etc?
>
> does your video by any chance have keyframes exactly every 8 seconds?

The video has a keyframe interval of exactly 2 seconds. What I mean by the "start segment" is the marker set by "Start current segment at current time". This is the feature in LosslessCut that lets you select the start time and end time of the clip you want to export. When I seek to the next or previous keyframe, it jumps exactly 2 seconds in either direction, so that is working well. The issue is that when I export with a segment start time that is not divisible by 8 seconds, the audio waveforms do not show up in DaVinci Resolve.

So for example, these are the reproduction steps:

  1. Open a video file in LosslessCut. The video I'm opening has a total duration of 8:58.794.
  2. Let's say I want to create a clip from this footage, and that the clip I want is in the middle of the video.
  3. I use the sliders to navigate to the position where I want my new clip to start. This puts me at 00:04:37.159.
  4. I then press the left KEY icon to seek to the previous keyframe. This puts me at 00:04:36.000.
  5. I then press the left POINT icon to mark the segment start at the current time (00:04:36.000).
  6. I use the sliders to navigate to the position where I want my new clip to end. This puts me at 00:06:18.632.
  7. I then press the right KEY icon to seek to the next keyframe. This puts me at 00:06:20.000.
  8. I then press the right POINT icon to mark the segment end at the current time (00:06:20.000).
  9. In the top right of the application, I can see that I have the following segment to export: 00:04:36.000 - 00:06:20.000, with a duration of 1:44, 104000ms, 6240 frames.
  10. Pressing export at this point, and then opening the exported file in DaVinci Resolve, will not show the waveforms in the editor, since the segment's start time (00:04:36.000) is not divisible by 8. We know this because 00:04:36.000 translates to (4 * 60) + 36 = 276 seconds, and 276 divided by 8 is 34.5.
  11. To make the waveforms show up in the DaVinci Resolve editor, we have to select the same segment (00:04:36.000 - 00:06:20.000) and then press the left KEY icon to go back to the previous keyframe from 00:04:36.000. This puts us at 00:04:34.000, which translates to (4 * 60) + 34 = 274 seconds, and 274 divided by 8 is 34.25. Still no good: if I were to export this and import it into DaVinci Resolve, I would get the same missing waveforms. So we press the left KEY icon again to go to the previous keyframe. This takes us to 00:04:32.000, which is (4 * 60) + 32 = 272 seconds, and 272 divided by 8 is exactly 34. Since 272 is evenly divisible by 8, we can now mark this position as the segment start, export the clip, and import it into DaVinci Resolve. The waveforms will show up and everything works as expected.
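
The arithmetic in steps 10 and 11 can be sketched as follows. The 2-second keyframe interval and the 8-second grid come from the description above; the helper names are hypothetical and the snippet only mirrors the manual workaround, it is not part of LosslessCut.

```python
def timestamp_to_seconds(timestamp: str) -> float:
    """Convert an 'HH:MM:SS.mmm' timestamp to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def snap_back(start_seconds: float,
              keyframe_interval: float = 2.0,
              grid: float = 8.0) -> float:
    """Step back one keyframe at a time until the start is a multiple of `grid`.

    Assumes keyframes sit at exact multiples of `keyframe_interval`,
    as reported for this particular recording.
    """
    # Start from the keyframe at or before the requested position.
    position = (start_seconds // keyframe_interval) * keyframe_interval
    # Equivalent to pressing the left KEY icon repeatedly.
    while position > 0 and position % grid != 0:
        position -= keyframe_interval
    return max(position, 0.0)


start = timestamp_to_seconds("00:04:36.000")  # (4 * 60) + 36 = 276
print(start / 8)          # 34.5, so 276 is not on the 8-second grid
print(snap_back(start))   # 272.0, i.e. 00:04:32.000
```

The cost of the workaround is up to 6 extra seconds at the start of each clip, since the start can only move backwards along the keyframes.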

I hope this clears things up. Please let me know if you need more info or have additional questions!

mifi commented 1 year ago

That's very interesting. I'm wondering why the number 8 (other than it being the most desirable number in many Asian cultures 😄). TBH I really have no idea what's causing this. Probably only a select few developers on DaVinci's dev team know. But it's an interesting observation anyway, to correlate with similar problems in the future.

Checkmarks commented 1 year ago

So you believe that this is a DaVinci issue and not a LosslessCut issue? I'm thinking that it's a LosslessCut issue that is corrupting the waveforms.

mifi commented 1 year ago

I think it's probably a combination, since the audio is still playable. It could be that LosslessCut/ffmpeg produces a file with the audio coded in a particular way that triggers a bug/condition in DaVinci that causes the waveforms to disappear, even though the audio is still clearly playable.
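
One way to look into this would be to compare the first audio packets of a working and a non-working export with ffprobe. The sketch below is only a guess at where a difference might show up (first packet timestamps and sizes), not a confirmed diagnosis. The helper names are hypothetical, and it assumes `ffprobe` is on the PATH.

```python
import json
import subprocess
from typing import List, Optional, Tuple

Packet = Tuple[Optional[float], Optional[float], int]  # (pts, dts, size in bytes)


def first_packets_command(path: str, stream: str = "a:0", count: int = 5) -> List[str]:
    """Build an ffprobe command that lists the first `count` packets of one stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", stream,           # a:0 = first audio stream
        "-read_intervals", f"%+#{count}",    # read only the first `count` packets
        "-show_entries", "packet=pts_time,dts_time,size",
        "-of", "json",
        path,
    ]


def _to_float(value) -> Optional[float]:
    return None if value is None else float(value)


def parse_packets(ffprobe_json: str) -> List[Packet]:
    """Turn ffprobe's JSON output into (pts, dts, size) tuples."""
    packets = json.loads(ffprobe_json).get("packets", [])
    return [
        (_to_float(p.get("pts_time")), _to_float(p.get("dts_time")), int(p.get("size", 0)))
        for p in packets
    ]


def probe_first_packets(path: str, stream: str = "a:0", count: int = 5) -> List[Packet]:
    """Run ffprobe on `path` and return its first packets for the given stream."""
    result = subprocess.run(
        first_packets_command(path, stream, count),
        capture_output=True, text=True, check=True,
    )
    return parse_packets(result.stdout)
```

Calling `probe_first_packets` on the two exported files from the logs (`ExampleClip-00.08.40.000-00.09.17.467.mkv` and `ExampleClip-00.08.38.000-00.09.17.467.mkv`) and comparing the results would show whether the audio in the non-working file starts at a different timestamp or with differently sized packets.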