protyposis / AudioAlign

Audio Synchronization and Analysis Tool
GNU Affero General Public License v3.0

[REQ] MPEG transport stream support #15

Closed MarcoRavich closed 5 months ago

MarcoRavich commented 6 months ago

Hi there, we've just tried to sync some files from our - Canon - cams and discovered that MTS files aren't supported yet (AA goes into a deadlock state when such a file is dragged and dropped onto it).

It would be much better to avoid remuxing (which can cause problems, as well described here), since it forces AVCHD users to perform an unnecessary step.

Thanks in advance.

EDIT: of course @justdan96's tsMuxer may help...

protyposis commented 6 months ago

What is your use case and expected outcome? Could you provide a sample file?

MarcoRavich commented 6 months ago

Hi there, well the "use case" is multicam shooting of a live event.

Here are a couple of files from our - Canon - cameras to test the AA functionality: https://mega.nz/folder/NwFRyYzJ#MKWWK2E8Vp8H_RjxBknFXg

Thanks in advance.

protyposis commented 6 months ago

Assuming you synchronized your video files in Audio Align, what's your expected output? Are you aware that AA does not export video files?

MarcoRavich commented 6 months ago

Assuming you synchronized your video files in Audio Align, what's your expected output?

As said, our installation of AA goes into a deadlock state when we drop an MTS file on it, so syncing is not possible.

Are you aware that AA does not export video files?

We've successfully synced MP4 (a/v) files and exported to Vegas EDL: it works correctly.

Hope that helps.

protyposis commented 6 months ago

Thanks, so your use case is EDL export, in which case it makes sense to work directly with video files. I'll have to investigate whether MTS support can be added, but I can't give an estimate if and when this is going to happen.

A workaround is to extract the audio from the video files (preferably in .wav format), synchronize the audio files, and manually replace the audio file references with the video files in the exported EDL file.
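A minimal sketch of the extraction step, assuming the `ffmpeg` command-line tool is installed and on the PATH. The helper names and the choice to write the .wav next to the source file are illustrative assumptions, not part of AudioAlign:

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, wav_path: str) -> list[str]:
    """Build an ffmpeg command that writes the audio track as 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-i", video_path,        # input video (e.g. an .MTS clip)
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # uncompressed 16-bit little-endian PCM
        wav_path,
    ]


def extract_audio(video_path: str) -> str:
    """Run ffmpeg and return the path of the extracted .wav file.

    Placing the .wav next to the source file is an assumption of this sketch.
    """
    wav_path = str(Path(video_path).with_suffix(".wav"))
    subprocess.run(build_extract_command(video_path, wav_path), check=True)
    return wav_path
```

Calling `extract_audio("00001.MTS")` would write `00001.wav`, which can then be loaded into AA in place of the video file.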

MarcoRavich commented 6 months ago

Thanks, so your use case is EDL export, in which case it makes sense to work directly with video files. I'll have to investigate whether MTS support can be added, but I can't give an estimate if and when this is going to happen.

It would be great to support it, 'cause it's - still - used by many cameras.

A workaround is to extract the audio from the video files (preferably in .wav format), synchronize the audio files, and manually replace the audio file references with the video files in the exported EDL file.

That's what I've done, of course.

note: does AA perform better syncs using uncompressed audio (.wav) ?

MarcoRavich commented 6 months ago

Dug through GH a bit, maybe some of these resources could help:

...anyway, note that FFmpeg does support decoding MTS files (which, in many cases, contain AVC video + AC3 audio streams).

protyposis commented 6 months ago

Basic TS support has been added in https://github.com/protyposis/AudioAlign/releases/tag/v1.6.0. I validated it with the two sample files you posted above - thanks for providing them. Please let me know if it works for you.

does AA perform better syncs using uncompressed audio (.wav) ?

No, the synchronization results are the same.

Playback works better with uncompressed audio because the FFmpeg access layer for compressed media isn't very stable yet. I recommend always using .wav files for a stable experience, even though many compressed files work without issues.

MarcoRavich commented 6 months ago

Cool, thanks.

Testing.

note: can you recommend a "best" algorithm for syncing live concert footage shot from different cameras?

protyposis commented 6 months ago

I usually use the "HK02" algorithm. In general, fingerprinting algorithms are designed to be used with recordings that all contain the same common audio source. This is the case, e.g., in a proscenium stage setup where all cameras record toward a stage with amplified PA audio.

Your case here is special because there is no main audio mix and the cameras were recording from different places and captured quite different audio. Even if they recorded the same instruments, there are differences that fingerprinting methods aren't designed for; e.g., their timings are slightly different (due to the different distances and resulting sound delays). You'll have to experiment, but given the short video durations, the single match found by HK02 could be sufficient for synchronization.
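To give a feel for the size of those sound delays, here is a rough calculation. It assumes a speed of sound of about 343 m/s (dry air at roughly 20 °C); the camera distances are made-up example values, not measurements from the sample files:

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def acoustic_delay_ms(distance_m: float) -> float:
    """Time in milliseconds for sound to travel distance_m metres."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def relative_offset_ms(distance_a_m: float, distance_b_m: float) -> float:
    """How much later camera B hears the same sound than camera A."""
    return acoustic_delay_ms(distance_b_m) - acoustic_delay_ms(distance_a_m)


# Hypothetical setup: camera A is 5 m from the source, camera B is 30 m away.
offset = relative_offset_ms(5.0, 30.0)
print(f"Camera B lags camera A by about {offset:.1f} ms")
```

With these example distances the offset is roughly 73 ms, which is close to two frames at 25 fps, and it changes for every sound source on stage.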

MarcoRavich commented 6 months ago

Your case here is special because there is no main audio mix

Well, of course we have the "mixer audio" recording too... In this regard: how are the different audio formats (bit/frequency rates and codecs) managed? For example, if it decodes lossy audio tracks through FFmpeg, has disabling loudness scaling (the -drc_scale 0 parameter) been considered too? Does it resample all tracks to the highest rate (e.g. 96 kHz) before fingerprinting?

Last but not least, we've noticed some overlapping between tracks in the alignment results, but we'll detail this (with screenshots) in a separate issue soon.

protyposis commented 6 months ago

Well, of course we have the "mixer audio" recording too...

I wasn't talking about a separate recording. Ideally, every device records this signal from a common source, e.g. PA speakers - that's when fingerprint sync works best.

how are the different audio formats (bit/frequency rates and codecs) managed ?

Fingerprinting works on low resolution signals (5–8 kHz), so they are all downsampled. Loudness is irrelevant.
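As an illustration of the downsampling idea only, here is a crude sketch that reduces tracks with different sample rates to a common low rate by block averaging. This is not AudioAlign's resampler: the `decimate` function is hypothetical, it only handles integer rate ratios, and a real implementation would use a proper low-pass filter:

```python
def decimate(samples: list[float], source_rate: int, target_rate: int) -> list[float]:
    """Reduce the sample rate by averaging blocks of consecutive samples.

    Block averaging acts as a very rough low-pass filter before decimation.
    """
    if source_rate % target_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = source_rate // target_rate
    usable = len(samples) - len(samples) % factor  # drop a trailing partial block
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# One second of silence at 48 kHz and at 96 kHz both end up as 8000 samples.
track_48k = decimate([0.0] * 48000, 48000, 8000)
track_96k = decimate([0.0] * 96000, 96000, 8000)
print(len(track_48k), len(track_96k))
```

Because every track ends up at the same low rate, resampling to the highest input rate first would add nothing for fingerprinting.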

protyposis commented 5 months ago

Closing because MTS support has been added.