freemocap / skelly_synchronize

Synchronization tool for videos of the same event. Uses audio cross correlation to synchronize.
https://freemocap.github.io/skelly_synchronize/
GNU Affero General Public License v3.0

Implement more synching options #31

Open philipqueen opened 11 months ago

philipqueen commented 11 months ago

The current synching methods (audio cross correlation and brightness contrast thresholding) both work in the majority of cases, but there is no alternative method to fall back on if a user's videos don't synchronize properly (although the brightness contrast method can be re-run with different thresholds to change the results). Adding more methods will help ensure at least one works for each set of videos, since different synching methods are robust to different types of noise. I would like to have at least two audio and two brightness methods.
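
For context, here is a minimal sketch of the cross-correlation approach, assuming two mono audio tracks already extracted at the same sample rate; the function and variable names are illustrative, not skelly_synchronize's actual API:

```python
import numpy as np
from scipy import signal


def estimate_offset_seconds(audio_a: np.ndarray, audio_b: np.ndarray, sample_rate: int) -> float:
    """Estimate how many seconds audio_a's content lags audio_b's (negative if it leads)."""
    # Cross-correlate the tracks and find the lag where they line up best.
    correlation = signal.correlate(audio_a, audio_b, mode="full")
    lags = signal.correlation_lags(len(audio_a), len(audio_b), mode="full")
    best_lag = lags[np.argmax(correlation)]
    return best_lag / sample_rate
```

A second audio method could, for example, correlate onset envelopes or spectral fingerprints instead of raw waveforms, which tends to be more robust to background noise.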

MarcoRavich commented 3 months ago

Hi there, I've collected some interesting open source audio synchronization tools, yours too of course, for the HyMPS project under the AUDIO category \ Treatments page \ Alignment/Synch, and some are really cool.

I strongly encourage you to check out Mario Guggenberger's AudioAlign (which relies on the Aurio fingerprint library), Ben Miller's Audalign, and audio-offset-finder.

I also suggest you avoid reinventing the wheel by focusing your efforts on aspects (e.g. phase/polarity adjustments after synching) that other projects have not yet considered or implemented.

On the other hand, for a visual synchronization approach, @cgtuebingen's Learning Robust Video Synchronization without Annotations seems to be the most advanced and promising option.

Hope that inspires.

philipqueen commented 3 months ago

Hi @MarcoRavich,

Thanks for the thoughtful comment, and for including us in your resource.

audalign and audio-offset-finder definitely look interesting, and spiritually similar to this project. I'll check them out and see if there's anything that could help make our tool more robust.

As far as a phase/polarity post-processing step goes, it's not at the top of our list, as we're focused on synchronizing videos for use in motion capture, where the downstream audio isn't very important. I would gladly review a contribution if it's something you're motivated to implement, though!
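
If someone does pick this up, a minimal sketch of the idea, assuming the tracks are already aligned numpy arrays (the names are hypothetical, not existing skelly_synchronize code):

```python
import numpy as np


def match_polarity(reference: np.ndarray, other: np.ndarray) -> np.ndarray:
    """Return `other`, flipped if it appears polarity-inverted relative to `reference`."""
    overlap = min(len(reference), len(other))
    # Once the tracks are aligned, a negative zero-lag correlation suggests inverted polarity.
    if np.dot(reference[:overlap], other[:overlap]) < 0:
        return -other
    return other
```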

Also, Learning Robust Video Synchronization without Annotations seems to be a different kind of video synchronization than we're interested in. It performs spatial alignment (aligning images in space), while we're interested in temporal alignment (aligning frames by when they were shot).
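
To make the distinction concrete, temporal alignment here amounts to converting an estimated offset into whole frames and trimming the earlier-starting video. A minimal sketch, assuming a known frame rate and the convention that a positive offset means video A's content lags video B's (hypothetical names, not the project's API):

```python
def offset_in_frames(offset_seconds: float, fps: float) -> int:
    """Convert an offset in seconds to the nearest whole-frame offset."""
    return round(offset_seconds * fps)


def trim_to_overlap(frames_a: list, frames_b: list, frame_offset: int) -> tuple:
    """Drop leading frames so both videos start at the same real-world moment."""
    if frame_offset > 0:
        # Video A's content lags, meaning camera A started earlier: drop its extra leading frames.
        frames_a = frames_a[frame_offset:]
    elif frame_offset < 0:
        frames_b = frames_b[-frame_offset:]
    # Trim both to the shared length so the synchronized clips have matching frame counts.
    shared_length = min(len(frames_a), len(frames_b))
    return frames_a[:shared_length], frames_b[:shared_length]
```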