freemocap / skelly_synchronize

Synchronization tool for videos of the same event. Uses audio cross correlation to synchronize.
https://freemocap.github.io/skelly_synchronize/
GNU Affero General Public License v3.0

Implement more synching options #31

Open philipqueen opened 9 months ago

philipqueen commented 9 months ago

The current synching methods (audio cross correlation and brightness contrast thresholding) both work in the majority of cases, but there are no alternative methods if a user's videos do not synchronize properly (although the brightness contrast method can be rerun with different thresholds to change the results). Adding methods will help ensure that at least one method works for each set of videos, since different synching methods are robust to different types of noise. I would like to have at least two audio and two brightness methods.
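For context, the audio path boils down to finding the lag that maximizes the cross-correlation between two cameras' audio tracks. Below is a minimal sketch of that idea, not the skelly_synchronize implementation; the function name, normalization, and sign convention are illustrative assumptions:

```python
# Minimal cross-correlation offset sketch (illustrative, not the library's code).
import numpy as np
from scipy.signal import correlate, correlation_lags


def estimate_offset_seconds(audio_a: np.ndarray, audio_b: np.ndarray, sample_rate: int) -> float:
    """Estimate how many seconds later a shared sound occurs in audio_b than in audio_a."""
    # Normalize so the correlation peak reflects waveform shape, not loudness.
    a = (audio_a - audio_a.mean()) / (audio_a.std() + 1e-12)
    b = (audio_b - audio_b.mean()) / (audio_b.std() + 1e-12)

    # The lag with the largest cross-correlation gives the best temporal alignment.
    correlation = correlate(b, a, mode="full")
    lags = correlation_lags(len(b), len(a), mode="full")
    best_lag = int(lags[np.argmax(correlation)])

    # A positive value means video B's audio lags video A's, so trimming that
    # many samples (converted to frames) from the start of B lines them up.
    return best_lag / sample_rate
```

A second audio method could reuse the same skeleton but correlate onset envelopes or spectral fingerprints instead of raw waveforms, which would trade some precision for robustness to noise.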

MarcoRavich commented 1 month ago

Hi there, I've collected some interesting open source audio synchronization software (yours included, of course) for the HyMPS project under the AUDIO category \ Treatments page \ Alignment/Synch, and some of it is really cool.

I strongly encourage you to check out Mario Guggenberger's AudioAlign (which relies on the Aurio fingerprinting library), Ben Miller's Audalign, and audio-offset-finder.

I also suggest trying to avoid reinventing the wheel by focusing your efforts on aspects (e.g. phase/polarity adjustments after synching) that other projects have not yet considered or implemented.
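To make that suggestion concrete, one simple form of a post-synchronization polarity check is to look at the sign of the zero-lag correlation between the already-aligned tracks and flip one signal if it is negative. A minimal sketch, assuming NumPy arrays of time-aligned audio and a hypothetical function name:

```python
# Minimal polarity-check sketch (illustrative, not an existing implementation).
import numpy as np


def match_polarity(reference: np.ndarray, aligned: np.ndarray) -> np.ndarray:
    """Flip `aligned` if it appears polarity-inverted relative to `reference`."""
    overlap = min(len(reference), len(aligned))
    ref = reference[:overlap].astype(np.float64)
    ali = aligned[:overlap].astype(np.float64)

    # Once the time offset is removed, two recordings of the same source should
    # correlate positively at zero lag; a negative value suggests an inverted waveform.
    zero_lag_correlation = float(np.dot(ref, ali))
    return -aligned if zero_lag_correlation < 0 else aligned
```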

On the other hand, for a visual synchronization approach, @cgtuebingen's Learning Robust Video Synchronization without Annotations seems the most advanced and promising option.

Hope that inspires.

philipqueen commented 1 month ago

Hi @MarcoRavich,

Thanks for the thoughtful comment, and for including us in your resource.

audalign and audio-offset-finder definitely look interesting, and spiritually similar to this project. I'll check them out and see if there's anything that could help make our tool more robust.

As far as a phase/polarity post-processing step goes, it's not at the top of our list, as we're focused on synchronizing videos for use in motion capture, where the downstream audio isn't very important. I would gladly review a contribution if it's something you're motivated to implement, though!

Also, Learning Robust Video Synchronization without Annotations seems to address a different kind of video synchronization than we're interested in: they perform spatial alignment (aligning images in space), while we're interested in temporal alignment (aligning frames by when they were shot).