protyposis / AudioAlign

Audio Synchronization and Analysis Tool
GNU Affero General Public License v3.0
137 stars 16 forks

Stretching and shrinking with tuning fluctuations. #13

Open Cello23 opened 6 months ago

Cello23 commented 6 months ago

I am very impressed with the effort and work that has been put into this application. Bravo! I have only one problem: the analysis that matches the chosen audio to the original naturally involves some stretching and shrinking, but there are tuning (pitch) fluctuations in those places, which makes the final result unusable. Is this something that hasn't been resolved yet?

protyposis commented 6 months ago

Thank you for your kind words!

This issue indeed hasn't been addressed yet. The currently implemented method of time stretching through resampling was primarily designed for recording drift removal, where pitch changes due to drift and resampling typically balance each other out.

However, for timing alignment scenarios like the one you've described (if I've understood your use case correctly?), this method is unsuitable for the reason you mentioned. It would require implementing a "true" time stretching algorithm, which is a complex topic (there are various methods available, each having its pros and cons depending on the sound type, as detailed in the Wikipedia article). Unfortunately, integrating such an algorithm isn't currently planned, but I'll keep this issue open because I think it would be a valuable addition.
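To make the pitch side effect concrete, here is a minimal Python sketch of stretching by resampling. It is not AudioAlign's code; the helper names (`pitch_shift_cents`, `resample_linear`, `estimate_frequency`), the linear interpolation, and the 2 % and 0.1 % factors are assumptions chosen purely for illustration. Stretching by a factor s through resampling scales every frequency by 1/s, i.e. shifts the pitch by -1200 * log2(s) cents, so a 2 % stretch leaves a take roughly 34 cents flat.

```python
import math


def pitch_shift_cents(stretch_factor: float) -> float:
    """Pitch change in cents when audio is stretched purely by resampling.

    stretch_factor is new_duration / old_duration: values above 1 lengthen
    the audio and lower its pitch, values below 1 shorten it and raise it.
    """
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    return -1200.0 * math.log2(stretch_factor)


def resample_linear(samples, stretch_factor):
    """Stretch a signal by linear-interpolation resampling (illustrative helper)."""
    last = len(samples) - 1
    n_out = int(round(len(samples) * stretch_factor))
    out = []
    for i in range(n_out):
        pos = i / stretch_factor  # fractional read position in the input
        i0 = min(int(pos), last)
        i1 = min(i0 + 1, last)
        frac = pos - i0
        out.append(samples[i0] * (1.0 - frac) + samples[i1] * frac)
    return out


def estimate_frequency(samples, sample_rate):
    """Rough frequency estimate: upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0.0 <= b)
    return crossings * sample_rate / len(samples)


SAMPLE_RATE = 8000  # Hz, kept low so the pure-Python loops stay fast
tone = [math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# Alignment case: stretching a correctly pitched take by 2 % flattens it.
stretched = resample_linear(tone, 1.02)
ratio = estimate_frequency(stretched, SAMPLE_RATE) / estimate_frequency(tone, SAMPLE_RATE)
print(f"measured frequency ratio: {ratio:.4f}, predicted: {1 / 1.02:.4f}")
print(f"predicted pitch shift: {pitch_shift_cents(1.02):.1f} cents")

# Drift case: a recording that came out 0.1 % too long (and therefore flat)
# is shortened by the inverse factor, so the two pitch shifts cancel.
drift = 1.001
net = pitch_shift_cents(drift) + pitch_shift_cents(1.0 / drift)
print(f"net shift after drift correction: {net:.6f} cents")
```

In the drift case the resampling undoes a pitch error that the drift introduced in the first place, which is why the two balance out. In the alignment case both takes are already at the correct pitch, so the same operation introduces an audible error instead.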

For future reference, a potential candidate for integration could be SoundTouch or its C# reimplementation. It is unclear, though, whether it is good enough to be a one-size-fits-all solution (for music, speech, etc.).

Cello23 commented 6 months ago

Very kind of you to answer. Perhaps my use case is not very common, but I can assure you that in the field of classical music a tool like this one with a true time stretching algorithm would be in high demand. In any live concert it is normal to have some "accidents" (a wrong note or an out-of-tune passage), and the affected part of the original audio can be replaced with an earlier take (from the general rehearsal, for example). Normally this is done manually, and if the passage is short (a couple of bars), time stretching is not required. If something larger has to be replaced, though, this tool (with a time stretching algorithm) would save an enormous amount of time, so it would be worth implementing something like that. There is a Lua script for Reaper, "Align Takes", which uses alignment and time stretching, and it does the job pretty well. Thanks a lot for keeping the issue open! Cheers!

protyposis commented 6 months ago

Thanks for explaining your use case, that is very helpful.

By the way, I took a quick look at Reaper, and it's interesting that it implements multiple time-stretching algorithms and lets the user decide, probably because no single algorithm fits all material, as I mentioned before. One of the options is indeed SoundTouch, and it also integrates another open-source library, Rubber Band.