alanngnet / CoverHunterMPS

Fork of Liu Feng's CoverHunter to run on a single computer, plus more features and documentation.

implement coarse-to-fine alignment #1

Open alanngnet opened 6 months ago

alanngnet commented 6 months ago

Coarse-to-fine alignment is described in the CoverHunter research paper at https://ar5iv.labs.arxiv.org/html/2306.09025

Per recent correspondence with Liu Feng, the authors cannot open-source their implementation of coarse-to-fine alignment. He did share this high-level summary (slightly copyedited):

"step1: cut the song into short chunks.
step2: use the coarse model to compute the similarity of every chunk and the anchor song.
step3: use some algorithm such as greedy search (or others) to get a more accurate alignment.
step4: use the more accurate data to finetune the coarse model (maybe training the fine model from scratch is also ok).

For example, you have songA and its cover version (as songB), and songA has 120s, songB has 360s. Maybe songB has a different intro or bridge or verse (only a segment of songB covers songA). You can cut songB into 15s chunks, and some segments of songB can be detected as a cover of songA, e.g. 100-200s. So you use songB(100-200s) instead of songB(0-360s), and you can get a better model."

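Steps 1 to 3 of the summary above could be sketched roughly as follows. This is not the authors' implementation, which is unavailable. Every name here (`chunk_bounds`, `longest_run_above`, `align_cover`, `embed_chunk`) is hypothetical, `embed_chunk` stands in for whatever the coarse model exposes, and cosine similarity, the 0.5 threshold, and the "longest run above threshold" rule are assumptions chosen only to make the idea concrete.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def chunk_bounds(duration_s: float, chunk_s: float) -> List[Tuple[float, float]]:
    """Step 1: split [0, duration_s) into consecutive chunks of chunk_s seconds."""
    bounds = []
    start = 0.0
    while start < duration_s:
        bounds.append((start, min(start + chunk_s, duration_s)))
        start += chunk_s
    return bounds


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def longest_run_above(scores: Sequence[float], threshold: float) -> Optional[Tuple[int, int]]:
    """Step 3 (one simple choice, an assumption): longest contiguous run of
    chunks scoring >= threshold, as inclusive (first, last) chunk indices."""
    best = None
    run_start = None
    # The -inf sentinel closes a run that reaches the final chunk.
    for i, score in enumerate(list(scores) + [float("-inf")]):
        if score >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or (i - run_start) > (best[1] - best[0] + 1):
                best = (run_start, i - 1)
            run_start = None
    return best


def align_cover(
    anchor_emb: Vector,
    duration_s: float,
    embed_chunk: Callable[[float, float], Vector],  # hypothetical coarse-model hook
    chunk_s: float = 15.0,
    threshold: float = 0.5,
) -> Optional[Tuple[float, float]]:
    """Return (start_s, end_s) of the part of the cover that best matches the anchor."""
    bounds = chunk_bounds(duration_s, chunk_s)
    # Step 2: score every chunk against the anchor song.
    scores = [cosine(embed_chunk(s, e), anchor_emb) for s, e in bounds]
    run = longest_run_above(scores, threshold)
    if run is None:
        return None
    return bounds[run[0]][0], bounds[run[1]][1]


# Toy stand-in for the coarse model: chunks starting in 100-200s "sound like" songA.
def toy_embed(start_s: float, end_s: float) -> Vector:
    return [1.0, 0.0] if 100 <= start_s < 200 else [0.0, 1.0]


segment = align_cover([1.0, 0.0], 360.0, toy_embed)
```

With 15s chunks the detected span snaps to chunk boundaries, so the toy case yields a segment a little different from the 100-200s in the example. The trimmed segment would then replace the full recording as training data in step 4.
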
alanngnet commented 1 week ago

Demoted to P2 since inference on Irish traditional dance tunes is performing so well without it.