Closed alanngnet closed 6 months ago
so far I have confirmed that original CoverHunter provided almost nothing to support true inference beyond generating (useful) embeddings for each training sample. So this issue probably needs chunking into a series of enhancement issues. 2 possible approaches: build a way to generate class centroid embeddings (one embedding per distinct song_id) or modify the training model to add a layer that has class inference built into the final layer, so the self-contained model could be used to directly output inferred song_ids. Probably leaning towards the former, so that we can then do ranking of nearest neighbors and use that in various creative ways for real-world applications.
Other implementation ideas, anyone?
building an MVP using just direct comparison of query embedding with the entire set of reference embeddings, adapting from the distance-matrix computation approach taken in eval_testset.py.
Calling this MVP good enough for now, especially since we have higher priorities for a while.
Build a script to use the trained model to generate the top N closest matches to a given CQT feature array.