To improve search performance, we need an automated evaluation suite. I made a first attempt at this in eval.py, where hard test cases can be added manually for debugging.
A larger, more systematic test set is needed to count false positives and false negatives.
It could perhaps be obtained from the MusicBrainz database.
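
As a rough illustration of what counting false positives and negatives could look like, here is a minimal sketch. It does not reflect the actual contents of eval.py: the names `Case`, `Counts`, `evaluate`, and `toy_search`, as well as the record IDs, are hypothetical, and the toy prefix search only stands in for the real search function. It also assumes each query has at most one correct result, and it counts a wrong match as a false positive.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Case:
    """One labeled test case (hypothetical structure)."""
    query: str
    expected: Optional[str]  # expected record ID, or None if nothing should match


@dataclass
class Counts:
    true_pos: int = 0
    false_pos: int = 0
    false_neg: int = 0
    true_neg: int = 0


def evaluate(search: Callable[[str], Optional[str]], cases: Iterable[Case]) -> Counts:
    """Run every case through `search` and tally the outcomes."""
    counts = Counts()
    for case in cases:
        got = search(case.query)
        if got is None:
            if case.expected is None:
                counts.true_neg += 1   # correctly returned nothing
            else:
                counts.false_neg += 1  # missed a result that exists
        elif got == case.expected:
            counts.true_pos += 1       # returned the right record
        else:
            counts.false_pos += 1      # returned a record it should not have
    return counts


# Toy stand-in for the real search: case-insensitive prefix match on titles.
_CATALOG = {"blue in green": "rec-1", "so what": "rec-2"}


def toy_search(query: str) -> Optional[str]:
    q = query.strip().lower()
    for title, rec_id in _CATALOG.items():
        if title.startswith(q):
            return rec_id
    return None


CASES = [
    Case("So What", "rec-2"),        # exact title
    Case("Blue in Grene", "rec-1"),  # typo that the toy search cannot handle
    Case("Unknown Song", None),      # nothing should match
    Case("So", None),                # too vague; labeled as "should not match"
]

if __name__ == "__main__":
    print(evaluate(toy_search, CASES))
```

With a larger labeled set, for example one derived from MusicBrainz, the same loop would apply unchanged; only `CASES` and the search function passed to `evaluate` would differ.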