Open mgaulton opened 2 years ago
I haven't looked at what metadata is available from which we might infer the version of a download. The data on YouTube is largely user-supplied, so getting people to put in quality information is impossible.
The script uses Levenshtein distance between the requested name and the video title to determine the match: https://en.wikipedia.org/wiki/Levenshtein_distance
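For anyone unfamiliar with the idea, here is a rough sketch of title matching by Levenshtein distance. It is not the script's actual code; the function names `levenshtein` and `closest_title` are made up for illustration, and the lowercasing before comparison is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))  # distance from the empty prefix of a to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


def closest_title(wanted: str, candidates: list[str]) -> str:
    """Return the candidate whose title is the fewest edits away (hypothetical helper)."""
    return min(candidates, key=lambda t: levenshtein(wanted.lower(), t.lower()))


if __name__ == "__main__":
    titles = ["The Beatles - let it be", "The Beatles - Let It Be (Live)"]
    print(closest_title("The Beatles - Let It Be", titles))
```

The distance only looks at the characters in the title, so it says nothing about what the audio actually contains.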
If someone has uploaded a bootleg rip recorded off a phone and called it "The Beatles - let it be", it's a perfect match by name. There is very little we can do with this approach.
Running it as a test, I noticed it was grabbing live, instrumental, etc. copies of songs. Just wondering if it would be possible to somehow filter it so it grabs non-pirated full songs, official videos, etc. instead. Thanks!
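One possible way to approach this kind of filtering, sketched under assumptions: drop any result whose title contains a word like "live" or "instrumental" unless the search itself asked for it. The keyword list and the names `UNWANTED`, `unwanted_tags`, and `filter_results` are hypothetical, not part of the script, and this only works when the uploader labelled the video honestly.

```python
import re

# Hypothetical keyword list; adjust to taste.
UNWANTED = ("live", "instrumental", "karaoke", "cover", "remix")


def unwanted_tags(title: str, query: str) -> set[str]:
    """Unwanted keywords that appear as whole words in the title but not in the query."""
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    return {w for w in UNWANTED if w in title_words and w not in query_words}


def filter_results(query: str, titles: list[str]) -> list[str]:
    """Keep titles with no unwanted keywords; fall back to all titles if none survive."""
    kept = [t for t in titles if not unwanted_tags(t, query)]
    return kept or list(titles)


if __name__ == "__main__":
    results = [
        "The Beatles - Let It Be (Live)",
        "The Beatles - Let It Be (Official Video)",
        "The Beatles - Let It Be instrumental",
    ]
    print(filter_results("The Beatles - Let It Be", results))
```

Checking against the query words keeps songs such as "Live and Let Die" from being filtered out by their own title. A mislabelled upload with a clean title would still get through.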