Closed (emkor closed this issue 6 years ago)
It would be a lot easier if the songs had metadata; without it, obtaining the information we need will take a lot more work. Some of the websites provide an official API (Discogs, Last.fm, Spotify, MusicBrainz), and there is also an (unofficial?) Wikipedia API that may come in handy. In the worst case, I don't think it will be a problem to scrape the pages manually. So far I have only used Beautiful Soup for this kind of task, but I'll research later whether it's the most efficient tool. When it comes to genres, please remember that these often vary, so it would be a good idea to exclude free-form user tags that are not genres at all (some are whole sentences, and some are offensive). Maybe create a list of accepted genres? (Storing it in an external file seems a little odd to me.)
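To make the accepted-genres idea concrete, here is a minimal sketch of a whitelist filter. Everything in it is an assumption for illustration: the names `ACCEPTED_GENRES`, `normalize_tag` and `filter_genres` are hypothetical, and the genre list itself is just a placeholder, not a proposal for the final set.

```python
# Hypothetical whitelist; the real list would be agreed on later.
ACCEPTED_GENRES = {"rock", "pop", "jazz", "hip hop", "electronic", "classical"}


def normalize_tag(tag):
    """Lower-case a tag and treat dashes/underscores/extra spaces as single spaces."""
    cleaned = tag.lower().replace("-", " ").replace("_", " ")
    return " ".join(cleaned.split())


def filter_genres(tags, accepted=ACCEPTED_GENRES):
    """Return the normalized tags that are on the whitelist, in order, without duplicates."""
    seen = set()
    result = []
    for tag in tags:
        genre = normalize_tag(tag)
        if genre in accepted and genre not in seen:
            seen.add(genre)
            result.append(genre)
    return result


if __name__ == "__main__":
    raw_tags = ["Hip-Hop", "seen live", "ROCK", "rock", "my favourite song ever"]
    print(filter_genres(raw_tags))
```

Keeping the whitelist as a plain set in code (rather than an external file) matches the preference above; it could still be moved to the DB later without changing `filter_genres`.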
I think we should take care of filtering genres and so on later, once we have them in the DB. And yeah, audiopyle will extract features from MP3 files, so I hope most of those tracks will contain the artist and track title in their metadata; otherwise scraping the data would be impossible. This issue is more of a research task: find out which public APIs exist, et cetera.
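For reference, a rough sketch of how the artist and title could be pulled out of an MP3 using only the standard library. It assumes the file carries an ID3v1 tag (the fixed 128-byte block at the end of the file that starts with `TAG`); many files only have ID3v2 tags, which this does not read, so a dedicated tagging library would probably be needed in practice. The function names `parse_id3v1` and `read_id3v1` are hypothetical.

```python
def parse_id3v1(tail):
    """Parse an ID3v1 block from the last 128 bytes of a file; return None if absent."""
    if len(tail) < 128:
        return None
    block = tail[-128:]
    if block[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed-width and padded with NUL bytes (sometimes spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
    }


def read_id3v1(path):
    """Read the trailing 128 bytes of the file at `path` and parse them as ID3v1."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < 128:
            return None
        handle.seek(-128, 2)
        return parse_id3v1(handle.read(128))
```

If `read_id3v1` returns `None` or empty strings for artist and title, the track would have nothing to look up, which is exactly the case where scraping becomes impossible.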
Abandoned
Scraper is the next app for the audiopyle system. The basic idea is to download song information (album, artist, etc., maybe tags?) from internet sources such as MusicBrainz, Last.fm, and Spotify, and store it in a specified format. It would be useful for ML and for feature analysis.
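As a starting point for the research, here is a hedged sketch of the MusicBrainz part: building a recording search URL and flattening a response into one record. The function names and the output record layout are hypothetical, and the response fields used (`recordings`, `artist-credit`, `releases`, `tags`) reflect my understanding of the JSON search response and should be checked against the API docs. No request is actually sent here; note that MusicBrainz expects a descriptive User-Agent and rate-limits clients.

```python
from urllib.parse import urlencode

MUSICBRAINZ_RECORDING_URL = "https://musicbrainz.org/ws/2/recording"


def build_search_url(artist, title, limit=1):
    """Build a recording search URL for the given artist and track title."""

    def quoted(value):
        # Escape backslashes and double quotes so the value stays one quoted phrase.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'

    query = "artist:{} AND recording:{}".format(quoted(artist), quoted(title))
    params = {"query": query, "fmt": "json", "limit": limit}
    return MUSICBRAINZ_RECORDING_URL + "?" + urlencode(params)


def to_record(response):
    """Flatten the first search hit into a simple dict; return None if there is no hit."""
    recordings = response.get("recordings") or []
    if not recordings:
        return None
    best = recordings[0]
    credits = best.get("artist-credit") or []
    releases = best.get("releases") or []
    tags = best.get("tags") or []
    return {
        "title": best.get("title"),
        "artist": credits[0].get("name") if credits else None,
        "album": releases[0].get("title") if releases else None,
        "tags": [tag["name"] for tag in tags if "name" in tag],
        "source": "musicbrainz",
    }
```

Separating URL building from response flattening means the same `to_record`-style output could later be produced for Last.fm or Spotify, so whatever stores the data only has to deal with one format.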