demozoo / demozoo

the best demoscene website in the world.

Metadata for Music #52

Open emoon opened 10 years ago

emoon commented 10 years ago

It would be really nice to have metadata for music that applications can request. For example, a music player could calculate a hash (SHA-1, MD5, etc.) of a file and request the info about that music from Demozoo. This would be especially good for modules and similar formats, as they need to be replayed differently depending on whether they were created in Protracker 2.3, 3.x, etc. The metadata could also contain info like which demo the music was used in.
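A minimal sketch of the client side of this idea, assuming a player that hashes the raw file bytes and then builds a lookup request. The `lookup_url` helper, the `example.invalid` base URL, and the `sha1` query parameter are placeholders for illustration only; no such Demozoo endpoint is described in this thread.

```python
import hashlib
from urllib.parse import urlencode


def file_digests(stream, chunk_size=65536):
    """Compute SHA-1 and MD5 of a binary stream in one pass, chunk by chunk."""
    sha1 = hashlib.sha1()
    md5 = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        sha1.update(chunk)
        md5.update(chunk)
    return {"sha1": sha1.hexdigest(), "md5": md5.hexdigest()}


def lookup_url(digests, base="https://example.invalid/music/lookup"):
    """Build a query URL for a hypothetical metadata lookup (assumed endpoint)."""
    return base + "?" + urlencode({"sha1": digests["sha1"]})
```

A player would open the module with `open(path, "rb")`, pass the file object to `file_digests`, and request the resulting URL. Hashing in chunks keeps memory use flat even for large files.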

glennlunder commented 10 years ago

This sounds like two different issues: the metadata would need to be recorded in the database, while things like what production a track was used in (or what party it was released at, its rank, its release date, etc.) are data we already have and that can be accessed via the API (once that is done).

emoon commented 10 years ago

Yes, the API issue is a different thing, but I guess (hope) you would be able to access this kind of data. I would say the extraction of data (the metadata) is something that could be done server side, similar to how graphics extraction works today, but selecting the correct PT version is something that has to be done manually.
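As a rough illustration of what server-side extraction could look like for a 31-sample MOD file, the sketch below reads the song title (first 20 bytes) and the 4-byte format tag at offset 1080. The function name and the returned field names are made up for this example, and it does not attempt to guess the Protracker version, which per the comment above would still be chosen manually.

```python
def mod_header_info(data: bytes):
    """Extract title and format tag from a 31-sample MOD file header.

    Returns None if the data is too short to contain the tag at offset 1080.
    """
    if len(data) < 1084:
        return None
    # The title is a 20-byte field, padded with NUL bytes or spaces.
    title = data[:20].split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()
    # The 4-byte tag (for example "M.K.") follows the pattern order table.
    tag = data[1080:1084].decode("ascii", errors="replace")
    return {"title": title, "tag": tag}
```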

asbjornu commented 4 years ago

While I like the idea, I'm not sure doing a raw hash of the song will be of much use, as it would be hard to normalize the input into the hash to be equal everywhere a given song is played. Something like when the song was last saved might be metadata embedded into the track in some formats, so we might do better by extracting a fingerprint than doing a full hash. Perhaps Last.fm's Fingerprinter (open source) might be of use?

sagamusix commented 4 years ago

File hashes work incredibly well for most use cases described here and have been used by various demoscene archives in the past to match their data. An audio fingerprinter, on the other hand, will probably not work, as this is a chicken-and-egg problem: you want to obtain metadata on how to play the song, but to create a fingerprint you need to play the song first! So if you play the song with the wrong initial settings, its fingerprint might not be even close to the one stored in the database (this would happen, for example, with different tempo interpretations in MOD files, which might be one field of the metadata).

On the other hand, as long as people don't create 1000 variants of the same MOD file (hello mod.echoing), the file hash will be very stable and every client will be able to request the information without any heavy computations. Demoscene music is not the same as the commercial world, where you might find 100 different CD rips, which in turn are all different from the files obtained from 100 different MP3 online stores. Typically there is only one variant, or very few variants, of the same song floating around.

I did some fingerprinting attempts with open-source fingerprinting libraries in the past and the results weren't too good.
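To make the trade-off concrete, here is a small sketch of an exact-match lookup keyed by file hash. The index contents and the `tracker` and `tempo_mode` fields are invented for illustration, and the byte strings are stand-ins rather than real modules. An identical file resolves with a single cheap hash, while a variant that differs by one byte misses entirely.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Hex SHA-1 of the raw file bytes; no playback or decoding needed."""
    return hashlib.sha1(data).hexdigest()


# Stand-in bytes, not a real module file.
original = b"M.K." + bytes(64)
# A "variant" of the same file that differs in its last byte only.
variant = original[:-1] + b"\x01"

# Hypothetical server-side index: file hash -> replay metadata (assumed fields).
INDEX = {
    sha1_hex(original): {"tracker": "Protracker 2.3", "tempo_mode": "cia"},
}


def lookup(data: bytes):
    """Return stored metadata for an exact file match, or None if unknown."""
    return INDEX.get(sha1_hex(data))
```

Here `lookup(original)` returns the stored entry and `lookup(variant)` returns `None`, which is why the approach depends on few variants of each song being in circulation.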

sagamusix commented 4 years ago

Also, Last.fm's fingerprinter is GPLed, which paints all potential users of that API into the copyleft corner, unless someone reimplements the same hashing method under a more permissive license. If you really want to use a fingerprinting library, ChromaPrint might be a better choice as it's MIT-licensed.