cosmir / openmic-annotator

Annotation framework for annotating data for OpenMIC
MIT License

Instrument class taxonomy - Draft #2

Closed ejhumphrey closed 7 years ago

ejhumphrey commented 8 years ago

we need to define a space of instruments that will be known / predicted.

medleydb has one: http://medleydb.weebly.com/description.html#metadata

others to fold in?

bmcfee commented 8 years ago

Here's the wordnet slice for musical instruments.

stefan-balke commented 8 years ago

Here's the wordnet slice for musical instruments.

Looks quite complete to me. Maybe a little bit too complete? However, I guess the "base-taxonomy" can be rather big–the task itself can then deal with a subset.

Btw: Wordnet is written in Prolog so it might be easier to directly parse the HTML output :)

stefan-balke commented 8 years ago

RWC has some: https://staff.aist.go.jp/m.goto/RWC-MDB/rwc-mdb-i.html

IMSLP (unstructured): http://imslp.org/wiki/IMSLP:Abbreviations_for_Instruments

Let's see what the UPF guys use in the Jamendo corpus.

ffont commented 8 years ago

As part of the AudioCommons project in which UPF and QMUL collaborate with Jamendo, we are planning to at least partially define an instruments taxonomy for instrument classification. This is also what we need here so I guess it would make a lot of sense to combine efforts. Our idea is to concentrate on the taxonomy that Jamendo is using, which you can see here: https://licensing.jamendo.com/en/catalog (go down and click on the instruments tab).

The hierarchy only contains 7 main groups and then a relatively large number of instruments in each group. As I said before, we still have not discussed much about which ones to include or whether we should further expand the hierarchy into more levels. However I'll be happy to share things between Omec and AudioCommons and I'll also try to get AudioCommons people responsible for the taxonomy involved in this conversation.

To be honest, I would favour some kind of taxonomy derived from existing data (or existing and working initiatives such as Jamendo) rather than some generic instruments taxonomy that can become really complex but maybe less meaningful from an information retrieval point of view.

ejhumphrey commented 8 years ago

Agreed, the Jamendo taxonomy is probably a good place to start ... 2.5 of the 7 are a little odd for my tastes: Popular and Other are catch-alls, and Electronic is going to be messy. If ImageNet can accomplish this for 1000+ objects, I'm sure we can sort out a few dozen instruments.

What I'd really like to do is identify a few folks to take full ownership of this and make a proposal. There's enough work to go around, and this is largely independent of everything else that needs to be accomplished mechanically. The deliverable of this issue will be something like a JSON schema / namespace, that can be dynamically accessed (as a web resource) and used to populate the possible tag space of an annotation UI. This might be nicely achieved as a JAMS namespace, but maybe @bmcfee has opinions on this.

I'll be spawning other issues / TODOs, but folks should jump on this one if it interests them.

jordipons commented 8 years ago

Maybe you want to take a look at: http://isophonics.org/content/musical-instrument-taxonomies

ImageNet is great because it is based on WordNet - that is a "semantic space", an ontology. Quoting the ImageNet web site: "Each meaningful CONCEPT in WordNet, possibly described by multiple words or word phrases, is called a 'synonym set' or 'synset'". Maybe it is a good idea to use one of these ontologies, much as ImageNet uses WordNet as a backbone. As far as I know, musical concepts are not well represented in WordNet. This is why I propose taking a look at the link above.

I agree that the Jamendo taxonomy is probably a good place to start, especially because it is the easiest option (audio and taxonomy are already available). However, using a complete and well-structured ontology as a backbone might be a bigger step for a "sustainable MIR evaluation", since the annotation efforts are oriented towards fully solving the task. These ontologies are supposed to completely describe the "instruments space" and are probably more complete than a regular taxonomy.

Please note that this comment contradicts Frederic's position :). Maybe a compromise should be found between a taxonomy derived from existing data and a complex instruments taxonomy. We can deal with this situation by simply annotating the data in batches, where the first release could be based on existing data from Jamendo - of course, tied to a complex instruments taxonomy. That is, if I understood your paper well, what you intend by proposing an incremental evaluation: find an easy start, and move on towards fully solving the task.

And most importantly, I think that now is the time to discuss which reference complex instruments taxonomy we want to have as the MIR community. Otherwise we are wasting our annotation efforts and we cannot compare results!

As a final remark, and to motivate you to use these more complete taxonomies (ontologies, semantic spaces): some researchers in the computer vision field claim to have bridged the semantic gap. Why? Because the results in the ImageNet Challenge show that researchers are now able to predict concepts (WordNet semantic space) from raw data (pixels of an image). Wouldn't it be great to move in that direction?

dpwe commented 8 years ago

I'm interested in this problem and would be happy to take responsibility.

I'm involved in developing an ontology for sound events. Currently it has ~600 nodes, some 120 of which are musical instruments; the structure is pretty shallow, and the coverage is not trying to be definitive for musical purposes, but it's supposed to span the distinctions that a "typical" listener can easily make. I'm not proposing this as a solution, but it does mean I have some experience with the difficulties of creating this kind of thing.

I'm interested in broad input here. My impression is that there's a tendency to get excessively technical and fine-grained. Wikipedia leans on Hornbostel–Sachs https://en.wikipedia.org/wiki/Hornbostel–Sachs, which I guess is the favored scheme for ethnomusicologists, but which strikes me as a bit pedantic.

My ideal would be to have a hierarchy with a few dozen readily-distinguishable instruments that could be quickly and reliably judged by non-expert listeners, then to provide for finer distinctions within those classes to cover cases where more detailed labeling is important.

Of course, the idea of the "typical" listener is hugely problematic, since it's so dependent on experience. I personally think it's reasonable to expect people to distinguish a tenor and soprano sax, but I would have a lot of trouble naming any instruments from other cultures like China, India, etc. I suspect the solution might be to give up on universality and instead go for multiple parallel hierarchies specialized for particular music styles.

DAn.

ejhumphrey commented 8 years ago

Awesome! thanks for volunteering. :o)

I agree, the HS categorization, while right, is like animal species for wordnet -- a bit overkill for our "version zero" purposes, I think.

Perhaps the taxonomy discussion is a bit easier if we peer a little bit down the road to where future iterations could go. One possible progression:

If this is the general roadmap at 30k feet, then our immediate goals are something like

In this formulation, it'd be sufficient to draw that line at, say, "the top 100 instrument labels in the Jamendo collection" (less those like "synthesizer" or "computer" because obviously). Then, we can define a mapping into our more general, scalable taxonomy, that can be expanded as we move beyond this collection and these instruments.

For my part, if we can, as a community, solve 100-class instrument recognition over web scale collections, that would be flat out amazing. And then we can tackle the next 100 or 1000 instruments once proving to ourselves we can get that far. I think an overarching goal of this project is going to aim for the simplest thing that gets us to a meaningful result that we can iterate on.

thoughts?

ejhumphrey commented 8 years ago

forgot to mention:

My ideal would be to have a hierarchy with a few dozen readily-distinguishable instruments that could be quickly and reliably judged by non-expert listeners, then to provide for finer distinctions within those classes to cover cases where more detailed labeling is important.

+1000

julian-urbano commented 8 years ago

My ideal would be to have a hierarchy with a few dozen readily-distinguishable instruments that could be quickly and reliably judged by non-expert listeners, then to provide for finer distinctions within those classes to cover cases where more detailed labeling is important.

+1 as well, but I think we must make an effort first to define the task, even if our definition is vague, wrong or incomplete. I have the feeling that some decisions in terms of instrument taxonomy will have to do with it, such as level of complexity. Or maybe just the kind of annotations we gather, I don't know.

For now, I also agree that we should probably focus on things that the average person could identify, even if that means that our dataset is initially biased towards Western music. The same goes with the initial set of instruments; I agree with @dpwe.

In any case, given that defining the instrument taxonomy will not be a straightforward task, should we maybe start with task definition, use case and data?

ejhumphrey commented 8 years ago

One possible response:

alastair commented 8 years ago

One extra source of instruments is MusicBrainz: http://musicbrainz.org/instruments

The criterion for inclusion here is "there exist at least 2 recordings in the MusicBrainz database which use this instrument". This means that every instrument on the list is related to at least a few recordings in MB. It shouldn't be difficult to get a ranking of the most popular instruments, which could help to choose the most common ones for an initial list. This could also serve as a ground truth for confirming annotations (although the annotations in MB are likely from metadata [CD listings] instead of user listening).

dmcennis commented 8 years ago

May I add, love the discussion, but can we add to this proposal 'here is the database back-end features inside annotation storage we require to implement this'? Can we then attach it to #5 ?

dmcennis commented 8 years ago

As a start, will this require a meta-table describing instrument groups? Do we need search capability on audio linked metadata to find all annotations of particular instruments?

dmcennis commented 8 years ago

Are there any additional data needed in separate tables? Do they require references to annotators? Genre or other metadata linked to specific audio? What features does the JavaScript annotator need to have to support this task?

ejhumphrey commented 8 years ago

been getting some great feedback while taking this show on the road -- one solid suggestion I wanted to repeat here is to aim for tractability / minimal ambiguity, and thus improve our chances of a successful experiment, i.e. keep it simple.

A ballpark of ≈25 instruments in year one is probably a good goal, prevents over-reaching, and might make it easier to grow the taxonomy into hierarchies later (guitar -> [guitar-acoustic-nylon, guitar-electric-clean, ...], voice -> [voice-male, voice-female, voice-choir], and so on).
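
One way to picture the "flat now, hierarchical later" idea is a prefix-based label scheme, sketched below in Python. The label names are the examples from the paragraph above; the `parent_label` and `group_by_parent` helpers and the dash-prefix convention are assumptions made for illustration, not something agreed in this thread.

```python
# Hypothetical sketch: fine-grained labels collapse to a year-one parent
# by taking the text before the first dash (an assumed naming convention).
FINE_LABELS = [
    "guitar-acoustic-nylon",
    "guitar-electric-clean",
    "voice-male",
    "voice-female",
    "voice-choir",
]


def parent_label(label: str) -> str:
    """Return the top-level instrument for a dash-separated label."""
    return label.split("-", 1)[0]


def group_by_parent(labels):
    """Build {parent: [fine labels]}, preserving input order."""
    tree = {}
    for label in labels:
        tree.setdefault(parent_label(label), []).append(label)
    return tree


taxonomy = group_by_parent(FINE_LABELS)
```

With a convention like this, year-one annotations made at the parent level would stay valid once finer labels are introduced, since every fine label still maps back to its parent.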

I'm curious to dig into the Jamendo collection and see what track instrument tag occurrence counts look like. Might help steer things a little bit. Thoughts?

ffont commented 8 years ago

Hi, I'll extract the Jamendo instruments taxonomy from their site and also gather a list of instrument annotations from the metadata. I'll upload the data here so we can have a look. I guess I'll have collected all the data between today and Monday.

ejhumphrey commented 8 years ago

coooool, that'd be great! Specifically I was planning on trying to compile a JSON dump like

{
  "track_id_0000": [ "instrument_A", "instrument_D", ...],
  "track_id_0001": [ "instrument_C" ],
  ... 
}

so that we could (a) sample track IDs based on instrument occurrence, (b) take a look at the trackwise co-occurrence matrix, and (c) look at an overall instrument distribution. Is that what you were thinking or something better / different?
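
A rough Python sketch of how (a), (b) and (c) could be computed from a dump of that shape. The three-track dictionary is a toy stand-in with placeholder IDs and labels, and `sample_tracks` is a hypothetical helper, not existing project code.

```python
import random
from collections import Counter
from itertools import combinations

# Toy stand-in for the proposed dump; IDs and labels are placeholders.
track_instruments = {
    "track_id_0000": ["instrument_A", "instrument_D"],
    "track_id_0001": ["instrument_C"],
    "track_id_0002": ["instrument_A", "instrument_C", "instrument_D"],
}

# (c) overall instrument distribution: number of tracks carrying each tag
distribution = Counter(
    inst for tags in track_instruments.values() for inst in set(tags)
)

# (b) track-wise co-occurrence: count each unordered pair once per track
cooccurrence = Counter()
for tags in track_instruments.values():
    for pair in combinations(sorted(set(tags)), 2):
        cooccurrence[pair] += 1


def sample_tracks(instrument, k, seed=0):
    """(a) Sample up to k track IDs tagged with the given instrument."""
    candidates = sorted(
        tid for tid, tags in track_instruments.items() if instrument in tags
    )
    rng = random.Random(seed)
    return rng.sample(candidates, min(k, len(candidates)))
```

Storing pairs as sorted tuples keeps the co-occurrence counts symmetric without building a dense matrix, which is probably enough for a first look at a few dozen labels.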

ffont commented 8 years ago

Sooner than expected (how many times does this ever happen?), I could collect the data and I just uploaded it to the repo:

(please feel free to move files around in other folders, I did not know where to place them so I created this structure hoping that other people will do similar things for their taxonomies/data and will upload it in sibling folders)

The instruments taxonomy has been manually gathered from Jamendo Licensing explore catalog interface.

Please remember that we're just computing this from a ~20k tracks set from Jamendo for which we already have data (including audio in flac format). There is more content in Jamendo that we can probably also use.

For your enjoyment, here are the 25 most used instrument annotations in the Jamendo data we have:

instrument        count
drum               5762
voice              4921
piano              4102
synthesizer        2121
bass               2061
guitar             1769
electricguitar     1631
keyboard           1156
computer           1096
acousticguitar      727
strings             577
saxophone           360
cello               224
trumpet             196
violin              156
sampler             154
drummachine         154
flute               145
classicalguitar     138
electricpiano       129
accordion           123
harp                 85
percussion           80
orchestra            78
ukulele              71

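
As a rough sketch of the shortlist idea raised earlier in the thread (take the most frequent labels, less those like "synthesizer" or "computer"), something like the following Python would do. The counts are the first ten rows of the table above; `top_labels` and the `EXCLUDED` set are hypothetical names used only for illustration.

```python
# Counts copied from the table above (first ten rows only).
jamendo_counts = {
    "drum": 5762,
    "voice": 4921,
    "piano": 4102,
    "synthesizer": 2121,
    "bass": 2061,
    "guitar": 1769,
    "electricguitar": 1631,
    "keyboard": 1156,
    "computer": 1096,
    "acousticguitar": 727,
}

# Labels suggested for exclusion earlier in the thread.
EXCLUDED = {"synthesizer", "computer"}


def top_labels(counts, n, excluded=EXCLUDED):
    """Return the n most frequent labels, skipping excluded ones."""
    kept = [(label, c) for label, c in counts.items() if label not in excluded]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in kept[:n]]


shortlist = top_labels(jamendo_counts, 5)
```
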
alastair commented 8 years ago

Here's a similar table to Frederic's, with the number of instrument annotations in MusicBrainz.

Note that this only finds annotations present on recordings ("Person x played instrument y on recording z"). This relation also exists at an Album level. Voice/vocals isn't included as it's a separate relation type. Some instruments are children of instruments that also exist in the list (saxophone/alto, guitar/bass/electric/acoustic/etc). Some relationships contain additional attributes, "solo", "additional", "guest", "background", which I haven't included.

Is it useful for this to get more data from this database? Any ideas about what to do next?

-- Count artist-recording relationships of type 'instrument',
-- grouped by the attribute name attached to each link.
SELECT lat.name, count(*)
FROM l_artist_recording lar
JOIN link l ON lar.link = l.id
JOIN link_type lt ON l.link_type = lt.id
JOIN link_attribute la ON l.id = la.link
JOIN link_attribute_type lat ON la.attribute_type = lat.id
WHERE lt.name = 'instrument'
GROUP BY lat.name
ORDER BY count(*) DESC;

name                    count
piano                  259005
drums                  145927
guitar                 138879
violin                 100461
bass                    87235
trumpet                 58697
cello                   52158
percussion              49340
keyboard                47182
bass guitar             37654
trombone                35779
tenor saxophone         33763
organ                   31263
double bass             28030
electric guitar         26737
harpsichord             24955
viola                   23758
flute                   21689
saxophone               21389
guest                   21364
alto saxophone          21080
strings                 21029
acoustic guitar         19213
solo                    17243
guitars                 16383
clarinet                16360
synthesizer             15424
electric bass guitar    10378
additional               9525
baritone saxophone       9186
oboe                     7861

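
To line this list up with the Jamendo one, the label spellings need reconciling ("electric guitar" vs. "electricguitar"), and relationship attributes such as "guest" and "solo" should be set aside. A minimal Python sketch, assuming a simple lowercase-and-strip-spaces rule is enough for a first pass; the `normalize` helper is hypothetical, and only a few rows from the two tables in this thread are used.

```python
# Relationship attributes that appear in the query output but are not instruments.
NON_INSTRUMENT_ATTRIBUTES = {"guest", "solo", "additional"}

# A few rows copied from the MusicBrainz and Jamendo tables in this thread.
musicbrainz_counts = {
    "piano": 259005,
    "electric guitar": 26737,
    "guest": 21364,
    "acoustic guitar": 19213,
    "solo": 17243,
    "synthesizer": 15424,
}
jamendo_labels = {
    "piano", "electricguitar", "acousticguitar", "synthesizer", "ukulele",
}


def normalize(label: str) -> str:
    """Lowercase and remove spaces so 'electric guitar' matches 'electricguitar'."""
    return label.lower().replace(" ", "")


# Keep only instrument rows, keyed by the normalized spelling.
mb_instruments = {
    normalize(name): count
    for name, count in musicbrainz_counts.items()
    if name not in NON_INSTRUMENT_ATTRIBUTES
}

# Labels present in both sources after normalization.
shared = sorted(set(mb_instruments) & jamendo_labels)
```

This rule would not catch singular/plural differences such as "drum" vs. "drums", so a small hand-written alias table would likely still be needed.
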
ejhumphrey commented 7 years ago

based on the already awesome efforts of @alastair and @ffont, I put together this instrument taxonomy proposal for further discussion.

https://docs.google.com/document/d/14Dx8X_sjYLZa4TVfdYM0ti83zwVlMuaYdfqqGO8a4Fo/edit?usp=sharing

In the interest of getting on with things and hitting our two-week milestone (https://github.com/cosmir/open-mic/milestone/1), this will be open for a week, at which point we should put together a JSON schema and submit as a pull request.

sound good?

julian-urbano commented 7 years ago

AFAIK, this is the final taxonomy, right?

If so, could someone familiar with JAMS please create a JSON file? I'll publish it, along with an HTML-friendly version, on the website.

Do we have any (sort of) official audio samples for each instrument?

ffont commented 7 years ago

Do we have any (sort of) official audio samples for each instrument?

We could easily find some in Freesound ;) I can contribute by finding a number of example sounds from Freesound for each instrument. I guess we should have a page for this with links/embeds to the sounds? We probably also need non-isolated examples of instruments (i.e. in "polyphonic" music). These will be easier to find outside Freesound.