Based on the original data, I'd pivot the list so that each word occurs only once, like this:
ENGLISH | POS | SEMANTIC_CLASS | ATTRIBUTES | COORDINATE | EVENT | HYPERONYM | MERONYM | RANDOM |
---|---|---|---|---|---|---|---|---|
alligator | n | amphibian_reptile | aggressive, aquatic, big, carnivorous, dangerous, ferocious, frightening, green, heavy, hungry, large, long, old, scary, wild, young | crocodile, frog, lizard, snake, toad, turtle | attack, bask, breathe, chase, die, drink, eat, frighten, hide, hunt, kill, live, poach, run-, shoot, sleep, swim, walk | animal, beast, carnivore, chordate, creature, predator, reptile, vertebrate | eye, foot, jaw, leg, mouth, scale, skin, tail, tooth | cardiac, constructive, electronic, experienced, impulsive, likely, minimum, possible, previous, social, syntactic, twin, unbelievable, addition, alternative, answer, arrears, clone, contestant, continent, courthouse, dock, handgun, message, methyl, recombination, rectifier, st, teenager, trombone, vitro, administer, admire, conclude, enable, experience, fetch, find, implement, label, propel, redesignated, remember, root, unfurl, view, warn |
Does this structure make sense?
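A minimal sketch of how such a pivot could be done, assuming the original data is a long list with one word-relation-relatum triple per row. The row layout, the `rows` sample, and the `pivot` helper are hypothetical and only illustrate the idea; the values are taken from the alligator row above.

```python
from collections import defaultdict

# Hypothetical long-format rows: (word, POS, semantic class, relation, relatum).
# The actual layout of the original data may differ.
rows = [
    ("alligator", "n", "amphibian_reptile", "COORDINATE", "frog"),
    ("alligator", "n", "amphibian_reptile", "COORDINATE", "crocodile"),
    ("alligator", "n", "amphibian_reptile", "MERONYM", "tail"),
    ("alligator", "n", "amphibian_reptile", "MERONYM", "eye"),
    ("alligator", "n", "amphibian_reptile", "HYPERONYM", "reptile"),
]

# Relation columns as they appear in the table header above.
RELATIONS = ["ATTRIBUTES", "COORDINATE", "EVENT", "HYPERONYM", "MERONYM", "RANDOM"]


def pivot(rows, sep=", "):
    """Collapse the long list into one record per word, joining relata with sep."""
    grouped = defaultdict(lambda: defaultdict(set))
    for word, pos, sem_class, relation, relatum in rows:
        grouped[(word, pos, sem_class)][relation].add(relatum)

    table = []
    for (word, pos, sem_class), relations in grouped.items():
        record = {"ENGLISH": word, "POS": pos, "SEMANTIC_CLASS": sem_class}
        for relation in RELATIONS:
            # Relations without any relatum become an empty cell.
            record[relation] = sep.join(sorted(relations.get(relation, ())))
        table.append(record)
    return table


wide = pivot(rows)
print(wide[0]["COORDINATE"])
```

Sorting the relata inside each cell keeps the output stable across runs, which makes diffs of the resulting file easier to read.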
Yes. The separator inside the cell would then be `, ` (comma and space). This can be added as a description to the metadata.json, but I keep forgetting how this is expressed. There is a description on cldf.clld.org, though, since this is the format we also use for our Segments (but with a simple space as separator).
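As far as I can tell, the mechanism meant here is the `separator` property that CSVW (on which CLDF metadata is based) allows on a column description. The following is only a sketch under that assumption: the column name is taken from the table above, the `split_cell` helper is hypothetical, and where exactly the column description goes in metadata.json should be checked against the documentation on cldf.clld.org.

```python
import json

# Assumed column description for one of the list-valued columns.
# "separator" is the CSVW property for list-valued cells; the rest of the
# surrounding metadata.json is not shown here.
column = {
    "name": "COORDINATE",
    "datatype": "string",
    "separator": ", ",
}


def split_cell(cell, column):
    """Hypothetical helper: split a raw cell according to the column's separator."""
    sep = column.get("separator")
    if sep is None:
        return cell
    # An empty cell is read as an empty list rather than [""].
    return cell.split(sep) if cell else []


print(json.dumps(column, indent=2))
values = split_cell("crocodile, frog, lizard, snake, toad, turtle", column)
print(values)
```

With a plain space as separator, the same helper would split a Segments cell into its individual sound segments.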
The BLESS dataset offers some interesting semantic associations (hyperonymy, etc.) for 200 concrete words.
The data may be interesting for the Concepticon, since it offers additional accounts of semantic relations between words/concepts.