Closed by DavidLeoni 7 years ago
I briefly reviewed some classification systems. Full classifications are usually under a closed license, but a summary is often made available under a more permissive license.
System | License | Summary | Pros | Cons |
---|---|---|---|---|
Dewey Decimal Classification (DDC) | closed + | summary available, but unclear how it can be reused | widely used, regularly updated, has just 10 top domains | closed license, unclear summary reuse terms |
Universal Decimal Classification (UDC) | full classification closed, summary open ++ | summary available and reusable, RDF available | clear license, open summary, regularly updated, has just 10 top domains, RDF available | full classification under a closed license |
Library of Congress | TODO | TODO | TODO | TODO |
Colon Classification | TODO | TODO | TODO | mainly used in India; the official reference seems to be a book, and it is hard to find even a reference website |
BISAC | closed +++ | available but not reusable | TODO | reuse requires payment |
+: from Dewey FAQ
++: from UDC licence: "You do not need a licence to use UDC if you want to publish, distribute or use in any other way the 2,600 classes published in the UDC Summary."
+++: the BISAC online list can be accessed freely, but you need to pay to reuse it in your own databases
We will go with UDC. On the website we will show only the top-level categories.
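As a rough sketch of what "only the top-level categories" could look like in code: the mapping below keys the ten UDC main classes by their first digit and resolves any plain numeric notation to its main class. The names `UDC_TOP_CLASSES` and `top_class` are made up for illustration, and the labels are paraphrased from memory, so the official captions should be taken from the UDC Summary before anything goes on the website.

```python
# Hypothetical lookup table for the UDC main classes.
# Labels are paraphrased; check the UDC Summary for the official captions.
UDC_TOP_CLASSES = {
    "0": "Science and knowledge. Organization. Computer science. Information",
    "1": "Philosophy. Psychology",
    "2": "Religion. Theology",
    "3": "Social sciences",
    "4": None,  # class 4 is currently vacant in UDC
    "5": "Mathematics. Natural sciences",
    "6": "Applied sciences. Medicine. Technology",
    "7": "The arts. Recreation. Entertainment. Sport",
    "8": "Language. Linguistics. Literature",
    "9": "Geography. Biography. History",
}


def top_class(notation):
    """Return (digit, label) for the main class of a UDC notation like '004.8'.

    Only notations that start with a digit are handled; notations that begin
    with an auxiliary sign such as '(' or '=' raise ValueError.
    """
    notation = notation.strip()
    if not notation or notation[0] not in UDC_TOP_CLASSES:
        raise ValueError(f"not a main-class notation: {notation!r}")
    digit = notation[0]
    return digit, UDC_TOP_CLASSES[digit]
```

For example, `top_class("51")` resolves to main class `5`. Since class 4 is vacant, the website would effectively list nine categories unless we decide to show the empty slot.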
So far I didn't manage to find any serious documentation on WordNet topics strictly speaking. However, there is material on the general topic of domain categories for lexical resources:
According to the paper above, the somewhat ad-hoc original WordNet Domains (WND) hierarchy was updated to conform better to certain theoretical criteria as well as to existing authoritative document classification schemes, in particular the Dewey Decimal Classification. The paper contains an annex describing the first two levels of the updated WordNet Domains hierarchy. The full tree of the updated hierarchy does not seem to be freely available online.
The way to go seems to be to adopt one of the authoritative classification schemes such as the DDC or Colon Classification. The updated WND hierarchy, if we can find it, would be a very good candidate.