This issue reports statistics on the Norwegian Wordnet (Bokmål) from the National Library of Norway.
In the initial stage, statistics were computed over the official dataset from the National Library as a whole, showing the distribution of the number of examples, POS tags, and senses per lemma.
This stage adds more detailed statistics, covering the following points:
The distribution of unique sentences per lemma: the selection was restricted to lemmas for which the dataset provides 5 or more sentences.
A proposed division of words into categories according to their number of possible senses.
Modifications to the .rdf files in the Wordnet. The mistakes in the earlier sample of modifications have been fixed. The corrected scripts can be found here.
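The first two points can be illustrated with a minimal sketch. It is not the script referenced in this issue: the input layout (a list of `(lemma, sense_id, sentence)` triples), the function names `summarize`, `sense_category`, and `categorize`, and the category boundaries are all assumptions made for illustration. Only the threshold of 5 sentences comes from the description above.

```python
from collections import defaultdict

MIN_SENTENCES = 5  # threshold stated in this issue


def summarize(records):
    """Map each lemma to (number of unique sentences, number of senses).

    `records` is assumed to be an iterable of (lemma, sense_id, sentence)
    triples; the real dataset layout may differ.
    """
    sentences = defaultdict(set)
    senses = defaultdict(set)
    for lemma, sense_id, sentence in records:
        sentences[lemma].add(sentence.strip())  # sets drop duplicate sentences
        senses[lemma].add(sense_id)
    return {lemma: (len(sentences[lemma]), len(senses[lemma])) for lemma in sentences}


def sense_category(n_senses):
    """Assign a category by sense count (boundaries are assumed, not from the issue)."""
    if n_senses == 1:
        return "1 sense"
    if n_senses <= 3:
        return "2-3 senses"
    return "4+ senses"


def categorize(records, min_sentences=MIN_SENTENCES):
    """Keep lemmas with enough unique sentences and group them by sense category."""
    by_category = defaultdict(list)
    for lemma, (n_sentences, n_senses) in summarize(records).items():
        if n_sentences >= min_sentences:
            by_category[sense_category(n_senses)].append(lemma)
    return {category: sorted(lemmas) for category, lemmas in by_category.items()}


# Toy data: "bank" has 6 unique sentences over 2 senses; "hus" has 1 unique sentence.
records = [("bank", f"bank-{i % 2}", f"example sentence {i}") for i in range(6)]
records += [("hus", "hus-0", "example sentence a"), ("hus", "hus-0", "example sentence a")]

print(categorize(records))  # {'2-3 senses': ['bank']}
```

Sentences are deduplicated per lemma before the threshold is applied, so a lemma with many repeated examples does not pass the filter on repetition alone.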