Open odanoburu opened 5 years ago
I tend to prefer option 2.
That's my least favorite option from the implementation point of view, since it introduces more ad hoc things. The wordsense and synset datatypes we've defined already have fields for frames, while all relations are lumped together in one field. Plus, only a few adjectives have markers, yet every wordsense would end up carrying this field, unless we can think of a better representation. I don't like the idea of treating adjectives specially, but maybe that's one way to go.
data WNWord = WNWord WordSenseIdentifier [FrameIdentifier] [WordPointer]
  deriving (Show,Eq)

-- synsets can be
data Unvalidated
data Validated

data Synset a = Synset
  { sourcePosition      :: SourcePosition
  , lexicographerFileId :: LexicographerFileId
  , wordSenses          :: NonEmpty WNWord
  , definition          :: Text
  , examples            :: [Text]
  , frames              :: [Int]
  , relations           :: NonEmpty SynsetRelation
  } deriving (Show,Eq)
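To make the "every wordsense ends up with this field" concern concrete, here is a rough sketch of what option 2 might look like if the marker were an optional field. Everything in it is an assumption for illustration: `SyntacticMarker`, `WordSense`, `senseMarker` and `parseMarker` are hypothetical names, the identifiers are simplified to plain strings, and the three constructors mirror the `(p)`, `(a)` and `(ip)` adjective markers described in wninput(5WN).

```haskell
-- Hypothetical sketch: none of these names come from the actual code base.
-- The constructors follow the (p), (a), (ip) markers of wninput(5WN).
data SyntacticMarker
  = Predicate               -- (p): predicate position
  | Prenominal              -- (a): prenominal (attributive) position
  | ImmediatelyPostnominal  -- (ip): immediately postnominal position
  deriving (Show, Eq)

-- Simplified stand-in for WNWord: identifiers are plain strings here
-- and the pointer list is left out.
data WordSense = WordSense
  { senseId     :: String
  , senseFrames :: [Int]
  , senseMarker :: Maybe SyntacticMarker  -- Nothing for senses without a marker
  } deriving (Show, Eq)

-- Interpret the text found between the parentheses of a marker.
parseMarker :: String -> Maybe SyntacticMarker
parseMarker "p"  = Just Predicate
parseMarker "a"  = Just Prenominal
parseMarker "ip" = Just ImmediatelyPostnominal
parseMarker _    = Nothing
```

With a `Maybe` field the cost for non-adjectives is a `Nothing` in every value, which is the ad hoc part being objected to; the upside is that the marker stays next to the sense it annotates.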
(from https://wordnet.princeton.edu/documentation/wninput5wn)
How should we represent them in the text format?
I think they are similar to frames, so I see three options:
1. encode them as frames;
2. include them as another ad hoc thing, like frames, but with its own name;
3. put this information in a separate file (I'm thinking we might want to have a few of those anyway, so this information could be shown in the emacs mode and even be editable there).
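For comparison, option 3 could be sketched as a small side table that is read from its own file and consulted by sense identifier. The file format here is purely an assumption (one `<sense identifier> <marker>` pair per line), and `MarkerTable`, `parseMarkerFile` and `markerOf` are hypothetical names, not part of the existing code.

```haskell
-- Hypothetical file format, one entry per line: <sense identifier> <marker>
-- Both the format and the names below are assumptions for illustration.
type MarkerTable = [(String, String)]

-- Keep well-formed lines with a known marker; skip everything else.
parseMarkerFile :: String -> MarkerTable
parseMarkerFile contents =
  [ (sense, marker)
  | line <- lines contents
  , [sense, marker] <- [words line]    -- lines without exactly two fields are dropped
  , marker `elem` ["p", "a", "ip"]
  ]

-- Look up the marker of a sense, if it has one.
markerOf :: MarkerTable -> String -> Maybe String
markerOf table sense = lookup sense table
```

This keeps the wordsense datatype untouched, at the price of having to keep the side file in sync with the sense identifiers in the main files.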