Add support for document-level key-value metadata. I imagine something like this:
=== Variant 1
// Simplest option only allowing String key-value pairs
MetaDataEntry extends Annotation {
String: key
String: value
}
=== Variant 2
// Option only allowing basic typed key-value pairs with values represented as strings
// The type would be set if the value is not a string - and it would be set e.g. to `int`, `bool`, etc.
MetaDataEntry extends Annotation {
String: key
String: value
String: type
}
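To illustrate how a consumer might read Variant 2 entries, here is a minimal plain-Java sketch (not UIMA code). The `TypedValues` class and its `decode` method are hypothetical names, and the set of supported type names is an assumption: the proposal only mentions `int` and `bool` as examples, and `double` is added here purely for illustration.

```java
import java.util.Objects;

/** Hypothetical helper, not part of UIMA or DKPro Core. */
final class TypedValues {
    private TypedValues() {
        // static helper, no instances
    }

    /**
     * Decodes a string-encoded value according to an optional type name.
     * A null type means the value is a plain string (assumed convention).
     */
    static Object decode(String value, String type) {
        Objects.requireNonNull(value, "value");
        if (type == null) {
            return value;
        }
        switch (type) {
            case "int":
                return Integer.valueOf(value);
            case "bool":
                return Boolean.valueOf(value);
            case "double": // assumed extra type, not named in the proposal
                return Double.valueOf(value);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

A reader of the metadata would call `TypedValues.decode(entry.getValue(), entry.getType())` or similar; the getter names depend on how the type would actually be generated.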
=== Variant 3
// Rather have everything in one FS; either value or ref would be set, but not both
// If ref is set, then values would be retrieved from the linked FS (key-values again)
MetaDataEntry extends Annotation {
String: key
String: value
FeatureStructure: ref
String: type
}
=== Variant 4
// Full support for all kinds of structures, even nested entries - basically "schemaless"
MetaDataEntry extends Annotation {
String: key
}
PrimitiveMetaDataEntry extends MetaDataEntry {
String: value
String: type
}
MetaDataEntryGroup extends MetaDataEntry {
MetaDataEntry[]: items
}
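As a rough sketch of what working with the nested Variant 4 structure could look like, the following plain-Java stand-ins mirror the three types above and flatten a tree of entries into dotted key paths. These classes are not UIMA feature structures, and the `MetaDataFlattener` helper and the `.` path separator are assumptions made only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for the proposed base type (hypothetical). */
abstract class MetaDataEntry {
    final String key;

    MetaDataEntry(String key) {
        this.key = key;
    }
}

/** Leaf entry: a string-encoded value plus an optional type name. */
final class PrimitiveMetaDataEntry extends MetaDataEntry {
    final String value;
    final String type; // null is assumed to mean "plain string"

    PrimitiveMetaDataEntry(String key, String value, String type) {
        super(key);
        this.value = value;
        this.type = type;
    }
}

/** Inner node: groups further entries, which allows arbitrary nesting. */
final class MetaDataEntryGroup extends MetaDataEntry {
    final List<MetaDataEntry> items = new ArrayList<>();

    MetaDataEntryGroup(String key, MetaDataEntry... items) {
        super(key);
        this.items.addAll(Arrays.asList(items));
    }
}

/** Hypothetical helper that flattens nested entries to "a.b.c" -> value. */
final class MetaDataFlattener {
    private MetaDataFlattener() {
        // static helper, no instances
    }

    static Map<String, String> flatten(List<MetaDataEntry> entries) {
        Map<String, String> out = new LinkedHashMap<>();
        for (MetaDataEntry entry : entries) {
            collect("", entry, out);
        }
        return out;
    }

    private static void collect(String prefix, MetaDataEntry entry, Map<String, String> out) {
        String path = prefix.isEmpty() ? entry.key : prefix + "." + entry.key;
        if (entry instanceof PrimitiveMetaDataEntry) {
            out.put(path, ((PrimitiveMetaDataEntry) entry).value);
        } else if (entry instanceof MetaDataEntryGroup) {
            for (MetaDataEntry child : ((MetaDataEntryGroup) entry).items) {
                collect(path, child, out);
            }
        }
    }
}
```

The sketch shows the cost of the "schemaless" option: every consumer has to walk the tree and distinguish groups from primitive entries, whereas Variants 1 and 2 can be read with a simple loop.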
Instead of adding the MetaDataEntry to a view, it could be added to a list of MetaDataEntry created on DocumentMetaData:
DocumentMetaData extends DocumentAnnotation {
// ... all the stuff we already have in DocumentMetaData ...
MetaDataEntry[]: entries
}
An alternative to extending Annotation would be to extend TOP and add the entries only to DocumentMetaData, not to the CAS view directly. The MetaDataEntry could then no longer be retrieved via the annotation index or via offsets, but the offsets would be expected to always cover the whole document anyway. With Annotation-based entries, that expectation could become a problem and require special handling if the entries are added before the text is materialized: the respective code would have to know that all MetaDataEntry annotations need to be updated to match the materialized text in the end. UIMA handles this automatically for us for the DocumentAnnotation.