diachron / quality

Dataset Quality Assessment (part of WP5 of the Diachron EU FP7 project)
MIT License

EmptyAnnotationValue metric #32

Closed: clange closed this issue 10 years ago

clange commented 10 years ago

Implement a metric EmptyAnnotationValue (category: Representational dimensions; dimension: Understandability) that identifies triples whose property is from a pre-configured list of annotation properties and whose object is an empty string.

We consider the following widely used annotation properties (labels, comments, notes, etc.):

For now, this list of properties can be hard-coded; we might think about a more extensible implementation later.

E.g. a triple like the following should be matched:

<http://...> <http://www.w3.org/2000/01/rdf-schema#comment> "" .

The metric value is defined as the ratio of annotations with empty objects to all annotations (i.e. all triples having such properties).
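As a rough illustration of this definition, the ratio could be computed as in the sketch below. This is not the project's implementation: the `Triple` type, the function name `empty_annotation_value`, and the three properties in `ANNOTATION_PROPERTIES` are assumptions made for the example (the property list is illustrative, not the list this issue refers to). Returning 0.0 when the dataset contains no annotation triples is also an assumption, since the issue does not define that case.

```python
from typing import Iterable, NamedTuple

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Illustrative, hard-coded property list (an assumption for this sketch,
# not the list referred to in the issue).
ANNOTATION_PROPERTIES = frozenset({
    RDFS + "label",
    RDFS + "comment",
    SKOS + "note",
})


class Triple(NamedTuple):
    """Hypothetical minimal triple: obj holds the lexical value or the IRI."""
    subject: str
    predicate: str
    obj: str
    is_literal: bool = True


def empty_annotation_value(
    triples: Iterable[Triple],
    properties: frozenset = ANNOTATION_PROPERTIES,
) -> float:
    """Ratio of annotation triples with an empty-string object to all annotation triples."""
    total = 0
    empty = 0
    for triple in triples:
        if triple.predicate not in properties:
            continue  # not an annotation, so it does not count at all
        total += 1
        if triple.is_literal and triple.obj == "":
            empty += 1
    # Assumption: a dataset without annotations yields 0.0 rather than an error.
    return empty / total if total else 0.0


if __name__ == "__main__":
    sample = [
        Triple("http://example.org/a", RDFS + "comment", ""),
        Triple("http://example.org/a", RDFS + "label", "A"),
        Triple("http://example.org/b", RDFS + "label", ""),
        Triple("http://example.org/b", SKOS + "note", "a note"),
        Triple("http://example.org/b", "http://example.org/knows",
               "http://example.org/a", is_literal=False),
    ]
    # 2 of the 4 annotation triples have an empty object.
    print(empty_annotation_value(sample))
```

The sketch matches only the exact empty string, as the issue specifies; whitespace-only values are not treated as empty.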

(Background: D3.1 Table 20 on page 91)

Cc: @nfriesen

muhammadaliqasmi commented 10 years ago

The EmptyAnnotationValue metric implementation considers widely used annotation properties (labels, comments, notes, etc.) and identifies triples whose property is from a pre-configured list of annotation properties and whose object is an empty string. The list of widely used annotation properties is stored in ..src/main/resources/AnnotationPropertiesList.txt
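Reading such a property list from a text file might look like the sketch below. The file format is not described in this thread, so the sketch assumes one property IRI per line, optionally wrapped in angle brackets, with blank lines and lines starting with `#` skipped. The function name `load_annotation_properties` is hypothetical.

```python
from pathlib import Path


def load_annotation_properties(path: str) -> frozenset:
    """Load property IRIs from a text file.

    Assumed format (not confirmed by the issue): one IRI per line,
    optionally written as <IRI>; blank lines and '#' comment lines are skipped.
    """
    properties = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        # Accept both bare IRIs and N-Triples-style <IRI> entries.
        properties.add(entry.strip("<>"))
    return frozenset(properties)
```

Keeping the list in a resource file rather than in code means properties can be added without recompiling, which is one step towards the more extensible implementation mentioned earlier in the thread.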

Implemented in the issue#32 branch; the issue#32 branch has been merged with the master branch.