datascience / c3po

Clever, Crafty Content Profiling of Objects
http://ifs.tuwien.ac.at/imp/c3po
Apache License 2.0

C3PO saves only partial information about conflicts #59

Closed artourkin closed 8 years ago

artourkin commented 8 years ago

For the following identification conflict (taken from FITS output):

<identification status="CONFLICT">
  <identity format="Extensible Markup Language" mimetype="text/xml" toolname="FITS" toolversion="0.6.0">
    <tool toolname="Jhove" toolversion="1.5" />
    <tool toolname="OIS XML Metadata" toolversion="0.2" />
    <version toolname="Jhove" toolversion="1.5">1.0</version>
  </identity>
  <identity format="XMP" mimetype="text/xml" toolname="FITS" toolversion="0.6.0">
    <tool toolname="Exiftool" toolversion="7.74" />
  </identity>
</identification>

C3PO will store this value:

"format" : { "values" : [ "Extensible Markup Language", "XMP" ], "sources" : [ "0", "2" ], "status" : "CONFLICT" }

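This means that, when identification conflicts, C3PO saves only the first tool that reported each value. Here two tools (Jhove and OIS XML Metadata) reported "Extensible Markup Language", yet only one source is stored for it. This needs to be fixed.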
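
To illustrate what keeping every source could look like, here is a minimal Python sketch that reads the identification block above and collects all reporting tools per conflicting value. It is not C3PO's code: the function name `collect_format_sources` and the nested `sources` shape are assumptions made for illustration, and the snippet omits the XML namespace that real FITS output may carry.

```python
import xml.etree.ElementTree as ET

# The identification block from this issue, without any XML namespace.
FITS_SNIPPET = """
<identification status="CONFLICT">
  <identity format="Extensible Markup Language" mimetype="text/xml" toolname="FITS" toolversion="0.6.0">
    <tool toolname="Jhove" toolversion="1.5" />
    <tool toolname="OIS XML Metadata" toolversion="0.2" />
    <version toolname="Jhove" toolversion="1.5">1.0</version>
  </identity>
  <identity format="XMP" mimetype="text/xml" toolname="FITS" toolversion="0.6.0">
    <tool toolname="Exiftool" toolversion="7.74" />
  </identity>
</identification>
"""


def collect_format_sources(xml_text):
    """Hypothetical helper: return each format value with ALL tools that reported it."""
    root = ET.fromstring(xml_text)
    values = []
    sources = []
    for identity in root.findall("identity"):
        values.append(identity.get("format"))
        # Keep every <tool> child, not just the first one.
        sources.append(
            [f'{tool.get("toolname")} {tool.get("toolversion")}'
             for tool in identity.findall("tool")]
        )
    return {"values": values, "sources": sources, "status": root.get("status")}


record = collect_format_sources(FITS_SNIPPET)
print(record)
```

In this shape, `sources[i]` is the list of tools behind `values[i]`, so no reporting tool is dropped. Tool names are stored inline here for readability; C3PO stores source ids instead, and the same nesting would apply to them.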

artourkin commented 8 years ago

To resolve this issue, the data model needs to change, along with all map-reduce queries.
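
As a rough idea of why the queries are affected, the sketch below runs a map and reduce step in plain Python over records whose `sources` hold one list of tools per value. Everything here is an assumption for illustration: the record shape, the second sample record, and the names `map_conflicting_tools` and `reduce_counts` are hypothetical, and C3PO's actual queries are not shown in this issue.

```python
from collections import Counter

# Hypothetical records using a nested "sources" shape (not C3PO's actual schema).
records = [
    {"format": {"values": ["Extensible Markup Language", "XMP"],
                "sources": [["Jhove 1.5", "OIS XML Metadata 0.2"], ["Exiftool 7.74"]],
                "status": "CONFLICT"}},
    # Made-up non-conflicting record, included only to show it is skipped.
    {"format": {"values": ["Portable Document Format"],
                "sources": [["Jhove 1.5"]],
                "status": "SINGLE_RESULT"}},
]


def map_conflicting_tools(record):
    """Map step: emit ((tool, value), 1) for every tool behind a conflicting value."""
    prop = record["format"]
    if prop["status"] != "CONFLICT":
        return
    for value, tools in zip(prop["values"], prop["sources"]):
        # The extra loop is what a flat "one source per value" query lacks.
        for tool in tools:
            yield (tool, value), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per (tool, value) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


totals = reduce_counts(
    pair for record in records for pair in map_conflicting_tools(record)
)
for (tool, value), count in sorted(totals.items()):
    print(tool, "->", value, count)
```

Any query that currently reads one source per value would need a similar inner loop over the tools of each value.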