This bug was mentioned in #100, but I'm adding a second ticket to track it. I have made a signature file that makes the issue more obvious (it contains the tika, freedesktop.org and pronom identifiers, in that order).
I get these results on a text file:
---
siegfried : 1.7.2
scandate : 2017-05-15T11:54:23+10:00
signature : default.sig
created : 2017-05-15T11:52:05+10:00
identifiers :
  - name : 'tika'
    details : 'tika-mimetypes.xml'
  - name : 'freedesktop.org'
    details : 'freedesktop.org.xml'
  - name : 'pronom'
    details : 'DROID_SignatureFile_V88.xml; container-signature-20160927.xml'
---
filename : 'bla.txt'
filesize : 27
modified : 2017-05-15T11:52:37+10:00
errors :
matches :
  - ns : 'tika'
    id : 'text/plain'
    format :
    mime : 'text/plain'
    basis : 'extension match txt; text match ASCII; text match ASCII; text match ASCII'
    warning : 'match on filename and text only; byte/xml signatures for this format did not match'
  - ns : 'freedesktop.org'
    id : 'UNKNOWN'
    format :
    mime : 'UNKNOWN'
    basis :
    warning : 'no match; possibilities based on filename are text/plain'
  - ns : 'pronom'
    id : 'UNKNOWN'
    format :
    version :
    mime :
    basis :
    warning : 'no match; possibilities based on extension are x-fmt/111'
In this example, the tika identifier "steals" all the text hits from the subsequent identifiers and reports them in its own result (note the three "text match ASCII" entries in its basis, one per identifier). Ross's example showed the pronom and tika identifiers both getting a text match, so it seems this issue is probably in the mimeinfo code (i.e. pronom identifiers are not stealing text hits).
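To make the symptom easier to check, here is a minimal sketch in plain Python (not siegfried code; `bases` and `count_text_hits` are names made up for illustration) that counts the text hits in each basis string copied from the output above. With three identifiers loaded, the expectation would be one text hit per namespace; instead all three land on tika.

```python
# Basis strings copied from the result above, keyed by namespace.
# Empty strings stand for the empty "basis :" fields.
bases = {
    "tika": "extension match txt; text match ASCII; text match ASCII; text match ASCII",
    "freedesktop.org": "",
    "pronom": "",
}


def count_text_hits(basis):
    """Count the 'text match' entries in a semicolon-separated basis string."""
    return sum(1 for part in basis.split(";") if part.strip().startswith("text match"))


text_hits = {ns: count_text_hits(basis) for ns, basis in bases.items()}

for ns, n in text_hits.items():
    print(f"{ns}: {n} text hit(s)")
```

The total number of text hits equals the number of identifiers, which is consistent with each identifier's text hit being credited to the first identifier rather than being dropped. This only describes the output; it does not show where in the code the hits are misattributed.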