Closed ross-spencer closed 7 years ago
thanks @ross-spencer - yep, I'll need to add escaping for the format field (I think it just quotes the field at present). Looks like a second little bug in the basis field for the tika ID too... the text match info appears twice for some reason.
Was it actually a Monkey's audio in the end, or was that a false positive?
I escaped my code too: https://github.com/exponential-decay/droid-siegfried-sqlite-analysis-engine/issues/39 though there's a better way to do it in Python that I need to investigate.
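For what it's worth, one common way to avoid hand-escaping in Python is to let `sqlite3` bind the values through `?` placeholders. A minimal sketch, assuming a made-up table `ids` with `filename` and `format` columns (these names are hypothetical and not taken from the linked project):

```python
import sqlite3

# Hypothetical schema for illustration only; not the linked project's tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ids (filename TEXT, format TEXT)")

# The format name contains a single quote, which would break a hand-built SQL string.
row = ("example.txt", "Monkey's Audio")

# "?" placeholders make sqlite3 bind the values, so no manual escaping is needed.
conn.execute("INSERT INTO ids (filename, format) VALUES (?, ?)", row)

stored = conn.execute(
    "SELECT format FROM ids WHERE filename = ?", ("example.txt",)
).fetchone()[0]
print(stored)
```

The quote is stored as-is because the value never becomes part of the SQL text.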
Unfortunately only a false positive, the sig in Freedesktop's file is:
<magic priority="50">
<match value="MAC " type="string" offset="0"/>
</magic>
<glob pattern="*.ape" weight="50"/>
The txt file happens to be an OCR of a PDF that starts with the word MASTERTON, which has been recognized by the OCR engine as MAC. (A whole stack of things to untangle here!)
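To show why a rule like that fires on plain text, here is a rough sketch of a byte-string match at a fixed offset. It only mimics the `MAC ` at offset 0 rule quoted above; it is not how Tika or Siegfried implement matching, and the sample bytes are invented rather than taken from the actual file:

```python
# Values copied from the freedesktop rule quoted above.
MAGIC = b"MAC "
OFFSET = 0


def matches_magic(data: bytes, magic: bytes = MAGIC, offset: int = OFFSET) -> bool:
    """Return True if `magic` appears in `data` exactly at `offset`."""
    return data[offset:offset + len(magic)] == magic


# Invented samples: text whose first word was OCR'd as "MAC", and the intended word.
ocr_output = b"MAC rest of the OCR text"
intended = b"MASTERTON rest of the text"

print(matches_magic(ocr_output))  # True: a four-byte prefix is enough to match
print(matches_magic(intended))    # False: "MAST" is not "MAC "
```

With only four bytes to go on, any text file that happens to begin with `MAC ` will look like Monkey's Audio to this rule.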
Oh! I just noticed I got issue #100 :D (a good day!)
fixed with 1.7.3 release
Related to #30?
I've this block:
I think the single quote needs escaping in "Monkey's Audio"... but am not sure. I do get some errors trying to parse the YAML online, e.g. http://yaml-online-parser.appspot.com/ - and the original issue was spotted parsing the structure into SQLite. Will likely add an escape to my code as well, but it may catch others out.
PS. I can't help but laugh at the mimetype!
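On the quoting question: in YAML's single-quoted style, a literal single quote is written by doubling it. A small sketch of that rule, using a hypothetical helper name (`yaml_single_quote` is not from Siegfried or any YAML library):

```python
def yaml_single_quote(value: str) -> str:
    """Wrap `value` as a YAML single-quoted scalar, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


print(yaml_single_quote("Monkey's Audio"))  # 'Monkey''s Audio'
```

Emitting the value in double quotes instead would also keep the apostrophe intact, though double-quoted scalars then need backslashes and double quotes escaped.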