We should consider refactoring the existing HashType in CybOX Common, as there are several issues around it:
The structure is overly verbose and heavyweight for capturing and parsing ubiquitous hash types such as MD5, SHA1, and SHA256, which are arguably by far the most prevalent hashes in cyber threat characterization today. Currently, users must select the correct value from the default HashNameVocab vocabulary, populate the Type field with that value, set the field's xsi:type to point to the vocabulary, and only then populate the Simple_Hash_Value field with the actual hash value.
The structure has separate fields for capturing simple and fuzzy hash values (Simple_Hash_Value and Fuzzy_Hash_Value, respectively), both fundamentally string values. This seems an unnecessary distinction, as simply specifying the type of a hash (e.g., SSDeep) provides the necessary context for identifying it as simple or fuzzy.
Patterning against the structure is semantically confusing, since a pattern must be written against both the Type and *_Hash_Value fields.
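To make the issues above concrete, here is a rough sketch of what a consumer has to do to read a single MD5 out of the current structure. The namespace URIs, the `HashNameVocab-1.0` xsi:type value, and the `read_hash` helper are assumptions for illustration and should be checked against the schema version in use; the hash value is simply the MD5 of the empty string.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for CybOX Common; verify against your schema version.
NS = {"cyboxCommon": "http://cybox.mitre.org/common-2"}

# A single MD5 expressed with the current HashType layout (illustrative only).
VERBOSE_HASH = """
<cyboxCommon:Hash
    xmlns:cyboxCommon="http://cybox.mitre.org/common-2"
    xmlns:cyboxVocabs="http://cybox.mitre.org/default_vocabularies-2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <cyboxCommon:Type xsi:type="cyboxVocabs:HashNameVocab-1.0">MD5</cyboxCommon:Type>
  <cyboxCommon:Simple_Hash_Value>d41d8cd98f00b204e9800998ecf8427e</cyboxCommon:Simple_Hash_Value>
</cyboxCommon:Hash>
""".strip()


def read_hash(xml_text):
    """Hypothetical helper: return (hash_type, hash_value) from a Hash element."""
    root = ET.fromstring(xml_text)
    hash_type = root.findtext("cyboxCommon:Type", namespaces=NS)
    # Both value fields hold plain strings, yet the consumer must probe each one.
    value = root.findtext("cyboxCommon:Simple_Hash_Value", namespaces=NS)
    if value is None:
        value = root.findtext("cyboxCommon:Fuzzy_Hash_Value", namespaces=NS)
    return hash_type, value


if __name__ == "__main__":
    print(read_hash(VERBOSE_HASH))
```

Note that three elements and an xsi:type attribute are needed to carry two strings, and that any pattern or parser has to consult both Type and one of the *_Hash_Value fields to interpret the value.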