Issue
We are using stixmarx in order to determine the TLP for entities within STIX packages. I noticed that if a node like <FileObj:Size_In_Bytes>3282</FileObj:Size_In_Bytes> is present more than once within a given XML file and said XML file has a STIXHeader that globally applies a TLP marking to all nodes and attributes within the document, only one of the repeated nodes would be given a marking by stixmarx. Per the previous example, only one UnsignedLong object associated with the value 3282 would have a __datamarkings__ attribute. Using get_markings with the passed-in data being a subsequent object with the same value results in an empty list.
This issue can be reproduced with the following XML:
Solution
Parsed entities in stixmarx/parser.py are being collected in a set. As a result, any entities that are equal to an entity that exists in the set will be discarded. I'm not sure if this deduplication of entities is the intended behavior, but it appears to produce unintended results. I propose changing the set to a list.
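The effect described above can be illustrated with a minimal, self-contained sketch. It does not use stixmarx or its parser code: the ParsedValue class, the mark_entities helper, and the "TLP:AMBER" string are hypothetical stand-ins, and the sketch assumes only that parsed entities compare equal (and hash equally) when their values match.

```python
class ParsedValue:
    """Hypothetical stand-in for a parsed entity with value-based equality."""

    def __init__(self, value):
        self.value = value
        self.markings = []  # stand-in for the markings attached to the entity

    def __eq__(self, other):
        return isinstance(other, ParsedValue) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def mark_entities(entities, container_type, marking="TLP:AMBER"):
    """Collect entities in the given container type, then mark what was kept."""
    collected = container_type(entities)  # a set drops entities equal to one already held
    for entity in collected:
        entity.markings.append(marking)
    return [entity.markings for entity in entities]


def two_equal_entities():
    """Two distinct objects carrying the same value, like two repeated nodes."""
    return [ParsedValue(3282), ParsedValue(3282)]


# With a set, the second equal-valued object is discarded and never marked.
print(mark_entities(two_equal_entities(), set))   # [['TLP:AMBER'], []]

# With a list, both objects are kept and both receive the marking.
print(mark_entities(two_equal_entities(), list))  # [['TLP:AMBER'], ['TLP:AMBER']]
```

In this sketch the unmarked object plays the role of the subsequent object for which an empty list of markings is returned.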