CybOXProject / schemas

CybOX Schemas and Schema Development

Lists in CybOX Object fields #380

Open c-x opened 8 years ago

c-x commented 8 years ago

Another point: CybOX allows the ‘##’ notation as a way to describe a list of objects. I think this notation is neither efficient nor convenient, so I see two options:

1- We don’t need lists. The standard already allows multiple objects of the same nature to be described several times within the same IOC file, so lists add nothing and this notation disappears.

2- We need lists, which means we need a proper object to handle them rather than a trick like the current notation. For example, something like ,, ... (or maybe the “relatedTo” could do the job?)

ikiril01 commented 8 years ago

Agree that lists need revisiting and that the current implementation is painful to deal with. I think the core issue is whether we need to support list-based indicator matching (e.g., matching against a list of File Names), or whether this should instead be performed using another method (e.g., Boolean composition).

packet-rat commented 8 years ago

While using compound indicators is overly verbose when one wishes to pass just a list of objects without any substantive context, I'm not a fan of the latter. For this reason and in support of those making [good] arguments for narrowing the variant forms of expression [+1] on eliminating this somewhat "chunky" list representation form.

Patrick Maroney (609)841-5104


JasonKeirstead commented 8 years ago

I agree the current list notation is horrible. I also agree the overhead of compound indicators for something like an IP watch list would be very problematic when we're trying to compact these protocols, not make them more verbose.

A possible solution is to add an optional "list_delimiter" attribute to objects so that people can pick their own delimiters. The default delimiter could be left at "##".

johnwunder commented 8 years ago

@JasonKeirstead that's actually exactly how it is now :)

JasonKeirstead commented 8 years ago

@johnwunder It is? Well color me surprised :) I have never seen anyone use anything but the ## delimiter in any example.

Well, my work here is done!

ikiril01 commented 8 years ago

Yup! As @johnwunder mentioned, we have exactly that already with the existing "delimiter" attribute (see http://stixproject.github.io/data-model/1.2/cyboxCommon/BaseObjectPropertyType/). Probably something that we never explained terribly well, so it's not surprising that everyone uses "##" by default :)
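
As a rough illustration of how a consumer might honor that attribute, the sketch below splits a field value on its "delimiter" attribute and falls back to "##" when none is given. The helper name `split_field` and the namespace-free sample elements are hypothetical and simplified; this is not taken from any CybOX library.

```python
import xml.etree.ElementTree as ET

DEFAULT_DELIMITER = "##"  # default used when no delimiter attribute is present


def split_field(element):
    """Return the list of values held by a CybOX-style field element.

    Hypothetical helper: uses the element's "delimiter" attribute if present,
    otherwise the default "##".
    """
    delimiter = element.get("delimiter", DEFAULT_DELIMITER)
    text = element.text or ""
    return [item.strip() for item in text.split(delimiter)]


# Namespace-free sample elements, simplified for illustration only.
default_list = ET.fromstring(
    "<Address_Value>199.192.156.134##199.192.156.135</Address_Value>"
)
custom_list = ET.fromstring(
    '<Address_Value delimiter=",">199.192.156.134,199.192.156.135</Address_Value>'
)

default_values = split_field(default_list)
custom_values = split_field(custom_list)
```

Both calls yield the same two addresses; only the separator in the serialized field differs.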

One of the other side effects of lists as they are currently implemented is that they don't allow for field-level data validation (e.g. of MAC addresses), since every field MUST be capable of holding lists. This would prevent atomic Object validation as suggested in #379.
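
To make the validation problem concrete, here is a minimal sketch. The colon-separated MAC pattern is my own assumption for illustration, not the schema's definition. A strict field-level pattern rejects a "##" list even though every item in it is valid, so a validator would have to split the field first.

```python
import re

# Assumed pattern for a colon-separated MAC address; illustrative only.
MAC_PATTERN = re.compile(r"^(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$")


def is_valid_mac_field(value):
    """Strict field-level check: the whole field must be one MAC address."""
    return MAC_PATTERN.match(value) is not None


def is_valid_mac_list(value, delimiter="##"):
    """List-aware check: every delimited item must be a MAC address."""
    return all(is_valid_mac_field(item) for item in value.split(delimiter))


single = "00:1A:2B:3C:4D:5E"
listed = "00:1A:2B:3C:4D:5E##00:1A:2B:3C:4D:5F"

strict_single = is_valid_mac_field(single)  # a single value passes
strict_listed = is_valid_mac_field(listed)  # the list fails the strict check
split_listed = is_valid_mac_list(listed)    # passes only after splitting
```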

c-x commented 8 years ago

Processing-wise, it will be much faster to process two objects than one object that may or may not contain a list in '##' notation.

The current notation also doesn't allow different object types in the same list. I'm not sure whether that is needed or whether permitting it would be meaningful, but we can imagine the following list object:

<cybox:ListObject id="example:Object-dae8802e-b0df-4989-9ac3-d816b153842b">
    <cybox:Properties xsi:type="FileObj:FileObjectType">
        <FileObj:File_Name pattern_type="Regex">bad_file[0-9]{2,5}\.exe</FileObj:File_Name>
    </cybox:Properties>
    <cybox:Properties xsi:type="AddressObj:AddressObjectType" category="ipv4-addr">
        <AddressObj:Address_Value>199.192.156.134</AddressObj:Address_Value>
    </cybox:Properties>
</cybox:ListObject>

This list discussion would probably be better served under the STIX standard, since that is the layer that provides the Boolean logic. In my understanding, CybOX describes atomic objects and STIX describes how to link them together.

JasonKeirstead commented 8 years ago

The processing may be faster but the bulk of the XML will be far greater. Imagine having a watchlist with 10,000 entries in this format.

Couldn't we just allow multiple value objects to create the list?

I.e.:

<cybox:Properties xsi:type="AddressObj:AddressObjectType" category="ipv4-addr">
        <AddressObj:Address_Value>199.192.156.134</AddressObj:Address_Value>
        <AddressObj:Address_Value>199.192.156.135</AddressObj:Address_Value>
        <AddressObj:Address_Value>199.192.156.136</AddressObj:Address_Value>
</cybox:Properties>
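
For anyone who wants to put rough numbers on the size question, the sketch below serializes the same synthetic watchlist three ways: one Properties block per address, repeated value elements as in the snippet above, and a single "##"-delimited field. The addresses are made up, the builder names are hypothetical, and the per-object form omits the surrounding Observable/Object wrappers, so it understates the real overhead.

```python
# Synthetic watchlist of 10,000 made-up IPv4 addresses (illustrative data only).
ADDRESSES = [f"10.0.{i // 256}.{i % 256}" for i in range(10_000)]

OPEN_PROPS = (
    '<cybox:Properties xsi:type="AddressObj:AddressObjectType" '
    'category="ipv4-addr">'
)
CLOSE_PROPS = "</cybox:Properties>"


def value_element(text):
    """Wrap text in a single Address_Value element."""
    return f"<AddressObj:Address_Value>{text}</AddressObj:Address_Value>"


def one_object_per_address(addresses):
    """One Properties block per address (wrappers omitted for brevity)."""
    return "".join(OPEN_PROPS + value_element(a) + CLOSE_PROPS for a in addresses)


def repeated_values(addresses):
    """One Properties block holding many Address_Value elements (proposed form)."""
    return OPEN_PROPS + "".join(value_element(a) for a in addresses) + CLOSE_PROPS


def delimited_list(addresses, delimiter="##"):
    """One Address_Value element holding a delimited list (current form)."""
    return OPEN_PROPS + value_element(delimiter.join(addresses)) + CLOSE_PROPS


# Size in bytes of each serialization, keyed by builder name.
sizes = {
    builder.__name__: len(builder(ADDRESSES).encode("utf-8"))
    for builder in (one_object_per_address, repeated_values, delimited_list)
}
```

Inspecting `sizes` gives a byte count per representation; the ordering follows from the per-entry overhead (a two-character delimiter versus a pair of tags versus a full Properties block).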

c-x commented 8 years ago

I don't think the size of XML files is an issue.

I ran a quick test to get numbers:

On the other hand, we can already embed full binaries in base64 in STIX/CybOX. Plus, the size is tied to the XML format; it would be lighter in JSON, I presume.