[13] harmonization: taxonomy, type should be case insensitive

aaronkaplan commented 8 years ago

The eCSIRT II taxonomy values should be case insensitive. Please adapt the code so that every case-sensitive match on these values becomes case insensitive.
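A minimal sketch of what a case-insensitive match could look like. The set of values below is illustrative only and is not the list IntelMQ actually ships; `is_known_taxonomy` is a hypothetical helper name.

```python
# Illustrative subset of taxonomy values, stored in lowercase.
# This is an assumption for the example, not IntelMQ's real list.
KNOWN_TAXONOMIES = {"abusive content", "malicious code", "fraud", "other"}


def is_known_taxonomy(value: str) -> bool:
    """Return True if value names a known taxonomy, ignoring case."""
    # casefold() is a stricter lower() intended for caseless comparison.
    return value.strip().casefold() in KNOWN_TAXONOMIES


print(is_known_taxonomy("Malicious Code"))
print(is_known_taxonomy("MALICIOUS CODE"))
print(is_known_taxonomy("something else"))
```

Storing the reference values in lowercase and folding only the input keeps the comparison in one place.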

dmth commented 8 years ago

This is also being discussed on the intelmq-dev mailing list. I have no objection here. Nevertheless, we should convert every taxonomy value to lowercase to simplify the processing of events by systems that care about case-sensitive values (for example, analysis of the eventDB).

bernhardreiter commented 8 years ago

How can we find all code places that have case-insensitive matches?
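One possible way to get a first list of candidates is to scan the source tree for lines that mention the two fields. This is only a rough sketch: the regular expression and the helper name `find_references` are assumptions, and it finds every reference to the fields, not just comparisons, so the hits still need manual review.

```python
import re
from pathlib import Path

# Matches the two harmonization keys discussed in this issue.
PATTERN = re.compile(r"classification\.(taxonomy|type)")


def find_references(root):
    """Yield (path, line number, stripped line) for each matching line."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                yield path, number, line.strip()


if __name__ == "__main__":
    # "." is a placeholder; point it at a checkout of the repository.
    for path, number, line in find_references("."):
        print(f"{path}:{number}: {line}")
```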

aaronkaplan commented 8 years ago

On 12 Oct 2016, at 14:12, Bernhard E. Reiter notifications@github.com wrote:

> How can we find all code places that have case-insensitive matches?

Ideally it should be documented in the DHO.

sebix commented 7 years ago

Should taxonomy and type be lower case everywhere? Or camel case?

SYNchroACK commented 7 years ago

+1 lower case

aaronkaplan commented 7 years ago

+1 lower case. Noticed that the taxonomy expert did not implement this yet.

sykaeh commented 7 years ago

Can we modify the taxonomy expert so that it also converts existing taxonomy values to lowercase? That way we can be sure that the classification.taxonomy of every event processed by the taxonomy expert is always lowercase. Right now the taxonomy expert ignores events where the taxonomy is already set.
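A sketch of the behaviour proposed here, using a plain dict in place of an IntelMQ event. The mapping table, the `"other"` fallback, and the function name `apply_taxonomy` are assumptions for illustration and do not reproduce the actual taxonomy expert.

```python
# Illustrative type -> taxonomy mapping (assumption, not the real table).
TYPE_TO_TAXONOMY = {
    "phishing": "fraud",
    "malware": "malicious code",
    "spam": "abusive content",
}


def apply_taxonomy(event: dict) -> dict:
    """Ensure classification.taxonomy is set and lowercase."""
    taxonomy = event.get("classification.taxonomy")
    if taxonomy:
        # Taxonomy already set: keep it, but normalize the case.
        event["classification.taxonomy"] = taxonomy.lower()
    else:
        # Taxonomy missing: derive it from the (lowercased) type.
        event_type = event.get("classification.type", "").lower()
        event["classification.taxonomy"] = TYPE_TO_TAXONOMY.get(event_type, "other")
    return event


print(apply_taxonomy({"classification.taxonomy": "Malicious Code"}))
print(apply_taxonomy({"classification.type": "Phishing"}))
```

The first branch is the change asked for: events that already carry a taxonomy are no longer skipped, they are lowercased.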

ghost commented 7 years ago

Good idea, thanks!

dmth commented 7 years ago

Seems like https://github.com/certat/intelmq/commit/2c6a9b034dd10c499ca26e03d9cac200f1fbf032 was never merged upstream. Was this intentional?

Uh wait, it's b9f070d1fb3790d7642409a22c4ddd015328ace7 in upstream...

dmth commented 7 years ago

Reopening this again, sigh.

Seems like the commit I mentioned was never merged to upstream/master, so this issue is not fixed.

This is how harmonization.conf has looked since 2017-02-14:

{
    "event": {
        "classification.identifier": {
            "description": "The lowercase identifier defines the actual software or service (e.g. 'heartbleed' or 'ntp_version') or standardized malware name (e.g. 'zeus').",
            "type": "String"
        },
        "classification.taxonomy": {
            "description": "We recognize the need for the CSIRT teams to apply a static (incident) taxonomy to abuse data. With this goal in mind the type IOC will serve as a basis for this activity. Each value of the dynamic type mapping translates to a an element in the static taxonomy. The European CSIRT teams for example have decided to apply the eCSIRT.net incident classification. The value of the taxonomy key is thus a derivative of the dynamic type above. For more information about check [ENISA taxonomies](http://www.enisa.europa.eu/activities/cert/support/incident-management/browsable/incident-handling-process/incident-taxonomy/existing-taxonomies).",
            "length": 100,
            "type": "String"
        },
        "classification.type": {
            "description": "The abuse type IOC is one of the most crucial pieces of information for any given abuse event. The main idea of dynamic typing is to keep our ontology flexible, since we need to evolve with the evolving threatscape of abuse data. In contrast with the static taxonomy below, the dynamic typing is used to perform business decisions in the abuse handling pipeline. Furthermore, the value data set should be kept as minimal as possible to avoid \u201ctype explosion\u201d, which in turn dilutes the business value of the dynamic typing. In general, we normally have two types of abuse type IOC: ones referring to a compromized resource or ones referring to pieces of the criminal infrastructure, such as a command and control servers for example.",
            "type": "ClassificationType"
        },
        "comment": {
            "description": "Free text commentary about the abuse event inserted by an analyst.",
            "type": "String"
        },

git branch --all --contains b9f070d1fb3790d7642409a22c4ddd015328ace7
  remotes/certat/certat-feed-doc
  remotes/certat/doc-squelcher
  remotes/certat/fix-subject-lines
  remotes/certat/fix-time-interval
  remotes/certat/fix-time-param
  remotes/certat/master
  remotes/certat/qualle
  remotes/certat/quickhack-squelch-false-positives

ghost commented 7 years ago

It's merge commit 757c7bee3bace587bef352e8402d8ad4f8e5edc8 and commit c97716ac0e2e1bf8894b85646d015a9e550656e9 but they do not touch the taxonomy.

Fixed in master now with f08c4943769240d5f67627bf97c2bfa18a7dd09e and 2e3668ad90810e869347aae7764285ca8131358e.

certtools / intelmq

[13] harmonization: taxonomy, type should be case insensitive #670