Notify flag in the DHO - Githubissues

sykaeh commented 8 years ago

I would like to add a "notify" flag to the data harmonization ontology (DHO). This field specifies whether the responsible parties should be notified about this event. Our use case is that an expert then decides based on the attributes of the event (classification.type, classification.identifier, ip, time.source, etc.) whether people should be notified about this event and sets the flag appropriately. This can then be used to filter events when sending notifications/uploading to a database etc. Is anyone else interested in such a flag?
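A minimal sketch of how such a flag could be used for filtering, assuming events are plain dictionaries and the field is literally named `notify` (both are assumptions for illustration; `events_to_notify` is a hypothetical helper, not part of IntelMQ):

```python
def events_to_notify(events):
    """Return only the events whose hypothetical "notify" flag is set to True."""
    # Events without the flag are treated as "do not notify" in this sketch.
    return [event for event in events if event.get("notify") is True]


events = [
    {"classification.type": "malware", "source.ip": "192.0.2.1", "notify": True},
    {"classification.type": "scanner", "source.ip": "192.0.2.2", "notify": False},
    {"classification.type": "phishing", "source.ip": "192.0.2.3"},
]
selected = events_to_notify(events)
```

Treating a missing flag as "do not notify" is a design choice of this sketch; the opposite default would work just as well if experts only ever mark exceptions.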

aaronkaplan commented 8 years ago

On 27 Oct 2016, at 18:08, Sybil Ehrensberger notifications@github.com wrote:

I would like to add a "notify" flag to the data harmonization ontology (DHO). This field specifies whether the responsible parties should be notified about this event.

okay. So the notify flag actually already exists in cert.at's version. We were not sure if it is generally relevant. But it seems to be. So how about putting that "upstream" to the main certtools/intelmq repo?

Our use case is that an expert then decides based on the attributes of the event (classification.type, classification.identifier, ip, time.source, etc.) whether people should be notified about this event and sets the flag appropriately. This can then be used to filter events when sending notifications/uploading to a database etc. Is anyone else interested in such a flag?

Excellent. Yes, we have it already.

Anyone else? If so, I'll merge that upstream.

Best, a.

bernhard-herzog commented 8 years ago

We're going to need something similar, but more refined. A simple boolean flag would not be enough. We have to be able to decide and record in the event who to notify and in what way. Our current plans basically require some JSON structure for this.

aaronkaplan commented 8 years ago

On 27.10.2016, at 20:36, Bernhard Herzog notifications@github.com wrote:

We're going to need something similar, but more refined. A simple boolean flag would not be enough. We have to be able to decide and record in the event who to notify and in what way. Our current plans basically require some JSON structure for this.

So am I hearing that you need some kind of variable which tracks some state of an event (and that a notify boolean would not be enough)?

dmth commented 8 years ago

So am I hearing that you need some kind of variable which tracks some state of an event (and that a notify boolean would not be enough)?

I'll try to elaborate a little bit: Currently we are "mis-using" the extras field in order to store information about contacts and how they should be informed. This information can contain multiple contacts with a different set of rules. We'd appreciate a field which is intended to contain such information. But this field cannot be a boolean, because we support the notification of multiple contacts.

notify is a good name, but maybe a little bit too specific for a use case where this field might contain other processing information, like provenance of data.

Such a processing-field might contain something like this (fictional structure below):

"processing_info": {
  "dns_lookup_expert": {
    "used_nameserver": "ns1.example.com",
    "lookup_timestamp": 4711
  },
  "flow-monitor": {
    "timestamp": "4712"
  },
  "notifications": [{
     "who": "mail@example.com",
     "how": "e-mail",
     "when": "immediately"
    }, {
     "who": "mail@test.com",
     "how": "e-mail",
     "when": "twice-a-day"
    }, {
     "who": "mail@test.com",
     "how": "xmpp"
    }
  ]
}

sebix commented 8 years ago

I'll try to elaborate a little bit: Currently we are "mis-using" the extras field

You can always extend the harmonization if it makes sense for your setup and there's no use case for sharing this with other instances.

Concerning your proposal: I think both parts, the information from lookup bots and the blown-up structure for storing abuse contact data, are opposed to key principles of IntelMQ.

How do you mark events as "don't notify"?

Assigning this to @aaronkaplan as it affects the harmonization

dmth commented 7 years ago

You can always extend the harmonization if it makes sense for your setup and there's no use case for sharing this with other instances.

We doubt that there is no use case for other instances; that's why we are proposing to create a field which is dedicated to this processing data.

Concerning your proposal: I think both parts, the information from lookup bots and the blown-up structure for storing abuse contact data, are opposed to key principles of IntelMQ.

Although destination.abuse_contact and source.abuse_contact are capable of storing multiple contacts, they are not intended to contain information on "how" to inform these contacts, for instance by XMPP message or e-mail. The preferred way might differ for each contact. In order to achieve this, we need to enhance/blow up the structure within an event, because one of our goals is to gather this information within IntelMQ in near real time rather than in post-processing steps, which would increase the complexity of the whole architecture.

How do you mark events as "don't notify"?

Currently this is done in two ways:
- First: the contact might be on a whitelist. In this case the notification interval is set to -1 and a marker indicating the whitelist entry is appended.
- Second: the contact doesn't receive information for other reasons. In this case only the notification interval is set to -1.

BTW: We don't mark the event, we mark one or more of the contacts (plural) which have been associated with an event.

I've learned today that cert.at's notify flag simply marks the event as "do not send it".

I'd like to stick to my proposal from https://github.com/certtools/intelmq/issues/758#issuecomment-257890482 but the field should not contain who is contacted, instead the how is important here. See the following example:

"event": {
  <...a lot of other fields...>
  "source.abuse_contact": "mail@example.com, abuse@test.com",
  "destination.abuse_contact": "example@example.com",
  "extras": {
     "cc_url": "bla.foo.bar"
  },
  "processing": {
     "source.abuse_contact": {        /* <- Processing info is valid for this field */
        "notification_rules": {  /* <- The specific information which was added by a bot */
           "mail@example.com": {
               "type": "email",
               "interval": 50,
               "sector": "chemical"
           },
           "abuse@test.com": {
               "type": "email",
               "interval": 0,
               "sector": "tech"
           }
        }
     },
     "destination.abuse_contact": {        /* <- Processing info is valid for this field */
        "notification_rules": {  /* <- The specific information which was added by a bot */
           "example@example.com": {
              "type": "email",
              "whitelisted": true
           }
        }
     }
  }
}

In this example, three e-mail addresses were determined for an event: two of them for the source, one for the destination. In an additional step, an expert could determine rules for HOW those contacts should be informed. Those rules are the notification interval, the medium or other information. Those rules are appended to the processing field.

IMHO a processing field provides a simple way of extending data within the DHO with additional information.
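A minimal sketch of how a later bot or output might read such a structure, assuming the fictional `processing` / `notification_rules` layout from the example above and the "interval of -1 means do not notify" convention described earlier. `active_rules` is a hypothetical helper name, not an existing IntelMQ function:

```python
def active_rules(event):
    """Yield (field, contact, rule) for every contact that should be notified."""
    for field, info in event.get("processing", {}).items():
        for contact, rule in info.get("notification_rules", {}).items():
            # Assumption: a whitelist marker or an interval of -1 means "do not notify".
            if rule.get("whitelisted") or rule.get("interval") == -1:
                continue
            yield field, contact, rule


event = {
    "source.abuse_contact": "mail@example.com, abuse@test.com",
    "destination.abuse_contact": "example@example.com",
    "processing": {
        "source.abuse_contact": {
            "notification_rules": {
                "mail@example.com": {"type": "email", "interval": 50},
                "abuse@test.com": {"type": "email", "interval": -1},
            }
        },
        "destination.abuse_contact": {
            "notification_rules": {
                "example@example.com": {"type": "email", "whitelisted": True},
            }
        },
    },
}
result = list(active_rules(event))
```

Here only mail@example.com remains, because the other two contacts are suppressed by the interval and the whitelist marker respectively.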

SYNchroACK commented 7 years ago

Concerning your proposal: I think both parts, the information from lookup bots and the blown-up structure for storing abuse contact data, are opposed to key principles of IntelMQ.

Yap.

From my perspective, if there is no bot in the main repo that uses notify or another parameter structure, I don't think it should be added. I see this case as a good example of using the flexibility of harmonization.conf. Each organization creates its own bot to query a contacts database and adds a custom field to the harmonization to suit its needs. For information sharing, I guess those fields will not be shared, since they are more or less internal fields.

Please, let's keep with the discussion #TakeItToML

ghost commented 7 years ago

From my perspective, if there is no bot in the main repo that uses notify or another parameter structure, I don't think it should be added.

No bot uses destination.account, *.local_ip, screenshot_url and event_hash.

SYNchroACK commented 7 years ago

From my perspective, if there is no bot in the main repo that uses notify or another parameter structure, I don't think it should be added.

No bot uses destination.account, *.local_ip, screenshot_url and event_hash.

- destination.account: source.* fields are mirrored as destination.* fields for consistency. Since source.account is being used, there is no justification for not having destination.account.
- *.local_ip: inherited from the AbuseHelper harmonization and then mirrored to source and destination; it was kept for compatible translation purposes. For more details, check here.
- screenshot_url: inherited from the AbuseHelper harmonization; it was kept for compatible translation purposes.
- event_hash: useful for post-IntelMQ analysis of events.

@wagner-certat Do you have any other fields where you need some explanation?

@sykaeh Can you send an email to the mailing list to share your idea and ask for comments from the community? I strongly recommend not adding this field, for two main reasons:

- to keep IntelMQ simple so that it can be used for other purposes
- to not use IntelMQ as a notification processor, because that will cause additional issues when notification requirements change over time. The notification process (including adding the email addresses to which events should be sent) should be done by another tool outside of IntelMQ, call it something like an IntelMQ-mailer. One scenario that this approach solves is:

Your IntelMQ is working. You have a bot which sets the notify flag according to your notification criteria. The events are then stored in your database and are ready to be sent. But suddenly your manager requests that events of type X must not be sent, and you already have the notify flag set. You will need to run a manual UPDATE query on the database. If you have another tool like an IntelMQ-mailer handling this, the tool will give you the option to change the notification rules that are applied at the moment of sending, without the need for manual database queries. I have more use cases where using the IntelMQ pipeline to define that kind of rules can be painful in the medium to long term.

But again, send the idea to the mailing list and if people want it, we will do it.
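The scenario can be sketched roughly as follows, assuming stored events are plain dictionaries and the rule set lives outside the pipeline. `BLOCKED_TYPES` and `should_send` are hypothetical names used only for illustration and do not describe any existing mailer:

```python
# Hypothetical rule set kept outside the pipeline; editing it needs no database UPDATE.
BLOCKED_TYPES = {"scanner"}


def should_send(event, blocked_types=BLOCKED_TYPES):
    """Decide at sending time whether a stored event may still be sent."""
    return event.get("classification.type") not in blocked_types


stored_events = [
    {"id": 1, "classification.type": "scanner"},
    {"id": 2, "classification.type": "malware"},
]
to_send = [e["id"] for e in stored_events if should_send(e)]
# After a policy change only the rule set is edited; the stored events stay untouched.
to_send_later = [
    e["id"] for e in stored_events if should_send(e, {"scanner", "malware"})
]
```

The point of the sketch is that the decision is evaluated when sending, so changing the rules never requires rewriting events that are already stored.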

ghost commented 7 years ago

event_hash: is useful for post-intelmq analysis of events.

notify is also useful for post-intelmq analysis of events.

to keep IntelMQ simple so that it can be used for other purposes

We do not add a field which is actually used and requested by users, but keep screenshot URLs?

to not use IntelMQ as a notification processor

Using the notify flag does not imply that we are notifying anyone. We can still fill this inside intelmq, so it can be used afterwards.

ghost commented 7 years ago

@certtools/intelmq-contributors Do we want to do this before 1.0 or can we postpone this?

dmth commented 7 years ago

@wagner-certat thank you for asking. I think this field is needed. Whilst I prefer a JSON field, or something similar, I would not object to the implementation of a boolean field.

aaronkaplan commented 7 years ago

okay, timeout for now... nobody seems to care about this strongly. So we will move this to 1.1

ghost commented 5 years ago

To move forward on this topic, here are our thoughts:

- For one event there can be multiple non-exclusive recipients
- Every recipient can have different preferences

Possible preferences can be:

How:
- The notification type/protocol e.g. email, AMQP, XMPP, HTTP REST, ...
- The format, e.g. IntelMQ JSON (flat/hierarchical), AbuseHelper JSON, CSV (which columns?, separator), IDEA
Who:
- For Mail: To/Cc/Bcc recipient(s), PGP Keys etc. Only one or are multiple recipients allowed? This is the only type where we can have Ccs
- For other things, e.g Hostname, Username & Password, API-Key/Client certificate, ...
When: How often?
- E.g. one summary mail per day
- E.g. all 7 days: Send only, if not already notified in this time frame

So it could be a list of dictionaries. Here are some possible values for the notifications (the entries show different variants and may contradict each other):

[
  {
    "recipient_to": "abuse@example.com",
    "pgp_fingerprint": "ABC123",
    "interval": 15
  },
  {
    "recipient_to": "abuse@example.com,abuse@example.net",
    "recipient_cc": "abuse@example.org",
    "interval": 0,
    "format": "csv"
  },
  {
    "recipient_cc": "abuse@example.at",
    "interval": 3600,
    "s_mime": "asd"
  },
  {
    "type": "amqp",
    "host": "amqp.example.com",
    "client_certificate": "file content?",
    "amqp_channel": "from_intelmq"
  },
  {
    "type": "http_rest",
    "http_url": "https://example.com/intelmq/push",
    "http_basic_auth_username": "user",
    "http_basic_auth_password": "pass"
  },
  {
    "type": "http_rest",
    "http_uri": "https://usre:pass@example.net/intelmq/push"
  },
  {
    "type": "xmpp",
    "host": "xmpp.example.com",
    "username": "user",
    "password": "pass",
    "xmpp_room": "from_intelmq"
  },
  {
    "type": "xmpp",
    "host": "xmpp.example.com",
    "username": "user",
    "password": "pass",
    "xmpp_receiving_user": "abusehelper",
    "format": "abusehelper"
  }
]
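A minimal sketch of how an output stage might sort such a list by delivery type. It assumes that entries without a "type" key are e-mail notifications, which the example suggests but does not state; `group_by_type` is a hypothetical helper:

```python
from collections import defaultdict


def group_by_type(notifications):
    """Group notification entries by "type"; entries without one count as email."""
    grouped = defaultdict(list)
    for entry in notifications:
        # Assumption: a missing "type" means e-mail, as in the first entries above.
        grouped[entry.get("type", "email")].append(entry)
    return dict(grouped)


notifications = [
    {"recipient_to": "abuse@example.com", "pgp_fingerprint": "ABC123", "interval": 15},
    {"recipient_cc": "abuse@example.at", "interval": 3600},
    {"type": "amqp", "host": "amqp.example.com", "amqp_channel": "from_intelmq"},
]
grouped = group_by_type(notifications)
```

Each group could then be handed to the matching output (mail, AMQP, XMPP, HTTP), so only the outer list-of-dictionaries structure has to be agreed on.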

Open questions I have in mind:

Is this per source and destination as @dmth proposed?
For every notification recipient we can (and want) to save when the recipient has been notified. E.g. a time stamp and/or ticketing system numbers.
Do we want to have this in a more generic processing information field as @dmth proposed?

I don't see a need for having a complete specification for this, there can be flavors and this is totally fine. I think we can or should agree on the outer structures only and be as compatible and similar as reasonable for the rest.

cc @bernhardreiter

navtej commented 5 years ago

To send email notifications you need to configure an SMTP server and other items. The same goes for the rest of the notification facilities. However, we already have all of these as output bots. Wouldn't a UI abstraction around the existing bots, used to create notification groups, be a viable idea?

We can provide a UI facility which, when configured with parameters, creates a couple of output bots along with a sieve filter bot. The sieve filter bot can be pre-configured with the required filtering, abstracted by the UI, to send the appropriate message to the correct bot.

Thoughts?

otmarlendl commented 5 years ago

I may be completely off the point here, but somehow this problem reminds me of the logging configuration of the Bind name server. See e.g. https://kb.isc.org/docs/aa-01526

Basically: there you define channels, which can take events (incl. protocol, data format, restrictions on severity, ...), and then you define which category of event gets distributed to which channel.

Translated to IntelMQ: maybe it is a good idea not to replicate the full routing information in each event, but reference transmission channels you define somewhere else.
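A rough sketch of that idea, assuming channels are defined once in configuration and events are mapped to channel names by their classification type. All names here (`CHANNELS`, `CATEGORY_TO_CHANNELS`, `channels_for`) are hypothetical and only illustrate the indirection:

```python
# Channels are defined once, outside the events (hypothetical configuration).
CHANNELS = {
    "abuse-mail": {"type": "email", "recipient_to": "abuse@example.com", "format": "csv"},
    "partner-xmpp": {"type": "xmpp", "host": "xmpp.example.com", "xmpp_room": "from_intelmq"},
}

# Which category of event is distributed to which channel(s).
CATEGORY_TO_CHANNELS = {
    "malware": ["abuse-mail", "partner-xmpp"],
    "scanner": ["abuse-mail"],
}


def channels_for(event):
    """Resolve the channel definitions for an event via its classification type."""
    names = CATEGORY_TO_CHANNELS.get(event.get("classification.type"), [])
    return [CHANNELS[name] for name in names]


malware_channels = channels_for({"classification.type": "malware"})
```

With this indirection an event would carry at most a channel name, and credentials or formats stay in one place.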

ghost commented 5 years ago

I forgot here to summarize the reasons and use-cases for this kind of notification/routing information, sorry.

In non-trivial setups, we need to determine where the event needs to go. In the simplest model, the destinations are fixed all the time and can thus be set in the configuration. But in other cases, the destinations are dynamic or defined elsewhere; as I said earlier, contact databases are a good example here.

So we are actually talking (more generically) about routing information.

Organizations like ours, but others too, have big contact databases, with not only different contacts but also different settings, delivery protocols and time settings. E.g. you want to notify the relevant contact for infected PCs more often than the server admins about some blacklisting (the first time is usually enough here). Or for some other things you may want immediate notification. Then the delivery differs: some may want JSON, others are fine with CSV. source.abuse_contact is very useful for many cases, but it is only practicable for email, and even there it is not sufficient. Furthermore, abuse_contact does not fit any non-email destination.

My vision is that any expert can set the routing information for an event. The output can then directly set the destination according to the information in the event itself. This allows for much more flexibility. E.g. the AMQP or XMPP outputs can use the host/username/password triples from the event. Otherwise that needs a filter expert / output bot tuple preconfigured in advance. And that is not doable.

Two existing contact database solutions in the ecosystem of intelmq are intelmq-fody+intelmq-mailgen and do-portal.

I hope it is clearer now why there is a need for this kind of information.

bernhardreiter commented 5 years ago

Hi, thanks for picking up the discussion. Overall I believe that the topic of the original request has grown into a major design discussion. I don't think it is feasible to resolve it in this issue. Thus I suggest closing this issue, bringing the topics up elsewhere, and starting from use cases, then going to usage epics and the overall target audience of IntelMQ.

A few remarks:

After the request we made in 2016 (by @dmth and @bernhard-herzog), we solved the problem in https://github.com/Intevation/intelmq-mailgen-release (aka intelmq-cb-mailgen) by creating a process with several steps.
- First we look up all contact information we have and add them to the event (for destination and source).
- Second we use a rule bot to condense which potential contacts will actually be notified and how, which replaces the contact information (at least partly).
- We write it into the postgresql DB because we need a record.
- We use a mailer to accumulate information for potential combined notifications by several rules.
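The accumulation step could look roughly like the following. This is only an illustration of grouping per recipient, not how intelmq-mailgen is implemented; the directive fields (`event_id`, `recipient`, `medium`) and the `accumulate` helper are assumptions made for this sketch:

```python
from collections import defaultdict


def accumulate(directives):
    """Group event ids per (recipient, medium) so one notification covers many events."""
    batches = defaultdict(list)
    for directive in directives:
        key = (directive["recipient"], directive["medium"])
        batches[key].append(directive["event_id"])
    return dict(batches)


directives = [
    {"event_id": 1, "recipient": "abuse@example.com", "medium": "email"},
    {"event_id": 2, "recipient": "abuse@example.com", "medium": "email"},
    {"event_id": 3, "recipient": "abuse@example.net", "medium": "email"},
]
batches = accumulate(directives)
```

In this example the two events for abuse@example.com end up in a single batch, which is the kind of combined notification the mailer step aims for.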

See a rough description (and diagram) in https://github.com/Intevation/intelmq-mailgen/tree/master/docs .

There is also a separation of access: the contactdb content can be filled in by people with fewer privileges, the idea being that a user error here will not stop the complete system. Rules for the rule bots and the mailgen can be provided as stacked Python scripts by machine administrators. The aim was to make it flexible and robust at the same time, and it has been working well for many months now. (Of course there is always room to check whether the way everything is built is a good enough choice.)

If the goal is to have IntelMQ unify more of this functionality by learning from the two existing solutions (intelmq-cb-mailgen and the other), then both should be studied in detail to see what can be learned from them.

One guess is that IntelMQ should come with a default way to process emails, as the accumulation process seems to be a common use case with abuse notification. A potential nice vision for IntelMQ would be that its default setup would notify nicely for a country or company CERT out of the box by email or otherwise.

As you can see, a real-world meeting that discusses the vision of IntelMQ and then goes into the epics and how IntelMQ's technical structure will address this vision is probably the best way to go on this; this issue isn't. ;)

certtools / intelmq

Notify flag in the DHO #758