utoni / nDPId

Tiny nDPI based deep packet inspection daemons / toolkit.
GNU General Public License v3.0

nDPId: Add simil-netflow, UDP-based outgoing stream support #3

Closed verzulli closed 2 years ago

verzulli commented 2 years ago

I'm interested in the possibility of nDPId sending its JSON stream of events directly, via UDP, to a remote host.

I'm thinking of behaviour much like NetFlow/IPFIX, which I'm successfully using in OpenWRT (at home, inside my WDR4300 wireless router): I collect flows with softflowd and relay them to a remote location via UDP. This lets me off-load and enrich the NetFlow analysis with no technical constraints. Indeed, at my remote location I'm enriching the received flows with geo-referential data (provided by the free MaxMind library) and pushing them to an OpenSearch instance.

I'm trying to further enrich my data with high-level protocol information (provided by lib-nDPI), and nDPId fits perfectly in such a role. The only missing bit is the possibility to stream out flows directly.

Of course I could run a local "gateway" (fetching from nDPId and writing to the remote location), but this is not easy, as the whole thing needs to run inside OpenWRT boxes, which are VERY resource-constrained (BTW: libndpi and softflowd are already packaged for OpenWRT) and... I lack C/C++ knowledge :-(

Is this an interesting feature, for the nDPId project?

BTW: thanks for developing nDPId

utoni commented 2 years ago

If I understood you correctly, you want nDPId not to listen on an interface and wait for packets to process; instead, it should be able to process packets received as NetFlow/IPFIX via UDP? That is possible, but I lack knowledge about NetFlow and IPFIX. Is there any widely used NetFlow/IPFIX library available that I can use within nDPId?

But a good idea in general. On my OpenWrt devices, nDPId uses approximately 7 - 20 MB of RAM depending on the number of active flows. I am working on a solution to reduce memory usage even more. By the way, there is already a Makefile for OpenWrt: https://github.com/utoni/openwrt-packages/tree/package/nDPId-master

verzulli commented 2 years ago

Instead of connecting to an existing UNIX domain socket, I need nDPId to connect to a remote (configurable) UDP socket and simply send there EXACTLY the same JSON serialization stream that it currently sends to the UNIX domain socket. In short: I need to retrieve nDPId flows via a remote NodeJS backend (listening on a UDP socket) and NOT from a "local" application running side by side with nDPId.

This is UNRELATED to NetFlow/IPFIX: sorry for the misunderstanding! I mentioned NetFlow/IPFIX just to say that, in NetFlow scenarios, the agent runs locally on the system and sends the flows (NetFlows...) via a UDP port. But, again, NetFlow is unrelated to my request (BTW: I'm already fetching/enriching a real NetFlow stream with softflowd, and I'll keep getting it regardless of nDPId).

Again: I'd like nDPId to still send the current, existing JSON serialization stream, without any further modification.

Thanks for replying!

verzulli commented 2 years ago

> But a good idea in general. On my OpenWrt devices, nDPId uses approximately 7 - 20 MB of RAM depending on the number of active flows. I am working on a solution to reduce memory usage even more. By the way, there is already a Makefile for OpenWrt: https://github.com/utoni/openwrt-packages/tree/package/nDPId-master

Wow! I'm going to try to build the package really soon! Thanks for mentioning this.

BTW: my journey with OpenWRT packaging is... a bit long: maybe you will find what I wrote about this topic, a long time ago, useful. Feel free to link it, if you want (or cut/paste, or whatever) :-)

utoni commented 2 years ago

> Instead of connecting to an existing UNIX domain socket, I need nDPId to connect to a remote (configurable) UDP socket and simply send there EXACTLY the same JSON serialization stream that it currently sends to the UNIX domain socket. In short: I need to retrieve nDPId flows via a remote NodeJS backend (listening on a UDP socket) and NOT from a "local" application running side by side with nDPId.

Implementing UDP/TCP support for the collector sink shouldn't take much effort. I will work on that.

utoni commented 2 years ago

With PR #4 it is possible to use custom UDP endpoints for nDPId. UDP endpoints can be specified with e.g. nDPId -c 127.0.0.1:7777.

Any feedback appreciated.

verzulli commented 2 years ago

I've just been able to package the nDPId-UDP-endpoint branch of nDPId for my OpenWRT box and try it.

I properly received the UDP messages, even though I saw lots of errors in the console.

I started the binary with: /usr/sbin/nDPId-master -i br-lan -l -c 192.168.0.128:9999 and received the messages on my Linux box (192.168.0.128) with a plain nc -u -l -p 9999.

I'm attaching:

Please let me know whether this is enough information for your "testing requirements" or if you need anything further.

If you confirm that the error messages in the console are not important, I'll start working on the "real" receiver side, to process the UDP stream "in production".

Thanks a lot for your support!

utoni commented 2 years ago

Strange. Can you try either /usr/sbin/nDPId-master -i br-lan -l -c 192.168.0.128:9999 -o max-reader-threads=1 or nc -kluvw 1 192.168.0.128 9999 and tell me if that works? Not sure if this is related to nDPId or netcat, but the error message is definitely not helpful. So I should fix at least that.

// EDIT: It is a netcat issue; -k fixes it. nDPId uses one socket per thread, so from netcat's point of view each thread is basically a separate host.

// EDIT: I've updated and improved the error messages shown if the connection/datagram is refused by the endpoint.
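
The one-socket-per-thread behaviour mentioned above can be illustrated with a small receiver. What follows is only a minimal Python sketch, not part of nDPId: it binds an unconnected UDP socket, which accepts datagrams from any source port and therefore from several sender sockets at once. The function names and the JSON payloads are hypothetical placeholders, and the two local sender sockets merely stand in for two nDPId reader threads.

```python
import socket


def receive_datagrams(sock, count, bufsize=65535):
    """Collect `count` datagrams from an unconnected UDP socket.

    The socket is never connect()ed to a peer, so datagrams from any
    source address and port are accepted.
    """
    received = []
    for _ in range(count):
        payload, sender = sock.recvfrom(bufsize)
        received.append((sender, payload))
    return received


def demo_two_senders():
    """Send from two separate sockets to one listener on loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.settimeout(2.0)         # do not block forever if a datagram is lost
    target = listener.getsockname()

    # Two sender sockets stand in for two reader threads (assumption for the demo).
    senders = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
    try:
        for index, sender in enumerate(senders):
            # Placeholder payload, not real nDPId output.
            sender.sendto(b'{"thread": %d}' % index, target)
        return receive_datagrams(listener, len(senders))
    finally:
        for sender in senders:
            sender.close()
        listener.close()
```

To receive the real stream, one would presumably bind to the address passed to nDPId -c (192.168.0.128:9999 in this thread) and keep calling recvfrom in a loop.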

verzulli commented 2 years ago

I'm successfully running:

/usr/sbin/nDPId-master -i br-lan -c 192.168.0.128:9999

on my OpenWRT box, since two days ago... with no problem at all.

Furthermore, I've just launched a NodeJS-based UDP server on 192.168.0.128 and... I'm able to properly receive the feed.

Some rough counters follow ("evt"s are mine; "flows" and "packets" are the ones coming from nDPId):

{
  "evt": {
    "flows": 18724,
    "pkt": 43306,
    "others": 6661
  },
  "flows": {
    "info/new": 4877,
    "info/detected": 4812,
    "info/detection-update": 3626,
    "finished/end": 1625,
    "finished/idle": 3048,
    "finished/update": 508,
    "info/not-detected": 41,
    "info/guessed": 70,
    "info/end": 65,
    "info/idle": 35,
    "info/update": 17
  },
  "packets": {
    "packet-flow": 36706,
    "packet": 6600
  }
}

So I'd say.... it's definitely working!

(...and now I can successfully start investigating what those events represent... [other than the info/detected ones] :-))
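
Counters of the kind shown above could be produced on the receiving side roughly as follows. This is a hedged Python sketch rather than the receiver actually used in this thread (which is NodeJS-based). It assumes that each datagram carries one JSON object, possibly preceded by a decimal length prefix, and that events are identified by fields named flow_event_name and packet_event_name. Both the framing and the field names are assumptions to verify against the nDPId README; the sample datagrams are invented.

```python
import json
from collections import Counter


def decode_event(datagram):
    """Decode one datagram into a dict.

    Assumption: the JSON object may be preceded by a numeric length
    prefix, so everything before the first '{' is skipped.
    """
    text = datagram.decode("utf-8")
    return json.loads(text[text.index("{"):])


def tally_events(datagrams):
    """Count events per category, mirroring the evt/flows/packets layout."""
    totals = {"evt": Counter(), "flows": Counter(), "packets": Counter()}
    for datagram in datagrams:
        event = decode_event(datagram)
        if "flow_event_name" in event:        # assumed field name
            totals["evt"]["flows"] += 1
            totals["flows"][event["flow_event_name"]] += 1
        elif "packet_event_name" in event:    # assumed field name
            totals["evt"]["pkt"] += 1
            totals["packets"][event["packet_event_name"]] += 1
        else:
            totals["evt"]["others"] += 1
    return totals
```

Feeding it the payloads returned by a UDP receive loop would yield nested counters similar in shape to the JSON above.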

utoni commented 2 years ago

> I'm successfully running:
>
> /usr/sbin/nDPId-master -i br-lan -c 192.168.0.128:9999
>
> on my OpenWRT box, since two days ago... with no problem at all.
>
> Furthermore, I've just launched a NodeJS-based UDP server on 192.168.0.128 and... I'm able to properly receive the feed.
>
> Some rough counters follow ("evt"s are mine; "flows" and "packets" are the ones coming from nDPId):
>
> {
>   "evt": {
>     "flows": 18724,
>     "pkt": 43306,
>     "others": 6661
>   },
>   "flows": {
>     "info/new": 4877,
>     "info/detected": 4812,
>     "info/detection-update": 3626,
>     "finished/end": 1625,
>     "finished/idle": 3048,
>     "finished/update": 508,
>     "info/not-detected": 41,
>     "info/guessed": 70,
>     "info/end": 65,
>     "info/idle": 35,
>     "info/update": 17
>   },
>   "packets": {
>     "packet-flow": 36706,
>     "packet": 6600
>   }
> }
>
> So I'd say.... it's definitely working!

Awesome!

> (...and now I can successfully start investigating what those events represent... [other than the info/detected ones] :-))

I've added another PR (#5) which adds some more documentation in the README.md about events and flow states. If you have any questions or suggestions, please do not hesitate to contact me.

P.S.: I am interested in your work. If it is going to be FOSS, it might be a good idea to provide a link in the README or even add it to the examples.

verzulli commented 2 years ago

> [...] P.S.: I am interested in your work. If it is going to be FOSS, it might be a good idea to provide a link in the README or even add it to the examples. [...]

It's DEFINITELY F/OSS (GPL-v2 license, at the moment). I'm developing it within our own community environment (we're running a self-hosted, internal GitLab instance; BTW: I'm speaking about the GARRLab infrastructure). Not because we're jealous of what we do... but simply because we're not 150% sure that we won't make mistakes and start sending out some security-related info (access tokens, passwords, and so on).

Anyway, as you requested it... here it is: I've added an additional "remote" to my local git repo, and the current nDPId-rt-analyzer has just been pushed to a public GitLab. Feel free to do with it whatever you want :-)

BTW: I'll (re)start taking a look at your documentation soon. Thanks.

verzulli commented 2 years ago

> [...] I've added another PR (#5) which adds some more documentation in the README.md about events and flow states. If you have any questions or suggestions, please do not hesitate to contact me. [...]

Your documentation about EVENTS (especially the PACKET-related and FLOW-related ones) and STATES (flow states) is very clear. I'll carefully re-read it, to understand how to integrate them in the rt-analyzer.

Based on this documentation, unfortunately, I saw that a LOT of the UDP stream consists of PACKET events: if I understand correctly, a packet event is streamed out for every single packet processed by nDPId/libNDPI...

From my point of view, this is a problem, because: 1) I'm NOT really interested in "packet events", as they give me close to no information; 2) in a medium/large environment (> 100 hosts), such streaming consumes LOTS of bandwidth.

I can surely filter events on the receiving side... but the bandwidth problem will still be present.

Of course, your initial approach (reading LOCALLY via the UNIX socket and filtering LOCALLY with nDPIsrvd) would easily solve this problem but... it's difficult for me to also run nDPIsrvd locally on OpenWRT.

An easy "fix" would be an option, passed as an argument to nDPId, enabling/disabling PACKET-flows. With such an option, I'd simply turn off packet-flow generation and... everything would be solved :-)

...but I cannot help you on this side, as I'm not exactly a C/C++ developer :-(
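
To put a rough number on this, the share of packet events can be computed from the "evt" counters reported earlier in this thread. The Python sketch below uses exactly those three figures. It measures the share by event count, not by bytes on the wire, so it is only an approximation of the bandwidth impact. The receive-side filter at the end relies on an assumed field name (packet_event_name) and, as noted above, would not reduce the traffic sent by nDPId.

```python
# "evt" counters copied from the earlier comment in this thread.
evt_counters = {"flows": 18724, "pkt": 43306, "others": 6661}

total_events = sum(evt_counters.values())
packet_share = evt_counters["pkt"] / total_events

# Roughly 63% of all received events are packet events (by count).
print(f"{evt_counters['pkt']} of {total_events} events "
      f"({packet_share:.1%}) are packet events")


def drop_packet_events(events):
    """Receive-side filter: keep only events that are not packet events.

    Assumption: packet events carry a 'packet_event_name' field.
    """
    return [event for event in events if "packet_event_name" not in event]
```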

utoni commented 2 years ago

OK, I need to extend the README and provide some information about "fine tuning", which can be done via nDPId -o .... What you basically want is nDPId -o max-packets-per-flow-to-send=0. Using this option should disable all packet events. I should make this clear in the documentation.

utoni commented 2 years ago

Is it OK with you if I add ndpid-rt-analyzer as a git submodule in ./examples?

verzulli commented 2 years ago

> Is it OK with you if I add ndpid-rt-analyzer as a git submodule in ./examples?

No problem at all! I appreciate it!

BTW: I took the chance to improve the documentation as well... (and pushed some more commits, as I added some new features)

verzulli commented 2 years ago

BTW: an IMPORTANT detail. Please note that I changed the license from GPL-v2 to A-GPL, as I really want ndpid-rt-analyzer to stay "public", even if it is adopted within some "server-side" web service. With A-GPL, such behaviour (using it server-side without releasing the code to the public) is forbidden (i.e. using it "on the server" is subject to the general GPL requirements).

Let me know if this is OK for you.

utoni commented 2 years ago

Sure. I fully agree with this. =)