robcowart / elastiflow

Network flow analytics (Netflow, sFlow and IPFIX) with the Elastic Stack

DNS resolving not working #683

Closed Jesper80 closed 3 years ago

Jesper80 commented 3 years ago

I have installed the latest version of elastiflow on Ubuntu 20.04 LTS and I cannot get reverse DNS resolution to work.

The system is running as a VM on Unraid and has 32G RAM, with 4G of Java heap for logstash and 12G for elasticsearch. Kibana is also installed, following robcowart's YouTube video. AppArmor and ufw are disabled, and the Elastic Stack is the only thing running on the system.

Everything is more or less set to the defaults, and elastiflow is working fine except for the DNS resolution and the appID for pfsense with softflowd (if anyone has advice on how to get this working, please don't hesitate to let me know).

On my host system I can run a "host " and get a reverse resolve from my pfsense DNS server, as well as resolve an external one, but elastiflow just shows the IP addresses.

In my /etc/systemd/system/logstash.service.d/elastiflow.conf I have:

Environment="ELASTIFLOW_RESOLVE_IP2HOST=true"
Environment="ELASTIFLOW_NAMESERVER=127.0.0.1"

It doesn't matter if I change the nameserver to the local pfsense DNS (both answer correctly with nslookup from the command line of the host system).

A clue that I might have found is that I don't seem to have any of the environment variables set in the system.

Running printenv ELASTIFLOW_RESOLVE_IP2HOST, printenv ELASTIFLOW_NAMESERVER, or printenv for any other variable in elastiflow.conf just returns blank output.
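A note on the empty printenv output: variables declared with Environment= in a systemd drop-in are handed to the service's own process, not to login shells, so an empty result in an interactive shell does not by itself show that Logstash is missing them. A minimal sketch for checking what the running process actually received, assuming Linux with /proc mounted and enough privileges (usually root) to read another user's process. The helper names parse_environ and service_env are hypothetical, not part of ElastiFlow or Logstash.

```python
from pathlib import Path


def parse_environ(raw: bytes) -> dict:
    """Parse NUL-separated KEY=VALUE pairs, the layout of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def service_env(pid: int) -> dict:
    """Read the start-up environment of a running process (Linux only)."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


def report(pid: int) -> None:
    """Print the two ElastiFlow DNS settings as seen by the given process."""
    env = service_env(pid)
    for name in ("ELASTIFLOW_RESOLVE_IP2HOST", "ELASTIFLOW_NAMESERVER"):
        print(name, "=", env.get(name, "<not set>"))
```

Calling report() with the Main PID shown by systemctl status logstash would print the values the service was started with.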

/var/log/logstash/logstash-plain.log doesn't show any errors and says that it is started.

systemctl status looks like this:

jesper@elk:~$ systemctl status logstash
● logstash.service - logstash
     Loaded: loaded (/etc/systemd/system/logstash.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/logstash.service.d
             └─elastiflow.conf
     Active: active (running) since Mon 2021-01-25 00:53:36 UTC; 40min ago
   Main PID: 8095 (java)
      Tasks: 169 (limit: 38429)
     Memory: 4.3G
     CGroup: /system.slice/logstash.service
             └─8095 /usr/share/logstash/jdk/bin/java -Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOn>

Jan 25 00:55:01 elk logstash[8095]: [2021-01-25T00:55:01,461][WARN ][logstash.codecs.netflow ][elastiflow][e05f3254de6bd67a513a705bf17a338b50ed37230f6a5299a9c25>
<several similar warnings that go away after a minute, as described in the known issue>

Jesper80 commented 3 years ago

Sorry, the last message got cropped, here is the full version from logstash-plain.log

[2021-01-25T00:55:01,461][WARN ][logstash.codecs.netflow ][elastiflow][e05f3254de6bd67a513a705bf17a338b50ed37230f6a5299a9c256d6402a4371] Can't (yet) decode flowset id 1024 from source id 0, because no template to decode it with has been received. This message will usually go away after 1 minute.

But I guess this has nothing to do with my issue, and is reported as a known issue.

robcowart commented 3 years ago

You specified the nameserver as 127.0.0.1. Does the VM actually have a name server listening on the localhost IP address?
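One way to answer that question from the VM itself is to send a reverse lookup straight to the configured address and see whether anything replies. A minimal sketch using only the Python standard library, assuming an IPv4 nameserver on UDP port 53. The function names build_ptr_query and nameserver_answers are made up for illustration, and a True result only means a DNS server replied, not that it returned a usable name.

```python
import ipaddress
import socket
import struct


def build_ptr_query(ip: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS PTR query packet for the given IP address."""
    name = ipaddress.ip_address(ip).reverse_pointer  # e.g. 1.2.0.192.in-addr.arpa
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\0"
    return header + qname + struct.pack("!HH", 12, 1)  # QTYPE=PTR, QCLASS=IN


def nameserver_answers(server: str, ip: str, timeout: float = 2.0) -> bool:
    """Return True if `server` sends back a DNS reply to a PTR query for `ip`."""
    query = build_ptr_query(ip)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((server, 53))
            sock.send(query)
            reply = sock.recv(512)
        except OSError:
            # Timeout, or a refused connection when nothing listens on port 53.
            return False
    # A valid reply has at least a 12-byte header and echoes the transaction id.
    return len(reply) >= 12 and reply[:2] == query[:2]
```

For example, nameserver_answers("127.0.0.1", "192.0.2.1") returning False would suggest that no resolver is listening on localhost.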

Jesper80 commented 3 years ago

Thanks for your reply robcowart. I must have seen ghosts when I did my testing before. localhost did not have a functional DNS resolver. Changing the field to the pfsense DNS solved the problem, and now it does lookups on both internal and external sources.

Do you have any idea how to get AppID for pfsense with softflowd to work? I have been unable to find a list of AppIDs through Google searches. Could you maybe advise me on how to grab the required data from the firewall?

robcowart commented 3 years ago

I am not aware that pfsense uses any kind of application detection technology.

Jesper80 commented 3 years ago

Okay, thanks for your help and for a great product.

Jesper80 commented 3 years ago

Btw, would the Unifi USG support application ID?

I was thinking I could maybe get one of those, and connect to a port mirror port on the switch, or on a separate mirrored interface in pfsense, just to provide the fancy unifi dashboards, and use it to provide netflow to elastiflow as well?

Or is there any other opensource application that I could run on a pi or something, with appID support, that could be used for the same purpose?

jedd commented 3 years ago

Almost definitely not in the sense that you likely want.

You can look at a couple of definitions on the USG (though these are EdgeRouter commands, they appear to work on Unifi gear, at least for the definitions - I haven't used DPI firewall rules in anger on USG's).

eg.

sudo /usr/sbin/ubnt-dpi-util show-cat-apps Social-Network

or

sudo /usr/sbin/ubnt-dpi-util show-cat-apps Streaming-Media

A year ago - and seemingly still today - you can't modify the built-in, or add new, groups.

If you want NBAR-alike you'll have to go for one of the bigger brands AFAIK. You could fancy up some ingest rules to tag traffic, based on destination port and addresses, but that breaks if you've got multiple applications served by a common LB. Then you're back to DPI (with accompanying HTTPS challenges).

What's the actual problem you're trying to solve, btw?

Jesper80 commented 3 years ago

Thanks for your reply.

I am very happy with my pfSense firewall for the actual routing and rule filtering, and I intend to keep it that way.

What I'm trying to achieve is just to get application ID reporting added to this setup for elastiflow. The softflowd package for pfSense doesn't seem to support it. Since I use unifi for my switches and APs, I thought that if a USG would support this, I could combine this appID functionality with getting some nice but rather useless statistics in my unifi dashboard.

I have seen some people putting a USG in bridge mode in series with their pfSense, just for the dashboard DPI functionality. This is unfortunately not an option for me, since the pfSense has some 10G sfp+ fiber interfaces, and the USG would slow down the whole network.

So I thought of putting something in parallel with the firewall, maybe on a switch port mirror port, or on a mirrored pfSense interface, just to do the netflow and appID reporting to elastiflow. If the USG is not suitable for this, the unifi dashboard is not that important. And if you have any recommendation for a device that would just be able to do the netflow and application ID reporting, with disabled firewall functionality, I would appreciate it.

jedd commented 3 years ago

If it's reporting only, then I'd try massaging the netflow data at ingest time: protocol / dest.addr / dest.port will get you a long way towards identifying applications, though managing those long term will be a bit of a pain. If it's for enterprise use, then it's a bit more palatable. This is on my list to look at, but I haven't got there yet (the earlier reference to LB's is a major problem in our environment as almost all our traffic traverses HTTPS-terminating F5's).
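The idea of tagging on protocol / dest.addr / dest.port can be sketched as a plain lookup. This is a rough illustration in Python rather than an actual logstash filter. The field names (protocol, dest_addr, dest_port, app), both tables, and the intranet-wiki entry are hypothetical and are not ElastiFlow's schema.

```python
# Hypothetical table: (IP protocol number, destination port) -> application label.
APP_BY_PROTO_PORT = {
    (6, 22): "ssh",
    (6, 443): "https",
    (17, 53): "dns",
    (17, 123): "ntp",
}

# Hypothetical overrides for known servers: (dest address, protocol, port) -> label.
APP_BY_DEST = {
    ("192.0.2.10", 6, 443): "intranet-wiki",
}


def tag_flow(flow: dict) -> dict:
    """Return a copy of the flow record with an added 'app' label."""
    key = (flow["protocol"], flow["dest_port"])
    app = APP_BY_DEST.get((flow["dest_addr"], *key))
    if app is None:
        # Fall back to the generic protocol/port table.
        app = APP_BY_PROTO_PORT.get(key, "unknown")
    return {**flow, "app": app}
```

The load balancer caveat shows up directly here: every application behind one address and port collapses into a single key, so they all receive the same label.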

Does pfSense let you do application ID? Googling those terms comes up with a few references, but I've not used pfSense so have no idea on scaling / features / configuration / challenges etc.

In practice, USG's max out around 50-75Mbps once you enable a few basic features. This is fine for those of us stuck with low-grade residential network capacity (say, anyone in Australia), unfortunately. From here, having 10Gbps transit capability sounds like a nice problem to have.

If you're enterprise, then look at things like Riverbed Netshark / AppResponse, or Gigamon appliances - though a) these are expensive, especially once you go above 1Gbps, and b) if you're bursting 10Gbps, and your span is 10Gbps, you'll drop packets of course. The second is a problem you'll have with putting spans on your Unifi switches too, I suspect (8Gbps backplane, IIRC).

OTOH if you're using Elastiflow it's probably because you don't want to spend ridiculous $'s for a lesser-quality, proprietary packet monitoring / intercept tool.

Jesper80 commented 3 years ago

It's for a home lab environment, where I only have 10G fiber on the LAN side, so I cannot put any non-SFP+ netflow detection equipment in series with my pfSense. So I thought of putting something in parallel, since according to robcowart earlier in this post, pfSense does not support application ID detection.

Do you have any links to what you found in your Google searches? The only pfSense-related results I can find are about the Snort package.

The site spreads over two locations with a VPN connection, with pfsense on both sides. The WAN connections are 1Gbit / 600 Mbit.

Maybe I could even run a Docker container or something similar to do the netflow detection instead of letting pfsense do it. There are a few on Docker Hub, but I haven't been able to find one that claims to do application ID.

Yeah, you kind of have a point that I don't feel like spending ridiculous amounts of money on proprietary enterprise gear ;)

jedd commented 3 years ago

I didn't dive into any links, just browsed quickly after searching for 'pfsense dpi application id'. I'm sure I'm finding the same stuff you are -- eg. the promise of "Layer 7 application detection" on https://www.netgate.com/solutions/pfsense/features.html , the integration of Netify.ai with pfSense ( https://www.netify.ai/get-netify/pfsense ) that advertises its application detection capabilities - notably these https://www.netify.ai/resources/applications - but no idea on the underlying mechanisms, and whether it can be customised.

If you want to report on it, rather than act on it, then I'd say start with some logstash filters to tag data on ingest, and see what your performance is like. I suspect Rob's got some similar capability with the new custom (non-logstash) engine to do enrichment, but haven't spent much time with the unified flow collector yet.

rsf123 commented 3 years ago

I'm seeing a similar issue... I'm running Elastiflow in a containerized infrastructure. After a restart, my elastiflow container is no longer resolving DNS even though the environment variables were set in the docker-compose file and the container does have them set in the environment. The container CAN reach the specified DNS server and is resolving correctly against it.

Jesper80 commented 3 years ago

Hi jedd, and thanks for your reply.

> I didn't dive into any links, just browsed quickly after searching for 'pfsense dpi application id'. I'm sure I'm finding the same stuff you are -- eg. the promise of "Layer 7 application detection" on https://www.netgate.com/solutions/pfsense/features.html ,

While Layer 7 application detection seems to be a feature, and is indeed available for the Snort package, the Snort package doesn't, as far as I know, have any capability to report netflow to an external server such as logstash.

> the integration of Netify.ai with pfSense ( https://www.netify.ai/get-netify/pfsense ) that advertises its application detection capabilities - notably these https://www.netify.ai/resources/applications - but no idea on the underlying mechanisms, and whether it can be customised.

I had a look at this Netify thing. It seems to be a cloud-based all-in-one monitoring solution, and the basic package costs 25 USD per host. Running two hosts, 50 USD per month quickly adds up to what it would cost me to get some kind of external device to do the monitoring. The problem is just that I cannot find such a device that would be suitable for me.

I tried to set up the unified flow collector, but couldn't get it to work. It just wouldn't open any ports on either the Unraid or the Ubuntu Docker installation. But maybe this is the route to go. Do I understand correctly that it would do the netflow collection and then send the data to the Ubuntu VM with the ELK stack? And would it, in that case, support application detection? Or is this an all-in-one Docker container that replaces the whole ELK VM?

jedd commented 3 years ago

Okay, unfortunate, but good to know those are dead-end options. I'd anticipated having to dive into some custom pipeline stuff at some point -- we're kind of committed to the logstash approach for the time being, so I have only played briefly with the beta, and have no idea on how to enrich data there.

robcowart commented 3 years ago

It looks like the original issue was solved here. For questions about the new Unified Flow Collector, please join the community Slack. A link can be found here... https://www.elastiflow.com/get-started