SIDN / spin

SPIN Core Software
https://spin.sidnlabs.nl
GNU General Public License v2.0

Monitoring multiple VLANs at the same time with spin-pcap-reader #88

Closed frankvandenhurk closed 2 years ago

frankvandenhurk commented 2 years ago

It looks like I can only have 3 spin-pcap-reader instances active at the same time. I think it is best practice to give every IoT vendor its own VLAN so I can regulate traffic in my firewall more easily, so I have 19 IoT VLANs that I would like to monitor.

Would it be possible to have a single spin-pcap-reader listen on multiple interfaces? Or let it strip VLAN tags correctly so I can have it listen on a trunk interface? Or make it possible to run more instances simultaneously?

(NB: when I feed spin-pcap-reader an interface with tagged traffic, I get: "spin-pcap-reader: caplen 1514 != len 1518,")

cschutijser commented 2 years ago

Hi,

> It looks like I can only have 3 spin-pcap-reader instances active at the same time. I think it is best practice to give every IoT vendor its own VLAN so I can regulate traffic in my firewall more easily, so I have 19 IoT VLANs that I would like to monitor.
>
> Would it be possible to have a single spin-pcap-reader listen on multiple interfaces?

It's not very straightforward to modify spin-pcap-reader to listen to multiple interfaces. I don't think that's the way to go.

> Or let it strip VLAN tags correctly so I can have it listen on a trunk interface?

That should be possible and is the best way to solve it, I think. I'll have a look at that (or you can have a look, if you want; the switch statement in callback() in pcap.c needs to check for ETHERTYPE_VLAN and perhaps ETHERTYPE_QINQ as well and handle it appropriately). That may take some time since I'd need to test my changes properly.
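To illustrate what "check for ETHERTYPE_VLAN and perhaps ETHERTYPE_QINQ and handle it appropriately" could look like, here is a minimal sketch. It is not the code from pcap.c: the function name `skip_vlan_tags` and the `SKETCH_` constants are made up for this example, and it assumes a plain Ethernet II frame in which each VLAN tag is 4 bytes long (2 bytes tag protocol identifier, 2 bytes tag control information).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EtherType values from IEEE 802.1Q / 802.1ad. The SKETCH_ prefix is
 * hypothetical and only avoids clashing with system header macros. */
#define SKETCH_ETHERTYPE_VLAN 0x8100
#define SKETCH_ETHERTYPE_QINQ 0x88A8

#define SKETCH_ETHER_ADDRS_LEN 12 /* destination MAC + source MAC */
#define SKETCH_VLAN_TAG_LEN     4 /* 2 bytes TPID + 2 bytes TCI */

/* Read a 16-bit big-endian (network byte order) value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Skip any number of stacked VLAN tags in an Ethernet frame.
 * On success, returns 0 and stores the EtherType of the payload in
 * *ethertype and the byte offset of the payload in *payload_off.
 * Returns -1 if the frame is too short to contain an EtherType.
 */
int skip_vlan_tags(const uint8_t *frame, size_t len,
                   uint16_t *ethertype, size_t *payload_off)
{
    size_t off = SKETCH_ETHER_ADDRS_LEN;

    assert(frame != NULL && ethertype != NULL && payload_off != NULL);

    for (;;) {
        if (len < off + 2)
            return -1;

        uint16_t type = read_be16(frame + off);
        if (type != SKETCH_ETHERTYPE_VLAN && type != SKETCH_ETHERTYPE_QINQ) {
            *ethertype = type;
            *payload_off = off + 2;
            return 0;
        }

        /* A tag sits where the EtherType would be; step over it. */
        off += SKETCH_VLAN_TAG_LEN;
    }
}
```

For an untagged frame this reports a payload offset of 14. For a frame with one VLAN tag it reports 18, after which the existing IPv4/IPv6/ARP handling could continue as before.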

> Or make it possible to run more instances simultaneously?

The fact that you can only run 3 instances is (I think) due to the limit imposed by the MAXMNR define in mainloop.c. Can you increase that number (by 15 or 20), recompile spind and see whether that changes things? If so, then perhaps we should bump that number. Or would it be sufficient for you if VLAN tag stripping is supported? I'd need to think a little bit longer to see if increasing that number also has downsides.

> (NB: when I feed spin-pcap-reader an interface with tagged traffic, I get: "spin-pcap-reader: caplen 1514 != len 1518,")

I should make spin-pcap-reader a little less noisy. But for now you can avoid that by changing 1514 to 1518 in pcap.c.

frankvandenhurk commented 2 years ago

I'll test with a higher MAXMNR. I also think interpreting the VLAN info is the most robust way to use this. Unfortunately I'm not a programmer, so I won't be able to implement this myself.

frankvandenhurk commented 2 years ago

I now have:

src/spind/mainloop.c:#define MAXMNR 50

and the spin-pcap-reader instances have been running quite stably for the last hour. Only a few of them died, but I have some scripting that automatically restarts them.

(There were more hiccups earlier this evening; I can't explain why.)

I did observe spind hanging once. From the spind logs:

wf_extsrc: recv: Success
closing core2extsrc fd
Error on fd 12 from external-source-tcp
wf_extsrc: recv: Bad file descriptor
closing core2extsrc fd

frankvandenhurk commented 2 years ago

I now see these messages repeating at high frequency:

closing core2extsrc fd
wf_extsrc: recv: Bad file descriptor

cschutijser commented 2 years ago

I implemented support for VLANs in commit f5945380f053f9644e89cbbb882146e16e1cb10a. Can you check that out and see how it goes? That should at least remove the need to start a separate spin-pcap-reader instance for every VLAN.

Regarding your comment about spin-pcap-reader printing "spin-pcap-reader: caplen 1514 != len 1518,": for now, you'll need to start spin-pcap-reader with the new -s flag, e.g. with -s 1518. I've documented that in doc/user/pcap_reader.md. I'm still considering whether to increase the limit from 1514 to 1518 by default.
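The numbers 1514 and 1518 follow from simple frame arithmetic, sketched below. This assumes a standard 1500-byte MTU and a capture that does not include the frame check sequence; the names are made up for this example and do not come from pcap.c.

```c
#include <stddef.h>

/* Sizes in bytes; names are hypothetical and local to this sketch. */
enum {
    SKETCH_ETH_HEADER_LEN  = 14,   /* 6 dst MAC + 6 src MAC + 2 EtherType */
    SKETCH_ETH_MAX_PAYLOAD = 1500, /* assumed standard MTU */
    SKETCH_TAG_LEN         = 4     /* one 802.1Q VLAN tag */
};

/* Largest frame to expect when every frame carries `vlan_tags` tags. */
size_t max_frame_len(unsigned vlan_tags)
{
    return (size_t)SKETCH_ETH_HEADER_LEN + SKETCH_ETH_MAX_PAYLOAD
        + (size_t)vlan_tags * SKETCH_TAG_LEN;
}
```

Here `max_frame_len(0)` is 1514 and `max_frame_len(1)` is 1518, which matches the "caplen 1514 != len 1518" message: a full-size tagged frame is 4 bytes longer than the snapshot length. Double-tagged (QinQ) traffic would need 1522 under the same assumptions.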

I still need to think a bit more about the other problems that you posted about in this thread (the "Bad file descriptor" messages), but right now I think it's quite possible that the situation in that area has been improved with commit 8534d41bc7e1d4122abe307c7549502b087d090b for issue #89. If there are still problems in that area, can you report them in a new issue?

Regarding this comment from you earlier:

> Only a few of the spin-pcap-reader instances died, but I have some scripting that automatically restarts them.
>
> (There were more hiccups earlier this evening; I can't explain why.)

I would be interested in the (last part of the) log from those instances because ideally spin-pcap-reader doesn't just die. On the other hand, the way I programmed it, it quickly exits if it encounters any problems. Perhaps I should revise some of that. Anyway, if you have any interesting log snippets, can you create a new issue with the log snippets and a short description of the circumstances?

frankvandenhurk commented 2 years ago

Unfortunately:

make[3]: Entering directory '/home/pi/spin/src/build/tools/spin-pcap-reader'
CC spin_pcap_reader-arpupdate.o
CC spin_pcap_reader-ipt.o
CC spin_pcap_reader-macstr.o
CC spin_pcap_reader-pcap.o
../../../tools/spin-pcap-reader/pcap.c: In function ‘callback’:
../../../tools/spin-pcap-reader/pcap.c:492:7: error: ‘ETHERTYPE_QINQ’ undeclared (first use in this function); did you mean ‘ETHERTYPE_IP’?
  492 | case ETHERTYPE_QINQ:
      | ^~~~~~
      | ETHERTYPE_IP
../../../tools/spin-pcap-reader/pcap.c:492:7: note: each undeclared identifier is reported only once for each function it appears in
make[3]: [Makefile:447: spin_pcap_reader-pcap.o] Error 1
make[3]: Leaving directory '/home/pi/spin/src/build/tools/spin-pcap-reader'
make[2]: [Makefile:318: all-recursive] Error 1
make[2]: Leaving directory '/home/pi/spin/src/build/tools'
make[1]: [Makefile:434: all-recursive] Error 1
make[1]: Leaving directory '/home/pi/spin/src/build'
make: [Makefile:340: all] Error 2

frankvandenhurk commented 2 years ago

> Regarding this comment from you earlier:
>
> > Only a few of the spin-pcap-reader instances died, but I have some scripting that automatically restarts them. (There were more hiccups earlier this evening; I can't explain why.)
>
> I would be interested in the (last part of the) log from those instances because ideally spin-pcap-reader doesn't just die. On the other hand, the way I programmed it, it quickly exits if it encounters any problems. Perhaps I should revise some of that. Anyway, if you have any interesting log snippets, can you create a new issue with the log snippets and a short description of the circumstances?

There is no logging; it just dies. If you can build a debug mode, I would be more than happy to collect some data for you. The only logging I get from spin-pcap-reader is info about truncated packets and sometimes this (but that's not when it dies):

00: 02 04 05 b4 00: 00: 02 04 05 b4 00: 00: 02 04 05 b4 00:

cschutijser commented 2 years ago

> Unfortunately:
>
> make[3]: Entering directory '/home/pi/spin/src/build/tools/spin-pcap-reader'
> CC spin_pcap_reader-arpupdate.o
> CC spin_pcap_reader-ipt.o
> CC spin_pcap_reader-macstr.o
> CC spin_pcap_reader-pcap.o
> ../../../tools/spin-pcap-reader/pcap.c: In function ‘callback’:
> ../../../tools/spin-pcap-reader/pcap.c:492:7: error: ‘ETHERTYPE_QINQ’ undeclared (first use in this function); did you mean ‘ETHERTYPE_IP’?
>   492 | case ETHERTYPE_QINQ:
>       | ^~~~~~
>       | ETHERTYPE_IP
> ../../../tools/spin-pcap-reader/pcap.c:492:7: note: each undeclared identifier is reported only once for each function it appears in
> make[3]: [Makefile:447: spin_pcap_reader-pcap.o] Error 1
> make[3]: Leaving directory '/home/pi/spin/src/build/tools/spin-pcap-reader'
> make[2]: [Makefile:318: all-recursive] Error 1
> make[2]: Leaving directory '/home/pi/spin/src/build/tools'
> make[1]: [Makefile:434: all-recursive] Error 1
> make[1]: Leaving directory '/home/pi/spin/src/build'
> make: [Makefile:340: all] Error 2

Apologies, that was sloppiness on my side. Should be fixed now with commit 174a3feeca63198bbbf67d10f518bd2cb68a0c48. Can you try again?
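For readers who hit the same "ETHERTYPE_QINQ undeclared" error elsewhere: not every system's Ethernet headers define that macro. A common way around this is a guarded fallback definition, sketched below. This is not necessarily what commit 174a3fe does; the helper `is_vlan_ethertype` is made up for this example, and the values are the standard IEEE 802.1Q (0x8100) and 802.1ad (0x88A8) EtherTypes.

```c
#include <stdint.h>

/*
 * Fallback definitions, meant to come after the system's Ethernet
 * headers have been included. If the platform already provides these
 * macros, the #ifndef guards leave its definitions untouched.
 */
#ifndef ETHERTYPE_VLAN
#define ETHERTYPE_VLAN 0x8100 /* IEEE 802.1Q VLAN tag */
#endif

#ifndef ETHERTYPE_QINQ
#define ETHERTYPE_QINQ 0x88A8 /* IEEE 802.1ad service tag (QinQ) */
#endif

/* Hypothetical helper: returns 1 if the EtherType announces a VLAN tag. */
int is_vlan_ethertype(uint16_t ethertype)
{
    switch (ethertype) {
    case ETHERTYPE_VLAN:
    case ETHERTYPE_QINQ:
        return 1;
    default:
        return 0;
    }
}
```

With the guards in place, a `case ETHERTYPE_QINQ:` label compiles whether or not the platform headers know about QinQ.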

> There is no logging; it just dies. If you can build a debug mode, I would be more than happy to collect some data for you.

Interesting. I'll have a look at that later (not sure if that'll be this week).

> The only logging I get from spin-pcap-reader is info about truncated packets and sometimes this (but that's not when it dies):
>
> 00: 02 04 05 b4 00: 00: 02 04 05 b4 00: 00: 02 04 05 b4 00:

If I remember correctly, that's output from some ldns functions that we use.

frankvandenhurk commented 2 years ago

> Apologies, that was sloppiness on my side. Should be fixed now with commit 174a3fe. Can you try again?

No problem. That works! I can now monitor an interface with tagged traffic and see all traffic in the GUI.

frankvandenhurk commented 2 years ago

Little problem: if I include interfaces with regular users, the GUI becomes unresponsive within 30 seconds. Would it be possible to filter (positively or negatively) which VLANs are reported to spind?

frankvandenhurk commented 2 years ago

Scratch that last comment; ignoring the hosts in the GUI would probably work. I'll check tomorrow.

cschutijser commented 2 years ago

Good to hear that monitoring interfaces with tagged traffic now works! If that's OK with you, I'll close this ticket then.

> Scratch that last comment; ignoring the hosts in the GUI would probably work. I'll check tomorrow.

That's indeed one possibility. Another possibility is that you use spin-pcap-reader's -f flag. I didn't document that yet, which I probably should do. But anyway, using the -f flag, you can specify the PCAP filter that's passed to the PCAP library. The syntax for this filter is documented in the pcap-filter manual page (web version). Check for mentions of "VLAN". Not sure if this filter language is flexible enough for your use case though. If filtering for VLANs does not work, you can perhaps filter based on IP addresses. Make sure you surround your PCAP filter string with quotes when you pass it with -f.

frankvandenhurk commented 2 years ago

Tnx, that works like a charm! ( -f '! vlan xxx' )