ValentinBELYN / icmplib

Easily forge ICMP packets and make your own ping and traceroute.
GNU Lesser General Public License v3.0

Feature request: DSCP/TOS Flags #11

Closed onkelbeh closed 3 years ago

onkelbeh commented 3 years ago

It would be a great feature to have the possibility to set the DSCP/TOS bits for ICMP and traceroute. Thanks in advance.

ValentinBELYN commented 3 years ago

Hi @onkelbeh,

According to RFC 792, ICMP must use the value 0 for the Type of Service field: https://tools.ietf.org/pdf/rfc792.pdf (page 2).

I'm not sure it's a good idea to modify this field. In addition, routers manage the priority of ICMP packets themselves, so this field is most likely ignored.

onkelbeh commented 3 years ago

Hi Valentin,

oh, that's not entirely correct. You mention RFC 792, which was written in 1981, a long time before RFC 2474.

Best Overview: https://en.wikipedia.org/wiki/Type_of_service, take a closer look at reference 1. Historical: https://tools.ietf.org/html/rfc3168#section-22

Only bit 0 always has to be zero :-)

The iputils ping and traceroute can set them (ping -Q or traceroute -t).

Having a larger VPN setup with lots of internet connections, I have to use DSCP and TOS for path selection. Our monitoring system, for example, sets af11 for forcing a packet to go over the 'thicker' line, and af21 for the fastest link. So I can check the status of more links from a central point.

For user traffic, we currently use 6 different classes for our internal traffic, e.g. Voice, Video, Transactional, Interactive, Bulk and Scavenger.

That's only 6 at the moment because we now have enough bandwidth (our bosses have been generous during the virus). But this will change again. Our routers, for example, will honor 2 bits of DSCP for adjusting the drop probability of packets on overloaded links.

And, since the virus, we have a lot of EF traffic with reserved bandwidth for VoIP and telephony. The routers themselves use CS6 for management traffic.

This is the best table I found: qostable-1
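
The class names used in this comment (AF11, AF21, EF, CS6) correspond to fixed DSCP code points, and the value passed to ping -Q or traceroute -t is that code point shifted into the upper six bits of the TOS byte. A minimal sketch of the conversion, assuming the standard code points (EF = 46, CSx = 8x, AFxy = 8x + 2y); the helper names are hypothetical and not part of icmplib:

```python
# Hypothetical helpers (not part of icmplib): map a DSCP class name to its
# code point and to the TOS byte value expected by ping -Q / traceroute -t.

def dscp_code_point(name: str) -> int:
    """Return the 6-bit DSCP code point for names like 'EF', 'CS6', 'AF21'."""
    name = name.strip().upper()
    if name == "EF":
        return 46
    if len(name) == 3 and name.startswith("CS") and name[2].isdigit():
        selector = int(name[2])
        if 0 <= selector <= 7:
            return 8 * selector
    if len(name) == 4 and name.startswith("AF") and name[2:].isdigit():
        af_class, drop = int(name[2]), int(name[3])
        if 1 <= af_class <= 4 and 1 <= drop <= 3:
            return 8 * af_class + 2 * drop
    raise ValueError(f"unknown DSCP class name: {name!r}")


def tos_byte(name: str) -> int:
    """Return the TOS byte: DSCP in the upper 6 bits, ECN bits left at 0."""
    return dscp_code_point(name) << 2


for cls in ("AF11", "AF21", "EF", "CS6"):
    print(cls, dscp_code_point(cls), tos_byte(cls))
```

For EF this gives the value 184 used in the setsockopt attempt further down in this comment.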

Many of our management scripts are written in Python. After I discovered your library through the Home Assistant component, I hoped I could use and improve it to fit our needs, since it's lightweight and easy to integrate. I'm not sure how complex it is to call setsockopt correctly in Python.

I played around a bit with

import socket
# Use a separate name so the socket module is not shadowed
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_ICMP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 184)

to mark them with DSCP EF, for example. This does not seem to work on 3.7.9; the packets have not been marked. If I do a ping -Q 184, the packets are marked correctly.

Anyway, there must be a way in Python, because Scapy can do it. I'd like to avoid using Scapy as a library for this, because it is overkill and has a complex API. And, for regular checks, it's way too big to be loaded several times a second. With 70 sites and an average of 4 paths, I have to run ~4 checks/second to get one measurement per path and minute.

Perhaps you can take a look at it. I would be glad to help / test ... Yes, I am aware that not many will need these bits. But I do, so I just asked :-) Thanks for your time.

ValentinBELYN commented 3 years ago

Thank you for taking the time to cite use cases, information about existing implementations, and QoS details 👍 It's really interesting. I will try to implement this feature.

By the way, Scapy doesn't use socket.setsockopt to set the TOS field. It manages the entire IP header itself and adds this field "manually".

As for icmplib, it only manages the ICMP header.
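
To illustrate what setting the field "manually" involves, here is a minimal sketch that packs a 20-byte IPv4 header with the TOS byte in place and computes the header checksum. It is not Scapy's or icmplib's code; the function names and default values (TTL 64, identification 0, no options) are assumptions made for the example:

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the 16-bit words of an even-length header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      tos: int = 0, ttl: int = 64, ident: int = 0) -> bytes:
    """Pack an option-less IPv4 header that carries an ICMP payload."""
    version_ihl = (4 << 4) | 5          # IPv4, header length of 5 x 32-bit words
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        tos,                            # second byte of the header: TOS / DS field
        20 + payload_len,               # total length
        ident,
        0,                              # flags and fragment offset
        ttl,
        socket.IPPROTO_ICMP,
        0,                              # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    checksum = ipv4_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


header = build_ipv4_header("192.0.2.1", "192.0.2.2", payload_len=8, tos=184)
print(header.hex())
```

Sending such a header requires a raw socket that accepts a caller-supplied IP header, which is the extra work icmplib avoids by handling only the ICMP part.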

ValentinBELYN commented 3 years ago

I have successfully set the TOS field on Linux and macOS. However, I see that it doesn't work on Windows.

From what I've read, Windows now ignores this field:

On Windows 2000, Windows XP, and Windows Server 2003, the ToS bits marking functionality in Winsock applications and the ping utility is disabled by default. The attempt to set the IP_TOS option with the setsockopt function on these versions of Microsoft Windows still returns 0 (SUCCESS) to allow applications to continue to run; but the ToS bits in the IP header is not marked.

It is now necessary to use the Quality of Service API, which is not possible in Python. The alternative solution would be to manipulate the full IP header, which is overkill for current needs.

The last option, and the one I may prefer, is to provide this feature only on Unix systems. What do you think?
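
A Unix-only approach could look roughly like the sketch below: apply IP_TOS through setsockopt everywhere except Windows, where the call is skipped. This is not icmplib's actual implementation; apply_tos is a hypothetical name, and the demo uses a UDP socket only because it needs no special privileges:

```python
import socket
import sys


def apply_tos(sock: socket.socket, value: int) -> bool:
    """Set the TOS byte on an IPv4 socket; return False where it is skipped.

    Hypothetical helper: on Windows the option is reported as ignored,
    so the call is not attempted there.
    """
    if not 0 <= value <= 255:
        raise ValueError("TOS value must fit in one byte")
    if sys.platform.startswith("win"):
        return False
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, value)
    return True


# Demo on an unprivileged UDP socket; an ICMP socket would be handled the same way.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    if apply_tos(sock, 184):
        print("TOS now:", sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS))
    else:
        print("TOS not applied on this platform")
finally:
    sock.close()
```

Returning a boolean instead of failing silently lets the caller warn the user when the value has no effect.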

onkelbeh commented 3 years ago

For me it would be OK if it's not working on Windows. 🥇 I'm the Python on Linux guy :-) And the box I want to run the checks on is also a Linux box.

And yes, Windows is not designed to forward packets. I know of certain applications which (can) successfully mark outbound packets on Windows (some GPS tools, Citrix), but removing the ability to mark packets with ping and tracert on Windows (and just ignoring the flag without a message) can be a pitfall.

Thanks for thinking about it.

ValentinBELYN commented 3 years ago

OK! I love Linux too 😄 I will try to implement this feature for the next version of icmplib (1.2). It should be released at the end of the week.

Do you have any idea of the name we could give to the property of the ICMPRequest class (also used for ping and traceroute functions)? I was thinking of priority to encompass both the terms TOS / DS Field and Traffic Class.

Examples:

# Ping
ping('1.1.1.1', priority=184)

# Traceroute
traceroute('1.1.1.1', priority=184)

# ICMPRequest (low level)
ICMPRequest('1.1.1.1', id=1000, sequence=1, priority=184)

onkelbeh commented 3 years ago

Hi,

yeah, just a name. In ping it's called -Q, traceroute uses -t, fping uses --tos. 'priority' sounds good and will imho not interfere with anything else.

https://xkcd.com/927/

ValentinBELYN commented 3 years ago

So true 😅 ... but after thinking long and hard, I decided that the name of the property will be traffic_class! Given that the format of the DS Field was taken from the Traffic Class field in IPv6, this seems more logical to me.

icmplib 1.2, including this feature, will be released tomorrow with more features and fixes. I am currently doing the last tests.

ValentinBELYN commented 3 years ago

icmplib 1.2 is available! You can now define the traffic_class parameter of the ping, multiping and traceroute functions (traffic_class is also available on low level classes).

pip3 install --upgrade icmplib

I've tested on several platforms and it works, except on Windows of course (the value is ignored).

onkelbeh commented 3 years ago

Yeah, that's cool. Thank you very much. Added it to the repo: https://github.com/onkelbeh/HomeAssistantRepository/commit/6559e8eba3337bee31c95e0f1e7019e246887293. It seems I had a typo in the LICENSE string, so I fixed that as well. Will check it out tomorrow.

ValentinBELYN commented 3 years ago

Did you have time to test?

onkelbeh commented 3 years ago

Hi,

It's a very busy time: a lot of people are still on vacation, and some fool killed one of the backup servers last week, so it hasn't made it into my test tool yet. But this is exactly what I wanted. Thanks a lot. Works like a charm.

Also, traffic_class was a very good choice.

Currently thinking about a way to use your multiping feature, this could avoid a lot of load/import time. There are about 80 remote sites to check, perhaps I can put these all in one big job.

In the example below you can see the different round-trip times for AF21 traffic. This path is nearly empty now (Saturday), but the bulk traffic passes a German DSL line between hops 3 and 4, which has a slightly higher latency:

root@lnx-monitoring:/usr/lib/python3.7/site-packages/icmplib # traceroute -I 192.168.96.65 -t 70
traceroute to 192.168.96.65 (192.168.96.65), 30 hops max, 60 byte packets
 1  vt3.dgn.in.streicher.de (192.168.57.97)  1.952 ms  2.018 ms  2.270 ms
 2  isr-dgn1.router.in.streicher.de (192.168.95.36)  1.001 ms  1.239 ms  1.348 ms
 3  isr-deg01.router.in.streicher.de (192.168.249.1)  25.213 ms  25.460 ms  25.666 ms
 4  isr-jena01.router.in.streicher.de (192.168.249.97)  68.089 ms  68.404 ms  68.605 ms
 5  vt1-1.jena.in.streicher.de (192.168.96.65)  101.025 ms  101.298 ms  101.527 ms
root@lnx-monitoring:/usr/lib/python3.7/site-packages/icmplib # traceroute -I 192.168.96.65 -t 72
traceroute to 192.168.96.65 (192.168.96.65), 30 hops max, 60 byte packets
 1  vt3.dgn.in.streicher.de (192.168.57.97)  1.777 ms  1.902 ms  2.154 ms
 2  isr-dgn1.router.in.streicher.de (192.168.95.36)  0.647 ms  0.944 ms  1.053 ms
 3  isr-dgn02.router.in.streicher.de (192.168.248.65)  1.379 ms  1.562 ms  1.687 ms
 4  isr-jena02.router.in.streicher.de (192.168.250.98)  17.480 ms  18.101 ms  18.361 ms
 5  vt1-1.jena.in.streicher.de (192.168.96.65)  35.467 ms  35.666 ms  35.803 ms
root@lnx-monitoring:/usr/lib/python3.7/site-packages/icmplib # ping 192.168.96.65 -Q 70 -c3
PING 192.168.96.65 (192.168.96.65) 56(84) bytes of data.
64 bytes from 192.168.96.65: icmp_seq=1 ttl=251 time=79.9 ms
64 bytes from 192.168.96.65: icmp_seq=2 ttl=251 time=73.0 ms
64 bytes from 192.168.96.65: icmp_seq=3 ttl=251 time=84.9 ms

--- 192.168.96.65 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 73.046/79.276/84.874/4.849 ms
root@lnx-monitoring:/usr/lib/python3.7/site-packages/icmplib # ping 192.168.96.65 -Q 72 -c3
PING 192.168.96.65 (192.168.96.65) 56(84) bytes of data.
64 bytes from 192.168.96.65: icmp_seq=1 ttl=251 time=33.8 ms
64 bytes from 192.168.96.65: icmp_seq=2 ttl=251 time=48.3 ms
64 bytes from 192.168.96.65: icmp_seq=3 ttl=251 time=29.3 ms

--- 192.168.96.65 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2018ms
rtt min/avg/max/mdev = 29.349/37.176/48.346/8.107 ms
root@lnx-monitoring:/usr/lib/python3.7/site-packages/icmplib # python
Python 3.7.8 (default, Sep  2 2020, 17:07:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from icmplib import ping, traceroute, Host, Hop
>>> host = ping('192.168.96.65', count=3, interval=0.1, traffic_class=70)
>>> print (host.avg_rtt)
75.801
>>> host = ping('192.168.96.65', count=3, interval=0.1, traffic_class=72)
>>> print (host.avg_rtt)
29.756
>>> hops = traceroute('192.168.96.65', count=3, interval=0.1, traffic_class=70)
>>> hops2 = traceroute('192.168.96.65', count=3, interval=0.1, traffic_class=72)
>>> print (f"{hops}\n{hops2}")
[<Hop 1 [192.168.57.97]>, <Hop 2 [192.168.95.36]>, <Hop 3 [192.168.249.1]>, <Hop 4 [192.168.249.97]>, <Hop 5 [192.168.96.65]>]
[<Hop 1 [192.168.57.97]>, <Hop 2 [192.168.95.36]>, <Hop 3 [192.168.248.65]>, <Hop 4 [192.168.250.98]>, <Hop 5 [192.168.96.65]>]
>>>

ValentinBELYN commented 3 years ago

I'm glad to hear that everything is working fine. 👍 However, what do you mean by "load/import time"? Apart from that, using multiping is a good idea. Moreover, this function will be much faster in v2.

onkelbeh commented 3 years ago

Ah, sorry, I did not explain: my goal is to set up a steady measurement to >70 sites from 2 HQ sites. With 2 paths to check per minute, this results in 70 x 2 x 2 / 60, i.e. ~5 checks per second. I just want to avoid loading Python and the required libraries for every single check.
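
The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; checks_per_second is a hypothetical name, and the figures (70 sites, 2 HQ sites, 2 paths, one measurement per path per minute) are the ones quoted in this comment:

```python
def checks_per_second(sites: int, hq_sites: int, paths: int,
                      period_s: float = 60.0) -> float:
    """Average number of checks per second for one measurement per path per period."""
    return sites * hq_sites * paths / period_s


rate = checks_per_second(sites=70, hq_sites=2, paths=2)
print(round(rate, 2))  # 4.67, i.e. roughly 5 checks per second
```

At that rate, starting a new interpreter for each check would pay the start-up and import cost several times a second, which is why a single long-running job is attractive.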

ValentinBELYN commented 3 years ago

OK, I understand better now 😃 If you have other requests later, do not hesitate to open new issues.