perfsonar / owamp

A tool for performing one-way or two-way active measurements
Apache License 2.0
70 stars 30 forks source link

twamp toolchain & tests #16

Closed erik-geant closed 6 years ago

erik-geant commented 7 years ago

The patch submitted some time ago by Robert Shearman and Duncan Eastoe was essentially working and stable. This patch includes those changes and unit tests for both owamp and twamp.

vvidic commented 7 years ago

Created a new branch twamp that we can test.

vvidic commented 7 years ago

configure should check for openssl headers or the build fails:

gcc -DHAVE_CONFIG_H -I. -I../owamp -I..   -D_FORTIFY_SOURCE=2 -DOWP_PREFIX=/usr -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -c -o server.o server.c
server.c:24:25: fatal error: openssl/aes.h: No such file or directory
 #include <openssl/aes.h>
                         ^
vvidic commented 7 years ago

Some warnings and errors during build:

gcc -DHAVE_CONFIG_H -I. -I../owamp -I..   -D_FORTIFY_SOURCE=2 -DOWP_PREFIX=/usr -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -c -o server.o server.c
server.c: In function ‘encrypt_outgoing’:
server.c:187:29: warning: dereferencing ‘void *’ pointer
     memcpy(iv_bytes, &cipher[message_size-16], 16);
                             ^
server.c:187:22: warning: taking address of expression of type ‘void’
     memcpy(iv_bytes, &cipher[message_size-16], 16);
                      ^
server.c: In function ‘decrypt_incoming’:
server.c:217:28: warning: dereferencing ‘void *’ pointer
     memcpy(next_iv, &cipher[message_size-16], 16);
                            ^
server.c:217:21: warning: taking address of expression of type ‘void’
     memcpy(next_iv, &cipher[message_size-16], 16);
                     ^
gcc -DHAVE_CONFIG_H -I. -I../owamp -I..   -D_FORTIFY_SOURCE=2 -DOWP_PREFIX=/usr -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -c -o session_setup.o session_setup.c
session_setup.c: In function ‘session_setup_test’:
session_setup.c:155:5: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
     for(int i=0; i<10 && !connected; i++) {
     ^
session_setup.c:155:5: note: use option -std=c99, -std=gnu99, -std=c11 or -std=gnu11 to compile your code
session_setup.c:208:13: error: redefinition of ‘i’
     for(int i=0; i<num_test_slots; i++) {
             ^
session_setup.c:155:13: note: previous definition of ‘i’ was here
     for(int i=0; i<10 && !connected; i++) {
             ^
session_setup.c:208:5: error: ‘for’ loop initial declarations are only allowed in C99 or C11 mode
     for(int i=0; i<num_test_slots; i++) {
     ^
Makefile:693: recipe for target 'session_setup.o' failed
vvidic commented 7 years ago

Some tests fail after the build on jessie:

make  check-TESTS
make[3]: Entering directory '/root/owamp/test'
make[4]: Entering directory '/root/owamp/test'
PASS: owtvec
PASS: owstats
FAIL: owping_enc
FAIL: owping_clear
FAIL: owe2e_clear
FAIL: owe2e_auth
FAIL: owe2e_enc
FAIL: twping_enc
FAIL: twping_clear
PASS: hmac
FAIL: twe2e_clear
FAIL: twe2e_auth
FAIL: twe2e_enc
make[5]: Entering directory '/root/owamp/test'
make[5]: Nothing to be done for 'all'.
make[5]: Leaving directory '/root/owamp/test'
============================================================================
Testsuite summary for owamp 3.5.4
============================================================================
# TOTAL: 13
# PASS:  3
# SKIP:  0
# XFAIL: 0
# FAIL:  10
# XPASS: 0
# ERROR: 0
============================================================================
See test/test-suite.log
Please report to owamp-bugs@internet2.edu
============================================================================

Perhaps caused by running as root?

FAIL: owping_enc
================

waiting for server: No such file or directory

FAIL: owping_clear
==================

waiting for server: No such file or directory

FAIL: owe2e_clear
=================

owampd[22715]: Running owampd as root is folly!
owampd[22715]: Use the -U option! (or allow root with the -f option)
owping: FILE=owping.c, LINE=1726, Unable to open control connection to localhost:47044.
available server port: 47044, 0xb7c4
config directory: 'owtest.vxzKEp'

FAIL: owe2e_auth
================

owampd[22723]: Running owampd as root is folly!
owampd[22723]: Use the -U option! (or allow root with the -f option)
owping: FILE=owping.c, LINE=1726, Unable to open control connection to localhost:55546.
available server port: 55546, 0xd8fa
config directory: 'owtest.IxHu4x'

FAIL: owe2e_enc
===============

owampd[22732]: Running owampd as root is folly!
owampd[22732]: Use the -U option! (or allow root with the -f option)
owping: FILE=owping.c, LINE=1726, Unable to open control connection to localhost:44711.
available server port: 44711, 0xaea7
config directory: 'owtest.jKF5EE'
vvidic commented 7 years ago

Running as normal user it looks a bit better:

make  check-TESTS
make[3]: Entering directory '/home/vvidic/owamp/test'
make[4]: Entering directory '/home/vvidic/owamp/test'
PASS: owtvec
PASS: owstats
FAIL: owping_enc
FAIL: owping_clear
PASS: owe2e_clear
PASS: owe2e_auth
PASS: owe2e_enc
FAIL: twping_enc
FAIL: twping_clear
PASS: hmac
PASS: twe2e_clear
PASS: twe2e_auth
PASS: twe2e_enc
make[5]: Entering directory '/home/vvidic/owamp/test'
make[5]: Nothing to be done for 'all'.
make[5]: Leaving directory '/home/vvidic/owamp/test'
============================================================================
Testsuite summary for owamp 3.5.4
============================================================================
# TOTAL: 13
# PASS:  9
# SKIP:  0
# XFAIL: 0
# FAIL:  4
# XPASS: 0
# ERROR: 0
============================================================================
See test/test-suite.log
Please report to owamp-bugs@internet2.edu
============================================================================

The remaining failures have the same error:

======================================
   owamp 3.5.4: test/test-suite.log
======================================

# TOTAL: 13
# PASS:  9
# SKIP:  0
# XFAIL: 0
# FAIL:  4
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: owping_enc
================

waiting for server: No such file or directory

FAIL: owping_clear
==================

waiting for server: No such file or directory

FAIL: twping_enc
================

waiting for server: No such file or directory

FAIL: twping_clear
==================

waiting for server: No such file or directory
vvidic commented 7 years ago

Tried to test with twamp-gui but it does not work in either direction: 100% packet loss. The TCP control connection setup seems to work, but there is something broken with the UDP test packets.

vvidic commented 7 years ago

I found twampd and twping binaries, but do we also need something like powstream for background measurements?

vvidic commented 7 years ago

Communicating with Cisco CSR1000v virtual router also shows some protocol problems:

$ ./owping/twping 192.168.122.104
Approximately 13.0 seconds until results available
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#0): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#1): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#24): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#38): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#54): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#74): Invalid argument
twping: FILE=endpoint.c, LINE=3412, Unable to send([Router]:0:(#94): Invalid argument

--- twping statistics from [192.168.122.104]:9488 to [jessie-amd64]:0 ---
SID:    c0a87a68dd454001916872b03d398cbb
first:  2017-08-21T13:18:27.387
last:   2017-08-21T13:18:38.192
7 sent, 7 lost (100.000%), 0 send duplicates, 0 reflect duplicates
round-trip time min/median/max = nan/nan/nan ms, (err=3.68 ms)
send time min/median/max = nan/nan/nan ms, (err=3.68 ms)
reflect time min/median/max = nan/nan/nan ms, (err=3.68 ms)
reflector processing time min/max = nan/nan ms
two-way jitter = nan ms (P95-P50)
send jitter = nan ms (P95-P50)
reflect jitter = nan ms (P95-P50)
send TTL not reported
reflect TTL not reported

Testing the same router with twamp-gui works without errors.

deastoe commented 7 years ago

Hello,

I'm the original author of some of these patches. Thanks for picking these up and working at integrating them.

I've pushed some additional patches here: https://github.com/deastoe/owamp/commits/twamp-fixes. These should fix the inter-op issues with the Cisco CSR1000v and with the twamp-gui sender.

A quick test using twping towards the twamp-gui reflector points to an issue in twamp-gui. The reflected test packets had the error estimate fields set to all zeros, which twping detected as invalid and aborted the session.

vvidic commented 7 years ago

After adding twamp-fixes Cisco CSR1000v looks better:

$ ./owping/twping 192.168.122.104
Approximately 13.0 seconds until results available

Directional delays may be inaccurate due to out of sync clocks!

--- twping statistics from [192.168.122.104]:9752 to [jessie-amd64]:9752 ---
SID:    c0a87a68dd60d2dc38d4fdf3ad0f8b0a
first:  2017-09-11T11:16:13.392
last:   2017-09-11T11:16:24.512
100 sent, 0 lost (0.000%), 0 send duplicates, 0 reflect duplicates
round-trip time min/median/max = 1.25/1.9/4.21 ms, (err=4.29e+12 ms)
send time min/median/max = -0.424/0.2/0.671 ms, (err=2.15e+12 ms)
reflect time min/median/max = 1.56/1.7/4.02 ms, (err=2.15e+12 ms)
reflector processing time min/max = 0/0 ms
two-way jitter = 0.5 ms (P95-P50)
send jitter = 0.4 ms (P95-P50)
reflect jitter = 0.2 ms (P95-P50)
send hops = 0 (consistently)
reflect hops = 0 (consistently)
vvidic commented 7 years ago

Some tests are still failing when building as normal user, Erik can you check that?

FAIL: owping_enc
================

waiting for server: No such file or directory

FAIL: owping_clear
==================

waiting for server: No such file or directory

FAIL: twping_enc
================

waiting for server: No such file or directory

FAIL: twping_clear
==================

waiting for server: No such file or directory
erik-geant commented 7 years ago

Do these tests fail every time you run make check? So far I'm not able to reproduce this ... under which environment do you run the tests?

Incidentally, the "waiting for server: No such file or directory" messages are expected and are not an error. The unit test just tries a few times waiting for the simulated server to startup before starting the ping client, and for simplicity prints errstring until it connects. Those messages don't appear in the production code.

vvidic commented 7 years ago

On Mon, Sep 11, 2017 at 06:46:21AM -0700, Erik Reid wrote:

Do these tests fail every time you run make check? So far I'm not able to reproduce this ... under which environment do you run the tests?

Incidentally, the "waiting for server: No such file or directory" messages are expected and are not an error. The unit test just tries a few times waiting for the simulated server to startup before starting the ping client, and for simplicity prints errstring until it connects. Those messages don't appear in the production code.

I run the build under Debian jessie. The message "waiting for server" appears once but than the test exits with 1. Maybe the test could print out some more info so we can figure out what is failing?

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

erik-geant commented 7 years ago

I added a few progress messages in a temporary branch called twamp_test_debugging. Can you try this on your system?

I also noticed one branch where it could have been returning with an error and a proper message wasn't printed, so this might be where it's failing in your environment.

vvidic commented 7 years ago

Here is the output now:

FAIL: owping_enc
================

waiting for server: No such file or directory
connected to server ...
calling XWPControlOpen...
XWPControlOpen returned
initialized tspec struct, calling OWPSessionRequest
cleaning up, result: 0

FAIL: owping_clear
==================

waiting for server: No such file or directory
connected to server ...
calling XWPControlOpen...
XWPControlOpen returned
initialized tspec struct, calling OWPSessionRequest
cleaning up, result: 0

FAIL: twping_enc
================

waiting for server: No such file or directory
connected to server ...
calling XWPControlOpen...
XWPControlOpen returned
initialized tspec struct, calling OWPSessionRequest
cleaning up, result: 0

FAIL: twping_clear
==================

waiting for server: No such file or directory
connected to server ...
calling XWPControlOpen...
XWPControlOpen returned
initialized tspec struct, calling OWPSessionRequest
cleaning up, result: 0
erik-geant commented 7 years ago

maybe a timing problem ...?

I added a few more debug messages on that branch. Can you try again on your system?

vvidic commented 7 years ago

The tests are passing with this patches now, can you make the threads behave so we don't have this timing problems?

erik-geant commented 7 years ago

I took out most of those debugging messages to ensure I found the race condition and the additional printing isn't affecting the timing.

I put the change on the new branch: twamp_test_bugfix

Can you confirm the test pass on your system using this branch?

vvidic commented 7 years ago

On Mon, Sep 11, 2017 at 08:21:57AM -0700, Erik Reid wrote:

I took out most of those debugging messages to ensure I found the race condition and the additional printing isn't affecting the timing.

I put the change on the new branch: twamp_test_bugfix

Can you confirm the test pass on your system using this branch?

Yes, the tests seem to work now. Can you add this into the twamp branch as I have some compile errors that I would like to fix. BTW, what OS and compiler are you using for twamp?

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

erik-geant commented 7 years ago

merged that change onto twamp branch

the build/test environment i'm working on right now is ubuntu xenial64, gcc 5.4.0, automake 1.15

... I also see those warnings and will fix them today

erik-geant commented 7 years ago

I pushed some changes that fix a few of those warnings.

Which gcc -std option were used to generate those errors about loop variables? The gcc options pasted into the sample don't generate those in my current environment (which uses default -std value of gnu11). Requiring a very old standard like c90 then I can see those errors, as expected, but then the rest of the project can't be compiled.

vvidic commented 7 years ago

On Tue, Sep 12, 2017 at 04:30:28AM -0700, Erik Reid wrote:

I pushed some changes that fix a few of those warnings.

Which gcc -std option were used to generate those errors about loop variables? The gcc options pasted into the sample don't generate those in my current environment (which uses default -std value of gnu11). Requiring a very old standard like c90 then I can see those errors, as expected, but then the rest of the project can't be compiled.

Not sure why the errors are appearing on Debian jessie (gcc 4.9.2), but since the build was working without the twamp patch we need to keep it like that. I've pushed 2 commits to twamp branch to fix these problems. What remains are some warnings but they are comming from libI2util.a so we can't fix them here.

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

erik-geant commented 7 years ago

Ok, that makes sense now. The gcc default language standard was changed from gnu90 to gnu11 in gcc 5.1.0.

vvidic commented 7 years ago

On Wed, Sep 06, 2017 at 02:36:07PM +0000, deastoe wrote:

A quick test using twping towards the twamp-gui reflector points to an issue in twamp-gui. The reflected test packets had the error estimate fields set to all zeros, which twping detected as invalid and aborted the session.

I sent some patches to twamp-gui to fill the error estimate fields, so the direction twping -> twamp-responder should work now, please test.

The other way aroud (twamp-client -> twampd) still has problems. The problem seems to be in twampd this time because it does not recognize sender/receiver address fields set to 0 and just drops all incoming tests packets because of this:

The Sender Address and Receiver Address fields contain, respectively, the sender and receiver addresses of the endpoints of the Internet path over which a TWAMP-Test session is requested. They MAY be set to 0, in which case the IP addresses used for the Control-Client to Server TWAMP-Control message exchange MUST be used in the test packets.

So we need to make a patch that would take the addresses from the TCP control connection and use them for the UDP test packets if the 0-address was used during test setup.

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

vvidic commented 7 years ago

On Tue, Sep 12, 2017 at 08:11:52AM -0700, Erik Reid wrote:

Ok, that makes sense now. The gcc default language standard was changed from gnu90 to gnu11 in gcc 5.1.0.

Yes, the build works now on Debian wheezy and jessie, so we can try the packaging now. Can you patch the configure script to check for libssl-dev like we discussed?

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

deastoe commented 7 years ago

I sent some patches to twamp-gui to fill the error estimate fields, so the direction twping -> twamp-responder should work now, please test.

Thanks - tests now appear to complete:


Directional delays may be inaccurate due to out of sync clocks!

--- twping statistics from [10.10.1.2]: to [10.10.1.1]: ---
SID:    00000000000000000000000000000000
first:  2017-09-13T11:46:28.306
last:   2017-09-13T11:46:37.988
100 sent, 0 lost (0.000%), 0 send duplicates, 0 reflect duplicates
round-trip time min/median/max = 0.389/1.2/19.8 ms, (err=0.0322 ms)
send time min/median/max = 1.86e+03/1.86e+03/1.88e+03 ms, (err=0.0161 ms)
reflect time min/median/max = -1.86e+03/-1.86e+03/-1.86e+03 ms, (err=0.0161 ms)
reflector processing time min/max = 0/1 ms
two-way jitter = 4.5 ms (P95-P50)
send jitter = 3.1 ms (P95-P50)
reflect jitter = 3.5 ms (P95-P50)
send hops = 255 (consistently)
reflect hops = 191 (consistently)

Looks like twamp-responder is not generating/returning a session ID, and also not setting the reflected test packet TTL values as per the RFC:

  • Extract the Sender TTL value from the TTL/Hop Limit value of received packets. Session-Reflector implementations SHOULD fetch the TTL/Hop Limit value from the IP header of the packet, replacing the value of 255 set by the Session-Sender. If an implementation does not fetch the actual TTL value (the only good reason not to do so is an inability to access the TTL field of arriving packets), it MUST set the Sender TTL value as 255.

So we need to make a patch that would take the addresses from the TCP control connection and use them for the UDP test packets if the 0-address was used during test setup.

I have put out a fix for this in PR https://github.com/perfsonar/owamp/pull/21.

vvidic commented 7 years ago

On Wed, Sep 13, 2017 at 11:57:38AM +0000, Duncan Eastoe wrote:

Thanks - tests now appear to complete:


Directional delays may be inaccurate due to out of sync clocks!

--- twping statistics from [10.10.1.2]: to [10.10.1.1]: ---
SID:    00000000000000000000000000000000
first:  2017-09-13T11:46:28.306
last:   2017-09-13T11:46:37.988
100 sent, 0 lost (0.000%), 0 send duplicates, 0 reflect duplicates
round-trip time min/median/max = 0.389/1.2/19.8 ms, (err=0.0322 ms)
send time min/median/max = 1.86e+03/1.86e+03/1.88e+03 ms, (err=0.0161 ms)
reflect time min/median/max = -1.86e+03/-1.86e+03/-1.86e+03 ms, (err=0.0161 ms)
reflector processing time min/max = 0/1 ms
two-way jitter = 4.5 ms (P95-P50)
send jitter = 3.1 ms (P95-P50)
reflect jitter = 3.5 ms (P95-P50)
send hops = 255 (consistently)
reflect hops = 191 (consistently)

Looks like twamp-responder is not generating/returning a session ID, and also not setting the reflected test packet TTL values as per the RFC:

Sent them a patch to return TTL of 255 since the real TTL value is not available from the QUdpSocket class. Is there something wrong with the timestamps too since this message appears:

Directional delays may be inaccurate due to out of sync clocks!

So we need to make a patch that would take the addresses from the TCP control connection and use them for the UDP test packets if the 0-address was used during test setup.

I have put out a fix for this in PR https://github.com/perfsonar/owamp/pull/21.

Just merged this, thanks again :)

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

vvidic commented 7 years ago

Do we want to use a new port range for TWAMP UDP tests or reuse the range from OWAMP?

deastoe commented 7 years ago

Sent them a patch to return TTL of 255 since the real TTL value is not available from the QUdpSocket class.

Thank you.

Is there something wrong with the timestamps too since this message appears:

This message appears when any calculated directional delay is negative. This can occur for a number of reasons such as unsynced clocks, or when the latency between the sender and reflector is so small that the difference between the clocks of the sender and reflector becomes significant.

Therefore you can see this message when using twampd as the reflector and is not necessarily indicative of a problem with twamp-gui.

vvidic commented 6 years ago

On Mon, Sep 11, 2017 at 02:14:14PM +0000, Erik Reid wrote:

I added a few progress messages in a temporary branch called twamp_test_debugging. Can you try this on your system?

I also noticed one branch where it could have been returning with an error and a proper message wasn't printed, so this might be where it's failing in your environment.

One of the tests is failing on i386:

$ test/owstats owstats: FILE=stats.c, LINE=1634, OWPStatsParse: Invalid sample range [4294967295,109] OWPStatsParse failed

Can you check if this can be fixed?

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

vvidic commented 6 years ago

If the test mesh is in a good state we could merge this to see if any problems with owamp appear?