perfsonar / pscheduler

The perfSONAR Scheduler
Apache License 2.0

Using RTT deadline reports 100% packet loss #1423

Closed szymontrocha closed 6 months ago

szymontrocha commented 6 months ago

Reported on the user list.

I also observed it (tested on Ubuntu 20, perfSONAR 5.0.8):

$ pscheduler task rtt -d 150.254.173.3 --deadline PT20S
Submitting task...
Task URL:
https://psmall-poz1.man.poznan.pl/pscheduler/tasks/208e7d71-139c-4a89-9fb1-3bf74c7b04d6
Running with tool 'ping'
Fetching first run...

Next scheduled run:
https://psmall-poz1.man.poznan.pl/pscheduler/tasks/208e7d71-139c-4a89-9fb1-3bf74c7b04d6/runs/49a9d248-7f13-413e-98b7-37c92fff77c4
Starts 2024-03-21T08:55:05+00:00 (~0 seconds)
Ends   2024-03-21T08:55:16+00:00 (~10 seconds)
Waiting for result...

100% Packet Loss  RTT Min/Mean/Max/StdDev = Unknown/Unknown/Unknown/Unknown ms

No further runs scheduled.
$ pscheduler task rtt -d 150.254.173.3
Submitting task...
Task URL:
https://psmall-poz1.man.poznan.pl/pscheduler/tasks/47c935ac-1146-4364-942d-1fccef6ef12a
Running with tool 'ping'
Fetching first run...

Next scheduled run:
https://psmall-poz1.man.poznan.pl/pscheduler/tasks/47c935ac-1146-4364-942d-1fccef6ef12a/runs/537009a8-380a-4c7d-afc0-c79d582cd148
Starts 2024-03-21T08:56:00+00:00 (~1 seconds)
Ends   2024-03-21T08:56:11+00:00 (~10 seconds)
Waiting for result...

1       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.4900 ms
2       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.5200 ms
3       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.1800 ms
4       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.5000 ms
5       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.3000 ms

0% Packet Loss  RTT Min/Mean/Max/StdDev = 2.181000/2.397000/2.524000/0.133000 ms

No further runs scheduled.
$ pscheduler result --diags https://psmall-poz1.man.poznan.pl/pscheduler/tasks/208e7d71-139c-4a89-9fb1-3bf74c7b04d6
2024-03-21T08:55:05+00:00 on psmall-poz1.man.poznan.pl with ping:

https://psmall-poz1.man.poznan.pl/pscheduler/tasks/208e7d71-139c-4a89-9fb1-3bf74c7b04d6/runs/49a9d248-7f13-413e-98b7-37c92fff77c4

rtt --dest 150.254.173.3 --deadline PT20S

100% Packet Loss  RTT Min/Mean/Max/StdDev = Unknown/Unknown/Unknown/Unknown ms

Limit system diagnostics for this run:
  Hints:
    requester: 2001:808:2:3004:2::2
    server: 2001:808:2:3004:2::2
  Identified as everybody, local-interfaces
  Classified as default, friendlies
  Application: Hosts we trust to do everything
    Group 1: Limit 'always' passed
    Group 1: Want all, 1/1 passed, 0/1 failed: PASS
    Application PASSES
  Passed one application.  Stopping.
  Proposal meets limits
  Priority set at 0:
    Initial priority  (Set to 0)

Diagnostics from psmall-poz1.man.poznan.pl:
  ping -n -c 5 -i 1.0 -w 20.0 -W 1.0 150.254.173.3

$ pscheduler result --diags https://psmall-poz1.man.poznan.pl/pscheduler/tasks/47c935ac-1146-4364-942d-1fccef6ef12a
2024-03-21T08:56:00+00:00 on psmall-poz1.man.poznan.pl with ping:

https://psmall-poz1.man.poznan.pl/pscheduler/tasks/47c935ac-1146-4364-942d-1fccef6ef12a/runs/537009a8-380a-4c7d-afc0-c79d582cd148

rtt --dest 150.254.173.3

1       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.4900 ms
2       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.5200 ms
3       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.1800 ms
4       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.5000 ms
5       rose.man.poznan.pl (150.254.173.3)  64 Bytes  TTL 60  RTT   2.3000 ms

0% Packet Loss  RTT Min/Mean/Max/StdDev = 2.181000/2.397000/2.524000/0.133000 ms

Limit system diagnostics for this run:
  Hints:
    requester: 2001:808:2:3004:2::2
    server: 2001:808:2:3004:2::2
  Identified as everybody, local-interfaces
  Classified as default, friendlies
  Application: Hosts we trust to do everything
    Group 1: Limit 'always' passed
    Group 1: Want all, 1/1 passed, 0/1 failed: PASS
    Application PASSES
  Passed one application.  Stopping.
  Proposal meets limits
  Priority set at 0:
    Initial priority  (Set to 0)

Diagnostics from psmall-poz1.man.poznan.pl:
  ping -n -c 5 -i 1.0 -W 1.0 150.254.173.3

  PING 150.254.173.3 (150.254.173.3) 56(84) bytes of data.
  64 bytes from 150.254.173.3: icmp_seq=1 ttl=60 time=2.49 ms
  64 bytes from 150.254.173.3: icmp_seq=2 ttl=60 time=2.52 ms
  64 bytes from 150.254.173.3: icmp_seq=3 ttl=60 time=2.18 ms
  64 bytes from 150.254.173.3: icmp_seq=4 ttl=60 time=2.50 ms
  64 bytes from 150.254.173.3: icmp_seq=5 ttl=60 time=2.30 ms

  --- 150.254.173.3 ping statistics ---
  5 packets transmitted, 5 received, 0% packet loss, time 4007ms
  rtt min/avg/max/mdev = 2.181/2.397/2.524/0.133 ms

$

mfeit-internet2 commented 6 months ago

Later versions of ping don't accept floating-point numbers as deadlines, so the -w 20.0 in the failing run's diagnostics is rejected before any packets are sent.

Need to fix that and make the ping plugin's run method able to distinguish between an exit code of 1 that means the remote host didn't answer and one that comes from a real error like this one.
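
A minimal sketch of both changes, not the actual plugin code. The function names `deadline_arg` and `classify_ping` are hypothetical, and the sketch assumes that a ping run which actually sent packets prints its "packets transmitted" summary on stdout, while an argument error does not:

```python
import math
import re

# Matches the summary line ping prints after sending packets, e.g.
# "5 packets transmitted, 0 received, 100% packet loss, time 4007ms"
STATS_RE = re.compile(r"\d+ packets transmitted, \d+ (?:packets )?received")


def deadline_arg(seconds):
    """Round a deadline in seconds up to a whole number for ping's -w option."""
    return str(max(1, math.ceil(seconds)))


def classify_ping(returncode, stdout):
    """Return 'ok', 'loss' or 'error' for a finished ping process."""
    if returncode == 0:
        return "ok"
    # Assumption: exit code 1 together with a statistics summary means
    # ping ran but replies were missing; anything else is a real error.
    if returncode == 1 and STATS_RE.search(stdout):
        return "loss"
    return "error"
```

With this, a PT20S deadline would be passed as `-w 20` instead of `-w 20.0`, and a run that exits 1 without printing statistics would be reported as a failure rather than as 100% packet loss.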