perfsonar / mesh-config

Centralized configuration framework for measurement points and GUIs
Apache License 2.0

meshconfig => pscheduler tasks submission problem #64

Closed igarny closed 7 years ago

igarny commented 7 years ago

Hi Andy,

With this "-tasks.conf" file (https://test-rhps02.geant.net/meshconfig-agent-tasks.conf), meshconfig gives these errors:

[dfn.garnizov@test-rhps02 ~]$ tail -n 10  /var/log/perfsonar/meshconfig-agent.log
2017/05/05 07:45:56 (64069) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for creation, skipping test: 400 BAD REQUEST: Can't find pScheduler or BWCTL on psps-test2-mgmt.rrze.uni-erlangen.de

2017/05/05 08:06:00 (356) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for creation, skipping test: 400 BAD REQUEST: Can't find pScheduler or BWCTL on psps-test2-mgmt.rrze.uni-erlangen.de

2017/05/05 08:06:00 (356) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for creation, skipping test: 400 BAD REQUEST: Can't find pScheduler or BWCTL on psps-test2-mgmt.rrze.uni-erlangen.de

2017/05/05 08:20:11 (2090) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for creation, skipping test: 400 BAD REQUEST: Can't find pScheduler or BWCTL on psps-test2-mgmt.rrze.uni-erlangen.de

2017/05/05 08:20:11 (2090) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for creation, skipping test: 400 BAD REQUEST: Can't find pScheduler or BWCTL on psps-test2-mgmt.rrze.uni-erlangen.de

However, through the CLI I am able to get this result:

[dfn.garnizov@test-rhps02 ~]$ pscheduler task --lead-bind test-rhps02.geant.net throughput --source-node test-rhps02.geant.net --source test02-bw-kau-lt.geant.net --dest-node ps-owdtst.rrze.uni-erlangen.de --dest psps-test2-mgmt.rrze.uni-erlangen.de
Submitting task...
Task URL:
https://test-rhps02.geant.net/pscheduler/tasks/6fff2b36-f3a4-4c62-bed8-3ea494302ce0
Running with tool 'iperf3'
Fetching first run...
No runs scheduled for this task.
[dfn.garnizov@test-rhps02 ~]$
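
To see which hosts the agent is failing on, and how often, the warnings in the log excerpt above can be tallied per host. This is a minimal sketch: the regular expression is inferred only from the lines quoted in this issue, and `count_unreachable_hosts` is a hypothetical helper name, not part of meshconfig or pScheduler.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this issue (an assumption);
# other versions of the agent may format their warnings differently.
WARN_RE = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \(\d+\) WARN> .*"
    r"Can't find pScheduler or BWCTL on (?P<host>\S+)\s*$"
)


def count_unreachable_hosts(lines):
    """Count "Can't find pScheduler or BWCTL" warnings per reported host."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.match(line)
        if match:
            counts[match.group("host")] += 1
    return counts
```

Passing an open handle on /var/log/perfsonar/meshconfig-agent.log to `count_unreachable_hosts` returns a `Counter` keyed by host name, which makes it easier to tell whether a single remote host or several are affected.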
igarny commented 7 years ago

Here is the reverse test:

[dfn.garnizov@test-rhps02 ~]$ pscheduler task --lead-bind ps-owdtst.rrze.uni-erlangen.de --assist ps-owdtst.rrze.uni-erlangen.de --bind test-rhps02.geant.net throughput --dest-node test-rhps02.geant.net --dest 62.40.123.106 --source-node ps-owdtst.rrze.uni-erlangen.de --source psps-test2-mgmt.rrze.uni-erlangen.de --ip-version 4

Submitting with assistance from ps-owdtst.rrze.uni-erlangen.de...
Task URL:
https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/4dda14e3-0d99-46b3-9cea-08fbdda4c7be
Running with tool 'iperf3'
Fetching first run...

Next scheduled run:
https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/4dda14e3-0d99-46b3-9cea-08fbdda4c7be/runs/74418a4a-7727-404c-a25f-5f34205770a4
Starts 2017-05-05T11:34:24+02:00 (~65 seconds)
Ends   2017-05-05T11:34:43+02:00 (~18 seconds)
Waiting for result...

Run failed.  The following errors were reported:
By ps-owdtst.rrze.uni-erlangen.de:
    iperf3 returned an error: error - unable to connect to server: Connection refused

By test-rhps02.geant.net:
    iperf3 returned an error:

No runs scheduled for this task.
arlake228 commented 7 years ago

I think those error messages from the meshconfig are from traceroute tests, not throughput tests, based on some other info shared last week. It looks like the traceroute tests were setting lead-bind for tasks where the lead-bind address was not on the lead. This would break the backward compatibility check and produce the error shown. I think the traceroute task-creation code needs to be reviewed to make sure it sets the lead properly.
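
One quick way to test whether a candidate lead-bind address is actually configured on a given host is to try binding a socket to it on that host. The sketch below does only that. It is not how pScheduler or the meshconfig agent performs its own check, and `address_is_local` is a hypothetical helper name used for illustration.

```python
import socket


def address_is_local(address, family=socket.AF_INET):
    """Return True if a UDP socket can be bound to `address` on this host.

    Binding fails with OSError when the address is not assigned to any local
    interface, or when a host name cannot be resolved.
    """
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, 0))  # port 0 lets the OS pick any free port
        except OSError:
            return False
    return True
```

Run on the intended lead, a result of `False` for the address passed as lead-bind would be consistent with the misconfiguration described above. Note that a host configured to allow non-local binds can return `True` even for addresses it does not own.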

igarny commented 7 years ago

Here are the results of my check. Indeed, I have problems with the trace tasks initiated from the remote host towards the lead participant: https://test-rhps02.geant.net/pscheduler/tasks/fc1d4dc3-ebf6-4cfb-8afa-26970b0d673a/runs/79b6a66d-c108-4948-ac88-ce0534cfa548

Nevertheless, the issue reported here regarding the meshconfig log is about the daemon's inability to create tests with pscheduler, which leads me to conclude that it is not about the trace tasks.

In addition, I am not able to get IDs for any throughput task:

[dfn.garnizov@test-rhps02 ~]$ pscheduler schedule -PT5H +PT3H | grep throughput
[dfn.garnizov@test-rhps02 ~]$

P.S. The forward direction of the trace works fine.

igarny commented 7 years ago

This issue appears outdated. The current status is tracked in #97.