perfsonar / mesh-config

Centralized configuration framework for measurement points and GUIs
Apache License 2.0

Faulty task definition with bind options from meshconfig #63

Closed igarny closed 7 years ago

igarny commented 7 years ago

Hi Andy, with my meshconfig-agent-tasks.conf file at: https://test-rhps02.geant.net/meshconfig-agent-tasks.conf

the faulty task I get is the following (note the wrong "lead-bind": "test-rhps02.geant.net"):

curl -k https://test-rhps02.geant.net/pscheduler/tasks/558df6ef-6445-4ba9-abbf-92181c4c3095
{"reference": {"created-by": {"address": "test02-bw-kau-lt.geant.net", "uuid": "3409879e-b52b-4cf7-a0c4-8d50b75cfd60", "user-agent": "perfsonar-meshconfig"}, "description": "GN BWCTL Traceroute6 internal testing"}, "lead-bind": "test-rhps02.geant.net", "tool": "traceroute", "schedule": {"repeat": "PT1200S", "until": "2017-05-05T09:36:03Z", "slip": "PT1200S", "sliprand": true}, "archives": [{"data": {"url": "http://test-psma-gn-01-buc-ro.geant.net/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 1, "wait": "PT60S"}, {"attempts": 1, "wait": "PT300S"}], "_auth-token": null, "measurement-agent": "test02-bw-kau-lt.geant.net"}, "archiver": "esmond", "ttl": "PT3600S"}, {"data": {"url": "http://uat-psma-gn-01-buc-ro.geant.net/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 1, "wait": "PT60S"}, {"attempts": 1, "wait": "PT300S"}], "_auth-token": null, "measurement-agent": "test02-bw-kau-lt.geant.net"}, "archiver": "esmond", "ttl": "PT3600S"}], "test": {"type": "trace", "spec": {"dest": "test02-bw-kau-lt.geant.net", "hops": 64, "length": 40, "schema": 1, "source": "psps-test2-mgmt.rrze.uni-erlangen.de"}}, "schema": 1}

which fails with:

pscheduler result https://test-rhps02.geant.net/pscheduler/tasks/558df6ef-6445-4ba9-abbf-92181c4c3095/runs/cea61197-2ef7-42a5-83aa-f9b6750b9f7b
2017-05-04T11:50:16Z on test02-bw-kau-lt.geant.net with traceroute:

trace --dest test02-bw-kau-lt.geant.net --length 40 --source psps-test2-mgmt.rrze.uni-erlangen.de --hops 64

Test failed.

Diagnostics from test02-bw-kau-lt.geant.net:
    traceroute -q 1 -4 -s psps-test2-mgmt.rrze.uni-erlangen.de -m 64 -N 64 -n test02-bw-kau-lt.geant.net 40

Error from test02-bw-kau-lt.geant.net:

    bind: Cannot assign requested address
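
For readers unfamiliar with this error: "Cannot assign requested address" is the Linux message for EADDRNOTAVAIL, which bind() returns when asked to use an address that is not configured on the local host. In the diagnostics above, traceroute ran on test02-bw-kau-lt.geant.net with -s psps-test2-mgmt.rrze.uni-erlangen.de, which presumably is not a local address there. The sketch below reproduces the same class of failure with a plain UDP socket; the helper name `bind_error` and the probe addresses are illustrative assumptions, not part of pscheduler.

```python
import errno
import socket


def bind_error(address, family=socket.AF_INET):
    """Try to bind a UDP socket to `address` (hypothetical helper).

    Returns None if the bind succeeds, otherwise the symbolic errno
    name of the failure, e.g. 'EADDRNOTAVAIL'.
    """
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((address, 0))  # port 0: let the OS pick a free port
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


if __name__ == "__main__":
    # 127.0.0.1 is local on practically every host; 192.0.2.1 is a
    # documentation-only address (RFC 5737) assumed not to be configured.
    for addr in ("127.0.0.1", "192.0.2.1"):
        print(addr, "->", bind_error(addr) or "bind OK")
```

Running the same check on the measurement host with the address passed as the source or bind option is a quick way to tell whether that address can be used there at all.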

I was able to manually construct the pscheduler command with:

pscheduler task --bind test-rhps02.geant.net --assist ps-owdtst.rrze.uni-erlangen.de --debug trace --dest test02-bw-kau-lt.geant.net --ip-version 4 --source-node ps-owdtst.rrze.uni-erlangen.de --source psps-test2-mgmt.rrze.uni-erlangen.de

2017-05-04T13:28:43 Debug signal ignored; already not debugging
2017-05-04T13:28:43 Debug discontinued
2017-05-04T13:28:43 Assistance is from ps-owdtst.rrze.uni-erlangen.de
2017-05-04T13:28:43 Using slip of PT5M
2017-05-04T13:28:43 Converting to spec via https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tests/trace/spec
Submitting with assistance from ps-owdtst.rrze.uni-erlangen.de...
2017-05-04T13:28:51 Fetching participant list
2017-05-04T13:28:51 Spec is: {"dest": "test02-bw-kau-lt.geant.net", "source": "psps-test2-mgmt.rrze.uni-erlangen.de", "ip-version": 4, "source-node": "ps-owdtst.rrze.uni-erlangen.de", "schema": 1}
2017-05-04T13:28:58 Got participants: {u'participants': [u'ps-owdtst.rrze.uni-erlangen.de']}
2017-05-04T13:28:58 Lead is ps-owdtst.rrze.uni-erlangen.de
2017-05-04T13:28:58 Pinging https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/
2017-05-04T13:29:03 ps-owdtst.rrze.uni-erlangen.de is up
2017-05-04T13:29:03 Posting task to https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks
2017-05-04T13:29:03 Data is {"test": {"type": "trace", "spec": {"dest": "test02-bw-kau-lt.geant.net", "source": "psps-test2-mgmt.rrze.uni-erlangen.de", "ip-version": 4, "source-node": "ps-owdtst.rrze.uni-erlangen.de", "schema": 1}}, "schedule": {"slip": "PT5M"}, "schema": 1}
Task URL:
https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/abe2b8f7-a289-49bb-9304-36ee2f09cabe
2017-05-04T13:29:13 Posted https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/abe2b8f7-a289-49bb-9304-36ee2f09cabe
Running with tool 'traceroute'
Fetching first run...
2017-05-04T13:29:20 Fetching https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/abe2b8f7-a289-49bb-9304-36ee2f09cabe/runs/first
2017-05-04T13:29:28 Handing off: pscheduler watch --format text/plain --debug --bind test-rhps02.geant.net https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/abe2b8f7-a289-49bb-9304-36ee2f09cabe
2017-05-04T13:29:28 Debug signal ignored; already not debugging
2017-05-04T13:29:28 Debug discontinued
2017-05-04T13:29:28 Fetching https://ps-owdtst.rrze.uni-erlangen.de/pscheduler/tasks/abe2b8f7-a289-49bb-9304-36ee2f09cabe
No runs scheduled for this task.

Please do not be misled by "No runs scheduled for this task.". The results do appear after a while.

igarny commented 7 years ago

As Andy identified, the main source of the problem is that the test participant is mistakenly assigned the wrong bind option/address in the meshconfig-agent-tasks.conf file:

bind_address   test-rhps02.geant.net
local_lead_bind_address   test-rhps02.geant.net
<test>
    added_by_mesh   1
    description   GN Throughput internal testing
    <schedule>
        random_start_percentage   10
        type   regular_intervals
        interval   10800
    </schedule>
    <parameters>
        tool   iperf3
        omit_interval   5
        type   bwctl
        force_ipv4   1
    </parameters>
    bind_address   test-rhps02.geant.net
    <target>
        bind_address   test-rhps02.geant.net
        address   psps-test2-mgmt.rrze.uni-erlangen.de
    </target>
    local_address   test02-bw-kau-lt.geant.net
    <created_by>
        name   GEANT Test Mesh
        agent_type   remote-mesh
        uri   https://ps-owdtst.rrze.uni-erlangen.de/testmesh.json
    </created_by>
</test>

The current mesh is: https://psps-test2-mgmt.rrze.uni-erlangen.de/testmesh.json

igarny commented 7 years ago

I believe I have found a combination with which meshconfig-agent does not complain