perfsonar / mesh-config

Centralized configuration framework for measurement points and GUIs
Apache License 2.0

meshconfig -tasks.conf file esmond/latency block parsing failure #106

Closed igarny closed 6 years ago

igarny commented 6 years ago

Hi Andy

We (with Antoine) are preparing a training for an NREN, and I have some toolkit servers that Antoine installed. I believe he did not add any customizations there.

The issue I have found is that a generic configuration for the measurement archive leads to errors in meshconfig:

<default_parameters>
    receive_port_range   8760-9960
    type   powstream
</default_parameters>
<measurement_archive>
    database   https://localhost/esmond/perfsonar/archive/
    type   esmond/latency
    password   -------------------------------------------------------
</measurement_archive>
<measurement_archive>
    database   https://localhost/esmond/perfsonar/archive/
    type   esmond/throughput
    password   -------------------------------------------------------
</measurement_archive>
<measurement_archive>
    type   esmond/traceroute
    database   https://localhost/esmond/perfsonar/archive/
    password   -------------------------------------------------------
</measurement_archive>
2017/11/17 11:14:24 (2040) ERROR> perfsonar_meshconfig_agent:420 main:: - Error building pScheduler task: Odd number of parameters in call to perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__ when named parameters were expected
 at /usr/share/perl5/perfSONAR_PS/RegularTesting/MeasurementArchives/EsmondBase.pm line 387.
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x7d5a738), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::to_pscheduler(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x7d5a738), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/share/perl5/perfSONAR_PS/RegularTesting/Tests/Powstream.pm line 559
        perfSONAR_PS::RegularTesting::Tests::Powstream::__ANON__(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x7d82d40), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x7d5a348), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x7d4e020), "archive_map", HASH(0x6ef1890), ...) called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::Tests::Powstream::to_pscheduler(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x7d82d40), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x7d5a348), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x7d4e020), "archive_map", HASH(0x6ef1890), ...) called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 411
        eval {...} called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 409
2017/11/17 11:14:38 (2040) INFO> perfsonar_meshconfig_agent:438 main:: - Added 3 new tasks, and deleted 0 old tasks
2017/11/17 11:53:57 (11770) ERROR> perfsonar_meshconfig_agent:420 main:: - Error building pScheduler task: Odd number of parameters in call to perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__ when named parameters were expected
 at /usr/share/perl5/perfSONAR_PS/RegularTesting/MeasurementArchives/EsmondBase.pm line 387.
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x7529690), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::to_pscheduler(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x7529690), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/share/perl5/perfSONAR_PS/RegularTesting/Tests/Powstream.pm line 559
        perfSONAR_PS::RegularTesting::Tests::Powstream::__ANON__(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x752f888), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x72ae7e8), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x752fc18), "archive_map", HASH(0x6752d00), ...) called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::Tests::Powstream::to_pscheduler(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x752f888), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x72ae7e8), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x752fc18), "archive_map", HASH(0x6752d00), ...) called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 411
        eval {...} called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 409
2017/11/17 11:54:10 (11770) INFO> perfsonar_meshconfig_agent:438 main:: - Added 3 new tasks, and deleted 0 old tasks
2017/11/17 11:59:15 (13100) ERROR> perfsonar_meshconfig_agent:420 main:: - Error building pScheduler task: Odd number of parameters in call to perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__ when named parameters were expected
 at /usr/share/perl5/perfSONAR_PS/RegularTesting/MeasurementArchives/EsmondBase.pm line 387.
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x82e4c28), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::to_pscheduler(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x82e4c28), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/share/perl5/perfSONAR_PS/RegularTesting/Tests/Powstream.pm line 559
        perfSONAR_PS::RegularTesting::Tests::Powstream::__ANON__(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x82ec918), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x8045340), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x82e4580), "archive_map", HASH(0x750ffa0), ...) called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38
        perfSONAR_PS::RegularTesting::Tests::Powstream::to_pscheduler(perfSONAR_PS::RegularTesting::Tests::Powstream=HASH(0x82ec918), "url", "https://127.0.0.1/pscheduler", "test", perfSONAR_PS::RegularTesting::Test=HASH(0x8045340), "task_manager", perfSONAR_PS::Client::PScheduler::TaskManager=HASH(0x82e4580), "archive_map", HASH(0x750ffa0), ...) called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 411
        eval {...} called at /usr/lib/perfsonar/bin/perfsonar_meshconfig_agent line 409

As a result, all throughput, RTT, and traceroute measurements are collected, but the powstream measurements appear never to be initiated. To test this further, I removed the configuration for esmond/latency. The error events stopped... along with the measurement results for both the RTT and OWAMP tests.

It appears to me that the latency parser is still looking for a username in the esmond/latency block.

igarny commented 6 years ago

Unfortunately, adding a username field did not help. Here is the full content of the -tasks.conf file (plus the username field). It generates the same error.

<default_parameters>
    type   powstream
    receive_port_range   8760-9960
</default_parameters>
<test>
    description   IPv4 latency
    <parameters>
        force_ipv4   1
        type   powstream
        resolution   3
        inter_packet_time   0.01
    </parameters>
    local_address   mp01.jisc.edu.pert
    <schedule>
        type   streaming
    </schedule>
    added_by_mesh   1
    <created_by>
        uri   http://central.jisc.edu.pert/jisc-lat.json
        name   OWD RTT Training
        agent_type   remote-mesh
    </created_by>
    target   mp02.jisc.edu.pert
    target   mp04.jisc.edu.pert
</test>
<test>
    description   IPv4 rtt
    local_address   mp01.jisc.edu.pert
    <parameters>
        packet_count   6
        force_ipv4   1
        type   bwping
        packet_length   100
    </parameters>
    <schedule>
        interval   300
        type   regular_intervals
    </schedule>
    target   mp02.jisc.edu.pert
    target   mp04.jisc.edu.pert
    added_by_mesh   1
    <created_by>
        agent_type   remote-mesh
        name   OWD RTT Training
        uri   http://central.jisc.edu.pert/jisc-lat.json
    </created_by>
</test>
<test>
    added_by_mesh   1
    <created_by>
        agent_type   remote-mesh
        uri   http://central.jisc.edu.pert/jisc-tput.json
        name   Throughput Traceroute Training
    </created_by>
    target   mp02.jisc.edu.pert
    target   mp04.jisc.edu.pert
    <parameters>
        tool   iperf3
        type   bwctl
        force_ipv4   1
    </parameters>
    local_address   mp01.jisc.edu.pert
    <schedule>
        type   regular_intervals
        random_start_percentage   10
        interval   1800
    </schedule>
    description   IPv4 throughput
</test>
<test>
    <parameters>
        type   bwtraceroute
        force_ipv4   1
    </parameters>
    local_address   mp01.jisc.edu.pert
    <schedule>
        type   regular_intervals
        interval   600
    </schedule>
    description   IPv4 traceroute
    <created_by>
        agent_type   remote-mesh
        uri   http://central.jisc.edu.pert/jisc-tput.json
        name   Throughput Traceroute Training
    </created_by>
    added_by_mesh   1
    target   mp02.jisc.edu.pert
    target   mp04.jisc.edu.pert
</test>
<measurement_archive>
    username   perfsonar
    database   https://localhost/esmond/perfsonar/archive/
    password   -------------------------------------------------------
    type   esmond/latency
</measurement_archive>
<measurement_archive>
    database   https://localhost/esmond/perfsonar/archive/
    type   esmond/throughput
    password   -------------------------------------------------------
</measurement_archive>
<measurement_archive>
    type   esmond/traceroute
    password   -------------------------------------------------------
    database   https://localhost/esmond/perfsonar/archive/
</measurement_archive>

vvidic commented 6 years ago

On Fri, Nov 17, 2017 at 01:38:22PM +0000, Ivan Garnizov wrote:

    perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondBase::__ANON__(perfSONAR_PS::RegularTesting::MeasurementArchives::EsmondLatency=HASH(0x7d5a738), "local_address", "mp01.jisc.edu.pert", "default_retry_policy") called at /usr/lib/x86_64-linux-gnu/perl5/5.22/Moose/Meta/Method/Overridden.pm line 38

Do you have this part somewhere in the config?

local_address mp01.jisc.edu.pert default_retry_policy

-- Valentin Vidic Computer Systems Engineer - Expert Department of Computer Infrastructure and Services Croatian Academic and Research Network - CARNet Josipa Marohnica 5, HR-10000 Zagreb, Croatia tel: +385 1 6661 714, fax. +385 1 6661 635 gsm: +385 91 2480 919 www.CARNet.hr

igarny commented 6 years ago

Hmm, I have no idea where that could be. My task was to configure the mesh. I will try to dig something out.

igarny commented 6 years ago

/etc/perfsonar/meshconfig-agent.conf is clean: it contains only the mesh assignments, and everything else is commented out.

arlake228 commented 6 years ago

If you are going to decrease the packet_interval in your central meshconfig file to 0.01 (100 packets per second), you also need to change the sample_count in your central mesh file to 6000. The default packet_interval is .1 (10 packets per second) and the default sample_count is 600. If it had actually let through what you had, you'd be getting an OWAMP result every 6 seconds instead of every 60. I'd either set the packet_interval back to .1 or bump the sample_count to 6000.
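The arithmetic behind those numbers can be sketched in a few lines of Python. This is only an illustration, assuming that one OWAMP result covers sample_count packets sent packet_interval seconds apart; the function names are made up for this example and are not part of perfSONAR.

```python
def result_interval_seconds(sample_count: int, packet_interval: float) -> float:
    """Seconds of traffic summarized by one result (hypothetical helper)."""
    return sample_count * packet_interval


def sample_count_for(target_seconds: float, packet_interval: float) -> int:
    """sample_count needed to get one result every target_seconds (hypothetical helper)."""
    return round(target_seconds / packet_interval)


# Values taken from the comment above.
default_interval = result_interval_seconds(600, 0.1)     # defaults
configured_interval = result_interval_seconds(600, 0.01)  # 100 packets/s, default sample_count
adjusted_count = sample_count_for(60, 0.01)               # keep one result per minute

print(f"default: one result every {default_interval:.0f} s")
print(f"as configured: one result every {configured_interval:.0f} s")
print(f"sample_count for 60 s at 0.01 s spacing: {adjusted_count}")
```

Either change suggested above brings the product of the two values back to 60 seconds.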

That will fix your immediate problem, but obviously the mesh-config is also not handling this case very well. It probably should have let you shoot yourself in the foot with the config you had. The only reason it failed is that the function it calls returns an 'undefined' retry policy for tests that produce results at intervals shorter than 1 minute. What it should do is simply not set a retry policy, but instead it's producing the error you see. I'm hesitant to spend any time fixing this, though, given the proximity to the 4.0.2 release and the fact that this entire segment of code is getting scrapped in 4.1.
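For readers puzzled by the "Odd number of parameters" message, here is a rough Python analogy of that failure mode. It is not the actual Perl code from EsmondBase.pm, which is not shown in this thread. It assumes the retry policy comes back as an empty list, which would match the trace, where "default_retry_policy" is the last argument and has no value after it. All function names are hypothetical.

```python
def build_args(local_address, retry_policy):
    """Flatten named parameters into one key/value list, as a Perl call would."""
    args = ["local_address", local_address, "default_retry_policy"]
    # An empty policy contributes nothing, so the key is left without a value.
    args.extend(retry_policy)
    return args


def build_args_safely(local_address, retry_policy):
    """Only pass the retry policy key when there is a policy to pass."""
    args = ["local_address", local_address]
    if retry_policy:
        args += ["default_retry_policy", *retry_policy]
    return args


def parse_named(args):
    """Turn a flat key/value list into a dict, rejecting unpaired keys."""
    if len(args) % 2:
        raise ValueError("Odd number of parameters when named parameters were expected")
    return dict(zip(args[0::2], args[1::2]))


try:
    parse_named(build_args("mp01.jisc.edu.pert", []))
except ValueError as err:
    print(f"fails: {err}")

print(parse_named(build_args_safely("mp01.jisc.edu.pert", [])))
```

The second builder corresponds to the behavior described above: when no retry policy applies, the parameter is left out rather than passed without a value.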

igarny commented 6 years ago

Oh, thanks for the hint. I really should not be sending 100 packets per second. It would probably be good to have this or another type of alarm for such cases.

Thanks! Should we close the issue then?