perfsonar / mesh-config

Centralized configuration framework for measurement points and GUIs
Apache License 2.0

Update changes to MA auth token faster #37

Closed arlake228 closed 7 years ago

arlake228 commented 7 years ago

Right now the MA will not recreate a task if the auth token changes until the task expires (by default after 24 hours). This is because when it compares the task it gets back from pscheduler to the task it needs to create, the auth-token is ignored, since pscheduler (correctly) does not share the token. The mesh-config-agent should be able to detect when the local value changes, though, so we should do that and recreate the task when it does.
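One way the local check described above could work is to remember a fingerprint of the token that was in effect when each task was created, and compare it with the current local value on every run. The following is only a minimal sketch under that assumption: `token_fingerprint`, `record_task`, `needs_recreate`, and the in-memory `fingerprints` mapping are hypothetical names for illustration and are not taken from mesh-config (which is not written in Python).

```python
import hashlib


def token_fingerprint(token):
    """Return a SHA-256 hex digest of the token, or None when no token is set."""
    if token is None:
        return None
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def record_task(task_url, token, fingerprints):
    """Remember which token (by fingerprint) was used when the task was created.

    `fingerprints` is a hypothetical local mapping of task URL -> fingerprint.
    """
    fingerprints[task_url] = token_fingerprint(token)


def needs_recreate(task_url, current_token, fingerprints):
    """True when the task is unknown locally or the local token has changed."""
    if task_url not in fingerprints:
        return True
    return fingerprints[task_url] != token_fingerprint(current_token)
```

Storing only a digest means the agent never needs pscheduler to hand the token back, and the local record does not contain the token itself.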

This replaces perfsonar/pscheduler#223 since it is a mesh-config issue.

igarny commented 7 years ago

It appears, though, that the 24h limit is disregarded or not taken into consideration.

Nov 16 12:05:19 psps-test2-mgmt archiver-esmond/archive DEBUG Archiver received: {u'last-attempt': u'2016-11-15T18:19:09+01:00', u'attempts': 13, u'data': {u'url': u'http://localhost/esmond/perfsonar/archive/', u'retry-policy': [{u'attempts': 5, u'wait': u'PT60S'}, {u'attempts': 3, u'wait': u'PT5M'}, {u'attempts': 24, u'wait': u'PT60M'}], u'_auth-token': u'045a23926e3f8cdfd7589fdbb52928f04768e76a', u'measurement-agent': u'ps-test.ctc.grnoc.iu.edu'}, u'result': {u'schedule': {u'duration': u'PT1M41S', u'start': u'2016-11-15T15:58:52+01:00'}, u'tool': {u'version': u'1.0', u'name': u'tracepath'}, u'participants': [u'psps-test2-mgmt.rrze.uni-erlangen.de'], u'result': {u'paths': [[{u'ip': u'192.168.71.253', u'hostname': None, u'rtt': u'PT0.001364S', u'mtu': 1500}, {u'ip': u'131.188.20.201', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'constellation.gate.uni-erlangen.de.', u'rtt': u'PT0.001620S', u'mtu': 1500}, {u'ip': u'131.188.20.252', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'yamato.gate.uni-erlangen.de.', u'rtt': u'PT0.001626S', u'mtu': 1500}, {u'ip': u'192.44.85.8', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'nat1.rrze.uni-erlangen.de.', u'rtt': u'PT0.000904S', u'mtu': 1500}, {u'ip': u'192.44.85.12', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'yamato.gate.uni-erlangen.de.', u'rtt': u'PT0.000998S', u'mtu': 1500}, {u'ip': u'188.1.234.229', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'cr-erl2-te0-0-0-7-4.x-win.dfn.de.', u'rtt': u'PT0.001115S', u'mtu': 1500}, {u'ip': u'188.1.144.222', u'as': {u'owner': u'DFN Verein zur Foerderung eines Deutschen Forschungsnetzes e.V., DE', u'number': 680}, u'hostname': u'cr-fra2-be11.x-win.dfn.de.', u'rtt': u'PT0.004693S', u'm

[root@psps-test2-mgmt ~]# grep 045a23926e3f8cdfd7589fdbb52928f0 /etc/perfsonar/meshconfig-agent-tasks.conf

I can assure you I dealt with that problem (bad token) about a week ago. By mistake I had overwritten the agent-tasks file with the old version, not preserving the code. If you go to https://psps-test2-mgmt.rrze.uni-erlangen.de/toolkit you will see that result collection has been successful for quite some time already.

igarny commented 7 years ago

It is not clear why the auth token would not be taken for the tasks. As seen in the logs:

Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG No metadata key found for https://psps-test2-mgmt.rrze.uni-erlangen.de/pscheduler/tasks/3dba6b2c-4a30-4a41-bc35-34acb9377082
Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG No metadata key found for https://psps-test2-mgmt.rrze.uni-erlangen.de/pscheduler/tasks/3dba6b2c-4a30-4a41-bc35-34acb9377082
Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG fast_mode is False
Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG fast_mode is False
Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG No metadata key, so posting to esmond
Jan 10 09:46:22 psps-test2-mgmt archiver-esmond/archive DEBUG No metadata key, so posting to esmond

My configuration has not changed:

database http://localhost/esmond/perfsonar/archive/
password 0755-----------------------e660
summary_window 300
event_type packet-loss-rate
summary_type aggregation

The archiver task details are:

curl -k https://psps-test2-mgmt.rrze.uni-erlangen.de/pscheduler/tasks/8ca41f15-e2db-45a8-94dc-50e2de81f5fa

{"reference": {"created-by": {"user-agent": "perfsonar-meshconfig", "uuid": "5B553E1C-B647-11E6-88DE-DD12F75D0CC5", "address": "perfsonar-lab-vm.ilab.umnet.umich.edu"}, "description": "Traceroute Between Testbeds"}, "schedule": {"repeat": "PT600S", "until": "2017-01-10T15:46:26Z", "slip": "PT600S"}, "tool": "tracepath", "archives": [{"archiver": "esmond", "data": {"url": "http://localhost/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 5, "wait": "PT60S"}, {"attempts": 3, "wait": "PT5M"}, {"attempts": 24, "wait": "PT60M"}], "_auth-token": null, "measurement-agent": "perfsonar-lab-vm.ilab.umnet.umich.edu"}}, {"archiver": "esmond", "data": {"url": "http://ps-test.ctc.grnoc.iu.edu/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 5, "wait": "PT60S"}, {"attempts": 3, "wait": "PT5M"}, {"attempts": 24, "wait": "PT60M"}], "measurement-agent": "perfsonar-lab-vm.ilab.umnet.umich.edu"}}], "test": {"type": "trace", "spec": {"dest": "perfsonar-lab-vm.ilab.umnet.umich.edu", "source": "ps-owdtst.rrze.uni-erlangen.de", "ip-version": 4, "schema": 1}}, "tools": ["bwctltracepath", "tracepath", "bwctltraceroute", "traceroute"], "schema": 1}

arlake228 commented 7 years ago

You are misreading the message. The metadata key is not the same as the authentication key. The metadata key is used to identify the metadata object in esmond; if the archiver doesn't have one cached, it will get it from esmond.

igarny commented 7 years ago

OK, I agree on this one. Still, I found another entry (check the date):

https://psps-test2-mgmt.rrze.uni-erlangen.de/pscheduler/tasks/8ca41f15-e2db-45a8-94dc-50e2de81f5fa

{"reference": {"created-by": {"user-agent": "perfsonar-meshconfig", "uuid": "5B553E1C-B647-11E6-88DE-DD12F75D0CC5", "address": "perfsonar-lab-vm.ilab.umnet.umich.edu"}, "description": "Traceroute Between Testbeds"}, "schedule": {"repeat": "PT600S", "until": "2017-01-10T15:46:26Z", "slip": "PT600S"}, "tool": "tracepath", "archives": [{"archiver": "esmond", "data": {"url": "http://localhost/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 5, "wait": "PT60S"}, {"attempts": 3, "wait": "PT5M"}, {"attempts": 24, "wait": "PT60M"}],

"_auth-token": null,

"measurement-agent": "perfsonar-lab-vm.ilab.umnet.umich.edu"}}, {"archiver": "esmond", "data": {"url": "http://ps-test.ctc.grnoc.iu.edu/esmond/perfsonar/archive/", "retry-policy": [{"attempts": 5, "wait": "PT60S"}, {"attempts": 3, "wait": "PT5M"}, {"attempts": 24, "wait": "PT60M"}], "measurement-agent": "perfsonar-lab-vm.ilab.umnet.umich.edu"}}], "test": {"type": "trace", "spec": {"dest": "perfsonar-lab-vm.ilab.umnet.umich.edu", "source": "ps-owdtst.rrze.uni-erlangen.de", "ip-version": 4, "schema": 1}}, "tools": ["bwctltracepath", "tracepath", "bwctltraceroute", "traceroute"], "schema": 1}

arlake228 commented 7 years ago

We intentionally don't display the auth-token in the JSON, so that is what it should look like.
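To illustrate why the task JSON shows `"_auth-token": null`, here is a minimal sketch of redacting private fields before a task is returned to a client. It assumes that a leading underscore in the key name marks a value as private, which is only inferred from the `_auth-token` key above; `redact_private` is a hypothetical helper, not pscheduler's actual code.

```python
def redact_private(value):
    """Return a copy of `value` with every underscore-prefixed key set to None.

    Assumption: a key whose name starts with "_" holds a secret that must
    not be echoed back to clients. Nested dicts and lists are handled
    recursively, and the input is left unmodified.
    """
    if isinstance(value, dict):
        return {
            key: None if key.startswith("_") else redact_private(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_private(item) for item in value]
    return value
```

With this approach the key stays visible, so a reader can tell that a token field exists, but its value cannot be read back or compared, which is why the change has to be detected on the mesh-config side.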

igarny commented 7 years ago

Hmmm, this one has another problem as well.

How come a remote measurement agent is trying to post data to my local host?

arlake228 commented 7 years ago

Ah, that is a known issue I am trying to work on today; I just ran into it as well. Let's say you have two hosts, A and B. If host A asks for a throughput, ping, or traceroute test from B to A, it has to send a request to host B, since B will be the lead. Host B will also be responsible for archiving the result since it is the lead (and in the case of ping and traceroute it will be the only participant). The problem arises when host A's meshconfig-agent-tasks.conf file has "localhost" in it as the MA. Host A asks host B to write to localhost, which is incorrect, and that is what you are seeing in this case. I am working on fixing this so that host A plugs in a public address when it sees something like localhost. My plan is to make my best guess at the public address when nothing is specified, but also to add a public_url option to the measurement_archive block for multi-interface cases, since it's almost certain the best guess will be wrong in a lot of instances and admins will need a way to set it correctly.
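The planned substitution could look roughly like the sketch below. It is an illustration of the idea only, not the actual fix: `archive_url_for_remote`, `LOCAL_NAMES`, and the `public_host` argument (standing in for the "best guess" at the public address) are hypothetical, and the `public_url` parameter mirrors the proposed option, whose final name and semantics may differ.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names treated as "something like localhost" (an assumed list).
LOCAL_NAMES = {"localhost", "127.0.0.1", "::1"}


def archive_url_for_remote(url, public_host, public_url=None):
    """Return the archive URL that a remote lead participant should post to.

    An admin-supplied `public_url` always wins. Otherwise a local host name
    in `url` is replaced with `public_host`, keeping scheme, port, and path.
    Any user:password part of the URL is dropped in this sketch.
    """
    if public_url:
        return public_url
    parts = urlsplit(url)
    if parts.hostname not in LOCAL_NAMES:
        return url  # already reachable from the remote host
    netloc = public_host if parts.port is None else f"{public_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, with `hosta.example.org` as a made-up public name for host A, `archive_url_for_remote("http://localhost/esmond/perfsonar/archive/", "hosta.example.org")` yields `http://hosta.example.org/esmond/perfsonar/archive/`, so host B would post results back to host A instead of to itself.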

arlake228 commented 7 years ago

And since I didn't say this explicitly: the fix will be going into mesh-config.