basho-labs / riak-mesos-tools

CLI and other tools for interacting with the Riak Mesos Framework.
Apache License 2.0

Unable to install multiple simultaneous frameworks #28

Closed seanjensengrey closed 8 years ago

seanjensengrey commented 8 years ago

Attempting to install multiple named frameworks at the same time results in a stack trace: dcos.errors.DCOSHTTPException: Error while fetching [http://marathon.mesos:8080/v2/apps]: HTTP 409: Conflict

riak-mesos --config ~/.config/riak-mesos/kv.json framework --framework kayvee install
(x.env) ip-192-168-0-7:test-mesos-0 basho$ riak-mesos --config ~/.config/riak-mesos/kv.json framework --framework kayvee install
Traceback (most recent call last):
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 120, in _http_req
    return fn(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/http.py", line 281, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/http.py", line 235, in request
    raise DCOSHTTPException(response)
dcos.errors.DCOSHTTPException: Error while fetching [http://marathon.mesos:8080/v2/apps]: HTTP 409: Conflict

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/basho/x.env/bin/riak-mesos", line 9, in <module>
    load_entry_point('riak-mesos==1.1.0', 'console_scripts', 'riak-mesos')()
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args[1:], **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/riak_mesos/commands/cmd_framework.py", line 57, in install
    client.add_app(framework_json)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 290, in add_app
    timeout=self._timeout)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 122, in _http_req
    raise _to_exception(e.response)
dcos.errors.DCOSException: App or group is locked by one or more deployments. Override with --force.

Now with --debug

Insecure SSL Mode: False
Verbose Mode: True
Debug Mode: True
JSON Mode: False
Using config file: /Users/basho/.config/riak-mesos/kv.json
Defaulting to configuration based URLs
INFO:dcos.http:Sending HTTP ['get'] to ['http://marathon.mesos:8080/ping']: {'Accept': 'application/json'}
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): marathon.mesos
DEBUG:requests.packages.urllib3.connectionpool:"GET /ping HTTP/1.1" 200 5
INFO:dcos.http:Received HTTP response [200]: {'Date': 'Tue, 26 Jul 2016 21:14:53 GMT', 'Pragma': 'no-cache', 'Expires': '0', 'Access-Control-Allow-Credentials': 'true', 'Server': 'Jetty(9.3.z-SNAPSHOT)', 'X-Marathon-Leader': 'http://10.1.14.98:8080', 'Content-Type': 'text/plain;charset=iso-8859-1', 'Cache-Control': 'must-revalidate,no-cache,no-store', 'Content-Length': '5'}
DEBUG:dcos.util:duration: dcos.http._request: 0.10s
HTTP URL: http://marathon.mesos:8080/ping
HTTP Method: GET
HTTP Body: None
HTTP Status: 200
HTTP Response Text: pong

Setting marathon URL to http://marathon.mesos:8080/
INFO:dcos.http:Sending HTTP ['post'] to ['http://marathon.mesos:8080/v2/apps']: {'Accept': 'application/json'}
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): marathon.mesos
DEBUG:requests.packages.urllib3.connectionpool:"POST /v2/apps HTTP/1.1" 409 None
INFO:dcos.http:Received HTTP response [409]: {'Date': 'Tue, 26 Jul 2016 21:14:53 GMT', 'Pragma': 'no-cache', 'Expires': '0', 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked', 'X-Marathon-Leader': 'http://10.1.14.98:8080', 'Server': 'Jetty(9.3.z-SNAPSHOT)', 'Cache-Control': 'no-cache, no-store, must-revalidate'}
DEBUG:dcos.util:duration: dcos.http._request: 0.07s
Traceback (most recent call last):
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 120, in _http_req
    return fn(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/http.py", line 281, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/http.py", line 235, in request
    raise DCOSHTTPException(response)
dcos.errors.DCOSHTTPException: Error while fetching [http://marathon.mesos:8080/v2/apps]: HTTP 409: Conflict

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/basho/x.env/bin/riak-mesos", line 9, in <module>
    load_entry_point('riak-mesos==1.1.0', 'console_scripts', 'riak-mesos')()
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args[1:], **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Users/basho/x.env/lib/python3.5/site-packages/riak_mesos/commands/cmd_framework.py", line 57, in install
    client.add_app(framework_json)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 290, in add_app
    timeout=self._timeout)
  File "/Users/basho/x.env/lib/python3.5/site-packages/dcos/marathon.py", line 122, in _http_req
    raise _to_exception(e.response)
dcos.errors.DCOSException: App or group is locked by one or more deployments. Override with --force.
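
The final error says the app is locked by a running deployment. A minimal sketch of how one might check for that before retrying, assuming the list returned by Marathon's GET /v2/deployments has already been fetched and that each entry carries an "affectedApps" list. The helper name and the sample data are hypothetical and not part of riak-mesos-tools:

```python
def blocking_deployments(deployments, app_id):
    """Return the ids of deployments that list app_id among their affected apps.

    `deployments` is assumed to be the decoded JSON list from
    GET /v2/deployments; `app_id` may be given with or without a leading slash.
    """
    wanted = "/" + app_id.strip("/")
    return [d.get("id") for d in deployments
            if wanted in d.get("affectedApps", [])]


# Hypothetical sample data, shaped like a Marathon deployments response.
sample = [
    {"id": "deploy-1", "affectedApps": ["/riak"]},
    {"id": "deploy-2", "affectedApps": ["/other"]},
]

print(blocking_deployments(sample, "riak"))
print(blocking_deployments(sample, "kayvee"))
```

If the first call returns a non-empty list for the app id being installed, the install would likely hit the same 409 until that deployment finishes.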

seanjensengrey commented 8 years ago

After patching the tools to pass the framework name through, it looks like the scheduler defaults to a framework_name of 'riak'

2016-07-26 21:27:37.540 [error] <0.133.0> Supervisor rms_sup had child rms_scheduler started with erl_mesos_scheduler:start_link(riak_mesos_scheduler, rms_scheduler, [{framework_user,"root"},{framework_name,"riak"},{framework_role,"riak"},{framework_hostname,"ip..."},...], [{master_hosts,["leader.mesos:5050"]},{resubscribe_interval,60000}]) at <0.149.0> exit with reason shutdown in context child_terminated

https://github.com/basho-labs/riak-mesos-scheduler/blob/cb9082762afbbffc5a2f748b995f9dfe23946f2a/src/rms_sup.erl#L57

21:27:46.918 [error] Supervisor rms_sup had child rms_scheduler started with erl_mesos_scheduler:start_link(riak_mesos_scheduler, rms_scheduler, [{framework_user,"root"},{framework_name,"riak"},{framework_role,"riak"},{framework_hostname,"ip..."},...], [{master_hosts,["leader.mesos:5050"]},{resubscribe_interval,60000}]) at <0.169.0> exit with reason shutdown in context child_terminated
21:27:46.918 [error] Supervisor rms_sup had child rms_scheduler started with erl_mesos_scheduler:start_link(riak_mesos_scheduler, rms_scheduler, [{framework_user,"root"},{framework_name,"riak"},{framework_role,"riak"},{framework_hostname,"ip..."},...], [{master_hosts,["leader.mesos:5050"]},{resubscribe_interval,60000}]) at <0.169.0> exit with reason reached_max_restart_intensity in context shutdown
21:27:46.919 [info] Application rms exited with reason: shutdown

https://github.com/basho-labs/riak-mesos-scheduler/search?utf8=%E2%9C%93&q=framework_name

seanjensengrey commented 8 years ago

One thing I noticed was that when I attempted to install the second framework, the two would flap alternately. I think it has to do with them being in the same distributed Erlang cluster.

seanjensengrey commented 8 years ago

@sanmiguel this is both a tools issue and a scheduler issue

seanjensengrey commented 8 years ago

TIL: the position of --framework on the command line dictates which data structure the value ends up in.

Top Level

riak-mesos --framework foonoot --config ~/.config/riak-mesos/kv.json framework install

A breakpoint in cmd_framework:install then shows

(Pdb) p ctx.framework
'foonoot'
(Pdb) p kwargs
{'config': None, 'info': False, 'debug': False, 'version': False, 'config_schema': False, 'json': False, 'cluster': None, 'node': None, 'verbose': False, 'framework': None, 'insecure_ssl': False, 'home': None}

Sub-command parameter

riak-mesos --config ~/.config/riak-mesos/kv.json framework install --framework footsdfasd

yields

> /Users/basho/x.env/lib/python3.5/site-packages/riak_mesos/commands/cmd_framework.py(58)install()
-> ctx.init_args(**kwargs)
(Pdb) p ctx.framework
'riak'
(Pdb) pp kwargs
{'cluster': None,
 'config': None,
 'config_schema': False,
 'debug': False,
 'framework': 'footsdfasd',
 'home': None,
 'info': False,
 'insecure_ssl': False,
 'json': False,
 'node': None,
 'verbose': False,
 'version': False}
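
The two pdb sessions above show the same flag landing in different places depending on where it appears. A stdlib-only sketch of that behaviour, using argparse rather than click (which the tool actually uses), so the parser layout, the dest names, and resolve_framework are all hypothetical. It only models the idea of one value per position and a single place that reconciles them:

```python
import argparse

DEFAULT_FRAMEWORK = "riak"  # the default seen in the second pdb session


def build_parser():
    parser = argparse.ArgumentParser(prog="riak-mesos-sketch")
    # Top-level option: given before the sub-command.
    parser.add_argument("--framework", dest="top_framework", default=None)
    groups = parser.add_subparsers(dest="group")
    framework = groups.add_parser("framework")
    actions = framework.add_subparsers(dest="action")
    install = actions.add_parser("install")
    # Sub-command option: given after "install", stored under a separate name.
    install.add_argument("--framework", dest="sub_framework", default=None)
    return parser


def resolve_framework(args):
    """Prefer the sub-command value, then the top-level value, then the default."""
    sub_value = getattr(args, "sub_framework", None)
    return sub_value or args.top_framework or DEFAULT_FRAMEWORK


parser = build_parser()
top = parser.parse_args(["--framework", "foonoot", "framework", "install"])
sub = parser.parse_args(["framework", "install", "--framework", "footsdfasd"])
print(top.top_framework, top.sub_framework, resolve_framework(top))
print(sub.top_framework, sub.sub_framework, resolve_framework(sub))
```

Reconciling both positions in one function means the install code reads a single value regardless of where the user typed the flag.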

sanmiguel commented 8 years ago

Changes made in basho-labs/riak-mesos-scheduler#89 render the hostname and role parameters optional. By default, they take the following value:

I've also merged ( #36 ) some changes that handle the framework name more consistently throughout the tools. You can now use a config file that specifies "framework-name": "riak-foo" and from then on administer that framework with riak-mesos --framework riak-foo ....