Closed ridv closed 3 years ago
I will take this and add the feature.
@lenhattan86 See if you can find a way to fetch the leader without specifying zookeeper nodes. And if you can then make this change for aurora leader too.
Specifying zookeeper nodes should also be supported, but make it smart enough to find them if nothing is specified.
I think fetching mesos without zk needs some changes in aurora-scheduler.
Think more on this; I can see at least two solutions without modifying aurora/mesos.
Only thing I can think of is making it fall back on localhost if no argument is provided.
If you can find out the hostname/IP of any master node, finding who is leader is easy. How to find the first part? Consul is one option. The second option, if you don't want to deal with Consul, is to contact the local Mesos agent and get the master directly.
hmm yeah but you can only assume that on a local box. I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated. I'd be ok with someone else putting in the legwork for that but something simple will also work for right now
I will go with zk. Other options will be future work.
I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated.
/state endpoint on agent will give you master info, it is not complicated.
hmm yeah but you can only assume that on a local box.
Yes that's why it's default option.
I will go with zk. Other options will be future work.
I would rather find the master IP by logging into zookeeper directly than go looking for ZK IPs first; that's not fast enough and hence not useful for me.
I can take this up for current work.
@lenhattan86 You work on zk part, I will add default option.
I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated.
/state endpoint on agent will give you master info, it is not complicated.
What if you're running it from your own laptop? Then the error becomes "/state endpoint not found" by default, leaving the user to wonder if they did something wrong.
If I recall correctly, that's why I didn't implement the Aurora version that way 😄. I opted for a clearer "hey you're missing this" error than an opaque "/state is not found".
Middle ground would be "/state not found. Please provide a mesos agent to use", but even then that's wonky because we shouldn't really be running australis on an agent node.
I could be missing something here though so I'll wait til you have your implementation.
hmm yeah but you can only assume that on a local box.
Yes that's why it's default option.
Hmm, I already thought about that. If we look at it from the Australis perspective, it will just query localhost:5051/state by default. Now if something is wrong (nothing running on 5051, state not available, etc.) and we have retried a couple of times for a retriable error, we just say "Unable to fetch leader from local Mesos agent, please try again with -zk option". We keep it simple. How does this sound?
I didn't think about asking user to provide an agent URL with an option, that can be done too. So "Unable to fetch leader from local Mesos agent, please try again with -zk or -agent option".
I am okay with both.
Either way works, give it a shot and let's see what we come up with. Maybe we'll apply the same logic to the aurora leader find.
@zorro786 I created the PR for this issue here https://github.com/aurora-scheduler/australis/pull/19. Waiting for @ridv to merge it. Feel free to work on the option "not using ZK", then we can close this issue.
Planning on it sometime this week.
Closed via #19 and #20, thanks @lenhattan86 and @zorro786
It'd be great to add the command australis fetch mesos leader, which would get the Mesos leader from ZK.