aurora-scheduler / australis

Aurora Scheduler client written in Go
Apache License 2.0

fetch mesos leader #18

Closed (ridv closed 3 years ago)

ridv commented 4 years ago

It'd be great to add the command australis fetch mesos leader, which would get the Mesos leader from ZK.
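For illustration, here is a minimal sketch of the leader-selection step such a command would need once it has listed the children of the Mesos ZK path. It assumes the common Mesos convention that each master registers an ephemeral sequential znode named json.info_<sequence> holding a JSON document with an address object, and that the lowest sequence number is the leader. The names leaderZnode, leaderAddress and masterInfo are hypothetical and not part of australis; the ZK connection itself is left out so the example stays standard-library only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// leaderPrefix is the znode name prefix Mesos masters are assumed to use.
const leaderPrefix = "json.info_"

// masterInfo mirrors only the fields of the znode payload this sketch needs (assumed layout).
type masterInfo struct {
	Address struct {
		Hostname string `json:"hostname"`
		IP       string `json:"ip"`
		Port     int    `json:"port"`
	} `json:"address"`
}

// leaderZnode picks the candidate znode with the lowest sequence suffix.
// ZooKeeper zero-pads sequence numbers, so a plain string sort orders them correctly.
func leaderZnode(children []string) (string, error) {
	var candidates []string
	for _, c := range children {
		if strings.HasPrefix(c, leaderPrefix) {
			candidates = append(candidates, c)
		}
	}
	if len(candidates) == 0 {
		return "", errors.New("no Mesos master znodes found")
	}
	sort.Strings(candidates)
	return candidates[0], nil
}

// leaderAddress decodes the znode payload into a host:port string.
func leaderAddress(data []byte) (string, error) {
	var info masterInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return "", fmt.Errorf("decoding master info: %w", err)
	}
	if info.Address.Hostname == "" {
		return "", errors.New("master info has no hostname")
	}
	return fmt.Sprintf("%s:%d", info.Address.Hostname, info.Address.Port), nil
}

func main() {
	// Children and payload are made-up sample data standing in for a real ZK read.
	children := []string{"json.info_0000000012", "log_replicas", "json.info_0000000007"}
	node, err := leaderZnode(children)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	payload := []byte(`{"address":{"hostname":"mesos-1.example.com","ip":"10.0.0.5","port":5050}}`)
	addr, err := leaderAddress(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(node, addr)
}
```

In a real command the children list and payload would come from a ZK client's list and get calls against the configured Mesos path.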

lenhattan86 commented 4 years ago

I will add this feature.

zorro786 commented 4 years ago

@lenhattan86 See if you can find a way to fetch the leader without specifying ZooKeeper nodes. If you can, make the same change for the Aurora leader too.

zorro786 commented 4 years ago

Specifying ZooKeeper nodes should also be supported, but make it smart enough to find them if nothing is specified.

lenhattan86 commented 4 years ago

I think fetching the Mesos leader without ZK needs some changes in aurora-scheduler.

zorro786 commented 4 years ago

Think more on this; I can see at least two solutions without modifying aurora/mesos.

ridv commented 4 years ago

Only thing I can think of is making it fall back on localhost if no argument is provided.

zorro786 commented 3 years ago

If you can find out the hostname/IP of any master node, finding out who the leader is is easy. How do you find the first part? Consul is one option. The second option, if you don't want to deal with Consul, is to contact the local Mesos agent to get the master directly.

ridv commented 3 years ago

hmm yeah but you can only assume that on a local box. I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated. I'd be ok with someone else putting in the legwork for that but something simple will also work for right now

lenhattan86 commented 3 years ago

I will go with ZK. Other options will be future work.

zorro786 commented 3 years ago

> I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated.

The /state endpoint on the agent will give you the master info; it is not complicated.

> hmm yeah but you can only assume that on a local box.

Yes, that's why it's the default option.
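As a rough sketch of the agent lookup described above: the Mesos agent's /state response is assumed here to carry a master_hostname field naming the master the agent is registered with. The function names masterFromState and masterFromAgent are hypothetical, and the fake server in main only stands in for a real agent on port 5051.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// agentState holds the single field this sketch reads from /state (assumed field name).
type agentState struct {
	MasterHostname string `json:"master_hostname"`
}

// masterFromState extracts the master hostname from a /state JSON body.
func masterFromState(data []byte) (string, error) {
	var st agentState
	if err := json.Unmarshal(data, &st); err != nil {
		return "", fmt.Errorf("decoding agent state: %w", err)
	}
	if st.MasterHostname == "" {
		return "", errors.New("agent state does not name a master")
	}
	return st.MasterHostname, nil
}

// masterFromAgent queries <agentURL>/state and returns the master hostname.
func masterFromAgent(agentURL string) (string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(agentURL + "/state")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("agent returned %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return masterFromState(body)
}

func main() {
	// A fake agent serving made-up data, so the example runs without a cluster.
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/state" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprint(w, `{"master_hostname":"mesos-1.example.com"}`)
	}))
	defer fake.Close()

	master, err := masterFromAgent(fake.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(master)
}
```

Against a real agent, the URL would be something like http://localhost:5051, which is the default discussed in this thread.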

zorro786 commented 3 years ago

> I will go with ZK. Other options will be future work.

I would rather find the master IP by logging into ZooKeeper directly than by first finding ZK IPs; it's not fast enough and hence not useful for me.

zorro786 commented 3 years ago

I can take this up for current work.

zorro786 commented 3 years ago

@lenhattan86 You work on the ZK part; I will add the default option.

ridv commented 3 years ago

> > I guess in theory you could go looking for the ZK config for the current mesos agent but that seems kinda complicated.

> /state endpoint on agent will give you master info, it is not complicated.

What if you're running it from your own laptop? Then the error becomes "/state endpoint not found" by default, leaving the user to wonder if they did something wrong.

If I recall correctly, that's why I didn't implement the Aurora version that way 😄. I opted for a clearer "hey, you're missing this" error over an opaque "/state is not found".

Middle ground would be "/state not found. Please provide a mesos agent to use", but even then that's wonky because we shouldn't really be running australis on an agent node.

I could be missing something here though so I'll wait til you have your implementation.

> > hmm yeah but you can only assume that on a local box.

> Yes that's why it's default option.

zorro786 commented 3 years ago

Hmm, I already thought about that. If we look at it from the Australis perspective, it will just query localhost:5051/state by default. Now if something is wrong (e.g. nothing is running on 5051, or state is not available), then after we retry a couple of times for a retriable error, we just say "Unable to fetch leader from local Mesos agent, please try again with -zk option". We keep it simple. How does this sound?

I didn't think about asking the user to provide an agent URL with an option; that can be done too. So "Unable to fetch leader from local Mesos agent, please try again with -zk or -agent option".

I am okay with both.
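A minimal sketch of the retry-then-hint behaviour proposed above, under a few assumptions: the fetch step is passed in as a function, retriable failures are marked with a hypothetical sentinel error errRetriable, and the -zk and -agent options named in the message are the ones proposed in this thread, not confirmed flags. None of these names come from australis itself.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRetriable is a hypothetical sentinel marking failures worth retrying.
var errRetriable = errors.New("retriable")

// leaderWithFallback calls fetch up to attempts times, retrying only on
// errRetriable, and wraps the final failure in the user-facing hint.
func leaderWithFallback(fetch func() (string, error), attempts int, backoff time.Duration) (string, error) {
	if attempts < 1 {
		attempts = 1
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		leader, err := fetch()
		if err == nil {
			return leader, nil
		}
		lastErr = err
		if !errors.Is(err, errRetriable) {
			break // not worth retrying, report right away
		}
		if i < attempts-1 {
			time.Sleep(backoff)
		}
	}
	// Message text follows the wording proposed in the discussion.
	return "", fmt.Errorf("Unable to fetch leader from local Mesos agent, please try again with -zk or -agent option: %w", lastErr)
}

func main() {
	calls := 0
	// A made-up fetcher that fails twice with a retriable error, then succeeds.
	fetch := func() (string, error) {
		calls++
		if calls < 3 {
			return "", fmt.Errorf("connection refused: %w", errRetriable)
		}
		return "mesos-1.example.com:5050", nil
	}
	leader, err := leaderWithFallback(fetch, 3, 10*time.Millisecond)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(leader)
}
```

Keeping the underlying error wrapped behind the hint means the user sees what to try next while the original cause stays available for debugging.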

ridv commented 3 years ago

Either way works. Give it a shot and let's see what we come up with. Maybe we'll apply the same logic to the Aurora leader lookup.

lenhattan86 commented 3 years ago

@zorro786 I created the PR for this issue here: https://github.com/aurora-scheduler/australis/pull/19. Waiting for @ridv to merge it. Feel free to work on the "not using ZK" option, then we can close this issue.

zorro786 commented 3 years ago

Planning on it sometime this week.

ridv commented 3 years ago

Closed via #19 and #20. Thanks @lenhattan86 and @zorro786!