The Apache Hadoop charms distribute each other's hostnames and put them in /etc/hosts so hostnames are resolvable between all Hadoop nodes. The Bigtop charms don't do this, causing Hadoop to fail in some deployments.
Thank you for reporting this. By "some deployments" do you mean some clouds/environments, or random failures? I guess I'm asking for a way to reproduce the issue.
Some clouds/environments. Each Hadoop node should be able to resolve the hostnames of the other Hadoop nodes. If you deploy a cluster in containers, for example, you will run into problems (afaik, containers can't resolve each other's hostnames by default).
Specifically, in our case we use containers on the manual provider. To test this, either deploy the cluster in containers locally or in a manual environment (containers in a manual environment work out of the box as long as you keep all the containers on the bootstrap host).
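For reference, the failure mode is easy to check from inside a container; the hostname below is illustrative, not from an actual deployment:

```python
# Quick check for the resolution problem described above: from one
# container, try to resolve a sibling container's hostname.
import socket

try:
    print(socket.gethostbyname("hadoop-slave-0"))  # illustrative hostname
except socket.gaierror:
    print("hostname not resolvable -- Hadoop inter-node traffic will fail")
```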
The problem with the /etc/hosts approach is that it adds a lot of complexity to the charms, which then spreads to every charm that might need to connect to those charms; it also potentially floods the relation data and can easily cause a lot of spurious -relation-changed hooks to fire.
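For context, here's a minimal sketch of what the /etc/hosts approach entails. This assumes the charm has already collected hostname/IP pairs from relation data; the helper is hypothetical, not the actual Hadoop charm code:

```python
# Marker comments make the update idempotent, so repeated
# -relation-changed hooks rewrite only our block of /etc/hosts.
BEGIN = "# BEGIN juju-managed hadoop hosts\n"
END = "# END juju-managed hadoop hosts\n"

def update_etc_hosts(peers, path="/etc/hosts"):
    """peers: dict mapping hostname -> private IP, gathered from relation data."""
    with open(path) as f:
        lines = f.readlines()
    # Drop any previously managed block.
    if BEGIN in lines and END in lines:
        start, end = lines.index(BEGIN), lines.index(END)
        del lines[start:end + 1]
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    block = [BEGIN] + [f"{ip}\t{host}\n" for host, ip in sorted(peers.items())] + [END]
    with open(path, "w") as f:
        f.writelines(lines + block)
```

Every charm that needs to reach Hadoop by hostname would have to grow something like this, which is exactly the complexity spread described above.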
I think a much more maintainable approach is to use proper DNS. However, since this is a cross-cloud, cross-charm issue, I strongly feel that some form of DNS should be provided by Juju out of the box. I think there was actually a very simple proof-of-concept done for Juju to provide a simple DNS for units, and we're looking to get that cleaned up and into core.
In the meantime, I think it would be worth looking at @chuckbutler's dns and dns-helper charms. Ideally, we'd be able to tack those on to a deployment without changes to the charms and then just drop them out again when core provides basic DNS.
Yes! Proper DNS support sounds awesome! This would also fix the issues with the Hadoop UIs: because they use the hostname to link to each other, they don't work properly without editing your laptop's /etc/hosts file. I'll look into the charms you linked.
I would recommend looking at Consul (or similar, with auto DNS) rather than lazypower's DNS charms.
@andrewdmcleod What do you mean by "auto DNS"? The dns-helper charm does look like it automatically registers the principal's private-address, which I think is all we need (at least for internal resolution). But I haven't tested it at all.
Also, @galgalesh, I realized that for the issue you mentioned, we'd actually need the hostname to resolve differently depending on whether the requester is an external client or an internal unit (public vs. private address). I'm not sure those charms can handle that at all.
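To illustrate the distinction: Juju exposes both addresses to a unit, and a charm can publish both on the relation, but a single DNS name can only point at one of them, which is the split-horizon problem. A sketch, assuming the charmhelpers library commonly used in Juju charms (the handler itself is hypothetical):

```python
# Hypothetical hook snippet: publish both addresses so consumers can
# pick, since one DNS record can't serve both audiences.
from charmhelpers.core.hookenv import relation_set, unit_get

def publish_addresses():
    relation_set(
        private_address=unit_get("private-address"),  # for internal units
        public_address=unit_get("public-address"),    # for external clients
    )
```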
@johnsca After some investigation and a conversation with lazypower, it made more sense to use Consul (which automatically registers its clients with the Consul DNS server), given some issues with the way his dns and dns-helper charms were implemented and the technical debt required to pull them into layers. I would be happy to be proved wrong!
@johnsca - the dns charm is limited to a single-node deployment as a BIND9 server (no HA) or a relay to Route 53 (provider-specific). The simpler solution for cluster-based DNS would be to use Consul, with consul-agents spread across your nodes to push DNS data back into Consul.
All the exciting promises of the old DNS architecture are, at this point, vapor, and the tech debt is much larger for something wholly home-grown versus consuming Consul to do what it already does extremely well.
You're right, @andrewdmcleod. My recommendation is to proceed with Consul, as that's an active target on our roadmap to keep up to date, since it's a core component of two of our container platforms.
@chuckbutler For this to address @galgalesh's concerns without changes to the Apache Bigtop charms, we would need something like dns-helper, which doesn't currently seem to support connecting to https://jujucharms.com/consul/
How much work do you think it would be to add Consul support to the dns-helper charm?
I'm not certain. It's a curl-able REST key/value-based approach, so I wouldn't think it'd take more than an afternoon hack session, plus tweaks as we find the bugs thereafter. And that's a way to keep that subordinate relevant... I like this idea 👍
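For the record, a sketch of what that hack might look like against Consul's agent HTTP API. The registration endpoint is Consul's standard service-registration API; the service name, address, and port here are made up:

```python
# Register a service with the local Consul agent so it becomes
# resolvable as <name>.service.consul via Consul's DNS interface
# (port 8600 by default; HTTP API on 8500).
import json
import urllib.request

def register_service(name, address, port):
    req = urllib.request.Request(
        "http://127.0.0.1:8500/v1/agent/service/register",
        data=json.dumps({"Name": name, "Address": address, "Port": port}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

# e.g. register_service("namenode", "10.0.0.5", 8020)
# then: dig @127.0.0.1 -p 8600 namenode.service.consul
```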
We have some work on this already using consul-template, which deploys the agent I spoke of earlier. @mbruzek, do you know if we kept that work up to date? I imagine not, since we haven't spoken about it in a while. It may be worth revisiting that little nugget of a charm and making it a layer/interface.
@galgalesh It seems, from further discussion with @chuckbutler, that either suggestion is going to have a race condition that will require changes to the charms to resolve. I'm going to investigate the status of the prototype and see if there's any possibility of that being available in an upcoming beta.
Aight! Thanks for clearing this up! Keep me posted.
I'm closing this issue. Making sure hostnames are known should be the responsibility of the provider. Everything is working now that we've switched to MAAS.