BerlinVagrant / vagrant-dns

A plugin to manage DNS records for vagrant environments
MIT License
490 stars 50 forks source link

Add dhcp support #51

Closed fnordfish closed 1 year ago

fnordfish commented 7 years ago

open issue: The restart_host_dns action hook needs to get moved after provider specific actions (so that the host network had a chance to get configured). (Edit: we just use the "up" event and assume network is ready by then) see mitchellh/vagrant#8500

mattiasb commented 1 year ago

Would I be correct in saying that what we need is to use the new hooks provided in hashicorp/vagrant#10615 in lib/vagrant-dns.rb:L41-L49?

And probably what we want to do is to not hook in to machine_action_up anymore but instead to a new hook that we are certain will be triggered after network is configured?

I wonder what we should do with the machine_action_reload hook. If it works like I imagine it would it will be called when a machine is reconfigured but possibly before network is up again? I suppose one just needs to test.

Just checking what your thoughts here are. Sorry for pressing on about this, it's just that since I got bitten by vagrant-landrush/landrush#360 I need to find a new solution for or test environment at work.

mattiasb commented 1 year ago

Looking at the code some more and testing it out it seems that the machine_action_up hook is indeed called before network is up. Which is a bit weird.

If I add this to my Vagrantfile:

  config.trigger.after :up do |trigger|
    trigger.name = "Finished Message"
    trigger.info = "== TRIGGER== : Machine is up!"
  end

… I can get the trigger to run after everything else:

$ vagrant up mgmt.test
Bringing machine 'mgmt.test' up with 'libvirt' provider...
==> mgmt.test: Checking if box 'company/ubuntu-22.04.1-server' version '0.14.1' is up to date...
==> mgmt.test: Creating image (snapshot of base box volume).
==> mgmt.test: Creating domain with the following settings...
==> mgmt.test:  -- Name:              AnsiblePlaybooks_mgmt.test
==> mgmt.test:  -- Description:       Source: /home/mattiasb/Code/git.company.example.com/SRE/AnsiblePlaybooks/Vagrantfile
==> mgmt.test:  -- Domain type:       kvm
==> mgmt.test:  -- Cpus:              2
==> mgmt.test:  -- Feature:           acpi
==> mgmt.test:  -- Feature:           apic
==> mgmt.test:  -- Feature:           pae
==> mgmt.test:  -- Clock offset:      utc
==> mgmt.test:  -- Memory:            2048M
==> mgmt.test:  -- Management MAC:    
==> mgmt.test:  -- Loader:            /usr/share/OVMF/OVMF_CODE.fd
==> mgmt.test:  -- Nvram:             
==> mgmt.test:  -- Base box:          company/ubuntu-22.04.1-server
==> mgmt.test:  -- Storage pool:      default
==> mgmt.test:  -- Image():     /var/lib/libvirt/images/AnsiblePlaybooks_mgmt.test.img, 30G
==> mgmt.test:  -- Disk driver opts:  cache='none'
==> mgmt.test:  -- Kernel:            
==> mgmt.test:  -- Initrd:            
==> mgmt.test:  -- Graphics Type:     vnc
==> mgmt.test:  -- Graphics Port:     -1
==> mgmt.test:  -- Graphics IP:       127.0.0.1
==> mgmt.test:  -- Graphics Password: Not defined
==> mgmt.test:  -- Video Type:        cirrus
==> mgmt.test:  -- Video VRAM:        9216
==> mgmt.test:  -- Video 3D accel:    false
==> mgmt.test:  -- Sound Type:  
==> mgmt.test:  -- Keymap:            sv
==> mgmt.test:  -- TPM Backend:       passthrough
==> mgmt.test:  -- TPM Path:          
==> mgmt.test:  -- INPUT:             type=mouse, bus=ps2
==> mgmt.test: Creating shared folders metadata...
==> mgmt.test: Starting domain.
==> mgmt.test: Waiting for domain to get an IP address...
==> mgmt.test: Waiting for machine to boot. This may take a few minutes...
    mgmt.test: SSH address: 192.168.121.11:22
    mgmt.test: SSH username: root
    mgmt.test: SSH auth method: private key
    mgmt.test: Warning: Connection refused. Retrying...
    mgmt.test: 
    mgmt.test: Vagrant insecure key detected. Vagrant will automatically replace
    mgmt.test: this with a newly generated keypair for better security.
    mgmt.test: 
    mgmt.test: Inserting generated public key within guest...
    mgmt.test: Removing insecure key from the guest if it's present...
    mgmt.test: Key inserted! Disconnecting and reconnecting using new SSH key...
==> mgmt.test: Machine booted and ready!
vagrant-dns: trying to stop process with pid 240292 sending TERM and waiting 20s ...
vagrant-dns: process with pid 240292 successfully stopped.
vagrant-dns: process with pid 240882 started.
==> mgmt.test: Restarted DNS Service
==> mgmt.test: Setting hostname...
==> mgmt.test: Configuring and enabling network interfaces...
==> mgmt.test: Running provisioner: shell...
    mgmt.test: Running: inline script
==> mgmt.test: Running provisioner: shell...
    mgmt.test: Running: inline script
==> mgmt.test: Running action triggers after up ...
==> mgmt.test: Running trigger: Finished Message...  ← ← ← My trigger runs here!
==> mgmt.test: == TRIGGER== : Machine is up!         ← ← ← My trigger runs here!

How to get that inside a plugin though is still beyond me. :/

fnordfish commented 1 year ago

@mattiasb I've updated this PR a bit. Would you mind giving it a try?

mattiasb commented 1 year ago

@mattiasb I've updated this PR a bit. Would you mind giving it a try?

Omg, yes! I have a couple of really busy days though, so if I seem to have forgotten that's why. If I don't find time tomorrow though then early next week!

fnordfish commented 1 year ago

No hurry, take your time. Thanks a lot!

mattiasb commented 1 year ago

I got it working! :) I had to do this to Gemfile:

@@ -13,11 +13,6 @@ group :plugins do
 end

 group :test, :development do
-  if ENV['TEST_VAGRANT_VERSION'] == 'HEAD'
-    gem 'vagrant', :git => 'https://github.com/hashicorp/vagrant', :branch => 'main'
-  else
-    gem 'vagrant', :git => 'https://github.com/hashicorp/vagrant', :tag => ENV['TEST_VAGRANT_VERSION']
-  end
   gem 'rubydns', '~> 2.0.0'
 end

(Fedora is on Ruby 3.2.1 which is too new for Vagrant apparently, they probably did some packaging magic).

... and then add this to my Vagrantfile:

  config.dns.ip = -> (vm, opts) do
    ip = nil
    vm.communicate.execute("hostname -I | cut -d ' ' -f 1") do |type, data|
      ip = data.strip if type == :stdout
    end
    ip
  end

But then after a couple restarts and stuff it indeed worked!

I get this when destroying the VM though:

vagrant-dns: trying to stop process with pid 18963 sending TERM and waiting 20s ...
vagrant-dns: trying to stop process with pid 18963 forcefully :( sending KILL and waiting 20s 

... which also means that that process takes a bunch of time. Do you think it might be related? (When I get time I can try and see if I get the same on master).

fnordfish commented 1 year ago

Great news!

That you had to remove the vagrant dependency is a bit weird, but it might indeed have something to do with the more modern ruby you used to build the gem. I assume you build the gem file via rake build and then used vagrant plugin install - is that right?

The manual config.dns.ip setup shouldn't be needed. If a DHCP network is detected, the default would essentially be:

config.dns.ip = proc { |vm|
  vm.guest.capability(:read_ip_address).tap { |ip|
    if ip
      vm.ui.info "[vagrant-dns] Identified DHCP IP as '#{ip}'."
    else
      vm.ui.warn "[vagrant-dns] Could not identify DHCP IP."
    end
  }
}

Again, that is if a DHCP network is detected. If it cannot be detected, for whatever reason, you'll to declare it like you did, or copy'n'paste the default. I probably need to make that more clear in the Readme. Also, if you got any suggestion how to make that easier from the perspective of an Vagrantfile-author, please let me know.

I'm not quite sure why the dns process hangs. Can you have a look into ~/.vagrant.d/tmp/dns/daemon/vagrant-dns.output and try to manually vagrant dns --stop|--start|--restart. Maybe there's another/old process still running and the startup does not fail as expected. pgrep -alf vagrant-dns should give you a list.

mattiasb commented 1 year ago

Great news!

Right! ☺

That you had to remove the vagrant dependency is a bit weird, but it might indeed have something to do with the more modern ruby you used to build the gem. I assume you build the gem file via rake build and then used vagrant plugin install - is that right?

Yeah I believe that's the case. It's one of the quirks of living on the bleeding edge.

And yeah I did this basically:

vagrant-dns $ echo 3.2.0 > .ruby_version
vagrant-dns $ rm pkg/vagrant-dns-2.3.0.gem ; bundle install && bundle exec rake build
vagrant-dns $ cd path/to/my-project
my-project $ vagrant plugin install --local path/to/vagrant-dns/pkg/pkg/vagrant-dns-2.3.0.gem

The manual config.dns.ip setup shouldn't be needed. If a DHCP network is detected, the default would essentially be:

config.dns.ip = proc { |vm|
  vm.guest.capability(:read_ip_address).tap { |ip|
    if ip
      vm.ui.info "[vagrant-dns] Identified DHCP IP as '#{ip}'."
    else
      vm.ui.warn "[vagrant-dns] Could not identify DHCP IP."
    end
  }
}

Again, that is if a DHCP network is detected. If it cannot be detected, for whatever reason, you'll to declare it like you did, or copy'n'paste the default. I probably need to make that more clear in the Readme.

I tried to understand what the code was doing when looking at this but I must admit I failed. I'd like to look into why DHCP isn't detected. Could you point me to where in the code that check is made?

Also, if you got any suggestion how to make that easier from the perspective of an Vagrantfile-author, please let me know.

Sure, I'll keep that in mind! If possible what one wants is zero configuration (except for setting the TLD and the IP/Port of the vagrant-dns process for example). I hope that's possible.

I'm not quite sure why the dns process hangs. Can you have a look into ~/.vagrant.d/tmp/dns /daemon/vagrant-dns.output and try to manually vagrant dns --stop|--start|--restart. Maybe > there's another/old process still running and the startup does not fail as expected. pgrep -alf vagrant-dns should give you a list.

Yeah, I'll try to debug this!

Thanks!

fnordfish commented 1 year ago

The diff got quite messy b/c I did quite some whitespace changes :/ Anyways, here's the DHCP detection: https://github.com/BerlinVagrant/vagrant-dns/blob/a2917b1a78fa3919c77c9149d281914fa75b4dfb/lib/vagrant-dns/configurator.rb#L173-L177

It looks for any private or public network with type dhcp. I've tested using config.vm.network :private_network, type: :dhcp.

There are some quirks regarding public_network - they heavily depend on the provider. Eg. you cannot configure a static IP on VMware desktop. So it's essentially DHCP, but not marked as such.

Maybe I should change the detection to look for static IP configs, and if none is found fall back to "dhcp-mode".

mattiasb commented 1 year ago

The diff got quite messy b/c I did quite some whitespace changes :/ Anyways, here's the DHCP detection:

https://github.com/BerlinVagrant/vagrant-dns/blob/a2917b1a78fa3919c77c9149d281914fa75b4dfb/lib/vagrant-dns/configurator.rb#L173-L177

It looks for any private or public network with type dhcp. I've tested using config.vm.network :private_network, type: :dhcp.

Ah gotcha, thanks!

There are some quirks regarding public_network - they heavily depend on the provider. Eg. you cannot configure a static IP on VMware desktop. So it's essentially DHCP, but not marked as such.

That might be what's happening with libvirt with the user session. As an unprivileged user you're not allowed to set up your own network etc and hence if you don't configure anything what you get is DHCP. Maybe the code would work if I set the network to DHCP manually?

I will test that now.

Maybe I should change the detection to look for static IP configs, and if none is found fall back to "dhcp-mode".

That sounds like a potential solution, and I saw that you did that now. I'll try out setting DHCP manually using the older code path and then if that works I'll see if the new code lets me avoid adding that config.

mattiasb commented 1 year ago

There are some quirks regarding public_network - they heavily depend on the provider. Eg. you cannot configure a static IP on VMware desktop. So it's essentially DHCP, but not marked as such.

That might be what's happening with libvirt with the user session. As an unprivileged user you're not allowed to set up your own network etc and hence if you don't configure anything what you get is DHCP. Maybe the code would work if I set the network to DHCP manually?

I will test that now.

I got it working with both:

config.vm.network "private_network", type: "dhcp"

(EDIT: NM this (↑↑) fails with no network with matching name 'vagrant-private-dhcp' and it's a libvirt thing).

... and ...

      server.vm.network :public_network,
                        :dev => "virbr0",
                        :mode => "bridge",
                        :type => "bridge"

... and now when I test it actually works just fine with the default config (that is without setting up a network at all since it then defaults to dhcp).

I feel slightly gaslighted by Vagrant and libvirt right now to be honest but it sounds like the code you wrote initially works just fine!

I double checked .vagrant/plugins/gems/3.1.3/gems/vagrant-dns-2.3.0/lib/vagrant-dns/configurator.rb to make sure I had the old code and yeah:

    private def has_dhcp_network?(opts)
      opts[:networks].any? { |(nw_type, nw_config)|
        nw_config[:type] == :dhcp && (nw_type == :private_network || nw_type == :public_network)
      }

Maybe I should change the detection to look for static IP configs, and if none is found fall back to "dhcp-mode".

That sounds like a potential solution, and I saw that you did that now. I'll try out setting DHCP manually using the older code path and then if that works I'll see if the new code lets me avoid adding that config.

I think you can revert those changes if you want to. With that said I'll test with them now as well so that you get an extra data point.

fnordfish commented 1 year ago

Thanks for testing! I guess the new "dhcp" detection is simpler ("if there's no static IP configured, assume DHCP mode"), and I'd rather keep that.

Using the updated code, you should be able to remove the config.dns.ip config in your Vagrantfile.

mattiasb commented 1 year ago

Thanks for testing! I guess the new "dhcp" detection is simpler ("if there's no static IP configured, assume DHCP mode"), and I'd rather keep that.

Makes sense!

EDIT: And the new code works just as fine for me as the older did!

Using the updated code, you should be able to remove the config.dns.ip config in your Vagrantfile.

Yeah I did that during all the tests and it resolved my mgmt.test host just fine! Great job!

mattiasb commented 1 year ago

🙌🎉 🍾