Closed jonleighton closed 12 years ago
This has happened to me for a while now and I'm not sure what's up.
I have noticed the same. It feels very "non-deterministic" in that there is a random chance the VM will just keep thrashing at 100% CPU. I found that when I didn't specify a `config.vm.network` in the Vagrantfile, there was a much lower chance of the VM entering this state. This makes me think it has something to do with the networking/DHCP configuration. For what it's worth, I also have `config.ssh.max_tries = 100`.
Were either of you using `config.vm.network` to specify a specific IP? If so, try commenting it out and see if that works.
Same problem here; seems to have started when I updated my lucid32.box to fix #445.
I'm not setting `config.vm.network`, but it does seem network configuration related. To work around it I'm using `config.vm.boot_mode = :gui` and, when it gets stuck, manually logging in to the machine and running `sudo /etc/init.d/networking restart`.
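For reference, the two settings mentioned so far could be combined in a Vagrantfile roughly as follows. This is only a sketch in the Vagrant 0.8-era configuration syntax: the `Vagrant::Config.run` wrapper and the `config.vm.box` line are assumptions (the box name is borrowed from the lucid32 box mentioned above), and it is a debugging aid rather than a fix.

```ruby
# Vagrantfile (sketch, Vagrant 0.8-era syntax; a workaround aid, not a fix)
Vagrant::Config.run do |config|
  # Box name is an assumption; substitute whichever base box you use.
  config.vm.box = "lucid32"

  # Show the VirtualBox window so you can log in when SSH never comes up.
  config.vm.boot_mode = :gui

  # Retry the SSH connection more times before giving up.
  config.ssh.max_tries = 100
end
```

With the GUI visible, a stuck boot can be recovered by logging in on the console and restarting networking as described above.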
This happens to me too:
[mathew@thepixeldeveloper]$ vagrant up
[default] VM already created. Booting if its not already running...
[default] Clearing any previously set forwarded ports...
[default] Forwarding ports...
[default] -- ssh: 22 => 2222 (adapter 1)
[default] Cleaning previously set shared folders...
[default] Creating shared folders metadata...
[default] Running any VM customizations...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] Failed to connect to VM!
Failed to connect to VM via SSH. Please verify the VM successfully booted
by looking at the VirtualBox GUI.
I can see the machine has booted with the GUI.
I also tried using the `vagrant ssh` command when the `vagrant up` command failed. SSH fails with the following error message:
[mathew@thepixeldeveloper]$ vagrant ssh
ssh_exchange_identification: Connection closed by remote host
Rebooting with `sudo reboot` from the GUI fixes this for me.
Gems
Virtualbox
Virtualbox 4.1.2r73507
Bump. I just posted a question in the group about the same issue. My `vagrant up` works only once after a reboot/reinstall of VirtualBox.
Quick question, are you guys running boxes built with VeeWee?
I am not
No, clean vagrant. But I think I found a solution: check your networking settings for VirtualBox (on Mac, Command+, then Networking, then host-only networks). I deleted a host-only network which happened to be there and now I can restart my VMs without restarting the Mac. Folks, if you can verify it, that would be excellent.
I only have one networked device listed (NAT).
@AlexMikhalev You have two networked devices because at some point you enabled the
# Assign this VM to a host only network IP, allowing you to access it
# via the IP.
config.vm.network "33.33.33.10"
option which meant Vagrant created the Host-Only interface.
However, still having just the NAT interface was no improvement for me.
@ThePixelDeveloper I didn't mean second adapter on VM, but common settings for host only networks in Virtual Box preferences - basically I removed vboxnet0 completely from my host. But it didn't help in truth.
I suspect this is a VirtualBox bug. The networking interface is failing to get an IP address from the DHCP server for whatever reason. Which releases of Virtualbox are we running? I can rule out Virtualbox 4.1.2r73507 already, I'll go backwards until it's "fixed"
I think it may be related to the issue described here: http://blog.techprognosis.com/2011/02/28/how-to-enable-dhcp-in-virtualbox-4.html I have a feeling that the DHCP server for NAT addresses is broken, but I wasn't able to influence it with commands like:
VBoxManage dhcpserver add --netname vboxnet0 --ip 10.0.3.100 --netmask 255.255.255.0 --lowerip 10.0.3.101 --upperip 10.0.3.254 --enable
I know that command is meant for the internal network, but I feel that the DHCP server for NAT doesn't work or issues incorrect IP addresses.
I don't think it's broken, because if it were you wouldn't be able to get an IP address by running `sudo dhclient`. Let's see ...
I have another Ubuntu server I just booted and don't have such problems (it doesn't have the VBadditions). I installed the VBadditions and still no problems there. Very very strange.
I do not use VeeWee.
In my VBox logs, the difference between a successful boot and a failed one is in these lines:
00:00:26.584 NAT: IPv6 not supported
00:00:26.622 NAT: DHCP offered IP address 10.0.2.15
00:00:26.623 NAT: DHCP offered IP address 10.0.2.15
while a start that hangs ends at:
00:00:24.642 NAT: IPv6 not supported
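The log difference above suggests a quick check: a boot that got its address contains "NAT: DHCP offered IP address" lines, and a hung boot does not. A minimal sketch of that check, assuming the log text is already in a string (the method name `dhcp_offers` is made up for illustration, and the sample text is the excerpt quoted above):

```ruby
# Returns the IP addresses that VirtualBox's NAT engine reported offering
# over DHCP, based on "NAT: DHCP offered IP address <ip>" lines in the log.
def dhcp_offers(log_text)
  log_text.scan(/NAT: DHCP offered IP address (\d{1,3}(?:\.\d{1,3}){3})/).flatten
end

# Log excerpts copied from the comment above.
GOOD_BOOT = <<~LOG
  00:00:26.584 NAT: IPv6 not supported
  00:00:26.622 NAT: DHCP offered IP address 10.0.2.15
  00:00:26.623 NAT: DHCP offered IP address 10.0.2.15
LOG

HUNG_BOOT = <<~LOG
  00:00:24.642 NAT: IPv6 not supported
LOG

puts "good boot offers: #{dhcp_offers(GOOD_BOOT).inspect}"
puts "hung boot offers: #{dhcp_offers(HUNG_BOOT).inspect}"
```

An empty result means the guest never received a DHCP offer during that boot. To run it against a real log, pass `File.read(path)` with the path to your VM's log file, which depends on your setup.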
I use the lucid32 and lucid64 base boxes and both have the same issue. This issue is not related to Vagrant specifically in my case, as I have the same problem trying to start Vagrant-generated boxes from the VirtualBox GUI: sometimes I get an IP (10.0.2.15), sometimes I don't, so I need to run `sudo dhclient` and then get the same IP, 10.0.2.15, from the DHCP server.
If I start two VMs, one with lucid32 and the other with lucid64, they both have the same IP, 10.0.2.15, after I run `sudo dhclient`.
Update: I downloaded the box from http://opscode-vagrant-boxes.s3.amazonaws.com/ubuntu10.04-gems.box and see the same behaviour. I can start it the first time with `vagrant up` successfully and shut it down, but the next attempt to start it with `vagrant up` hangs forever.
This issue is not related to vagrant specifically in my case as I have a same problem trying to start vagrant generated boxes from virtual box GUI
I mean you should try and install and run the operating system without using a vagrant base box.
I have another that works fine, you should try it too, then we can confirm if it's to do with Vagrant or not.
Look at this for the explanation of the 10.0.2.15 IP Address
Edit. I'm out of ideas on this one. I've built a system using VeeWee which works as expected, but seemingly fails once it's been compiled into a box and imported into Vagrant. I have no idea what Vagrant does to the image when it's packaged; maybe that's something to look into.
I fixed this for me, or at least I think I did. Start the troubled machine in gui mode, login and execute the following commands as root:
rm /var/lib/dhcp3/*
- Removes any existing DHCP leases
- Disable automatic udev rules for network interfaces in Ubuntu
rm /etc/udev/rules.d/70-persistent-net.rules
mkdir /etc/udev/rules.d/70-persistent-net.rules
The machine now starts up and has the correct IP address.
Perhaps this has something to do with the different network adapter MAC addresses. The base box would have been built on a VirtualBox instance where the MAC is different from the one that you're using now. Just a thought.
ThePixelDeveloper - tried your solution, doesn't work for me on lucid32.
setting gui on and then manually logging in and restarting networking fixed it for me..
Had the same issue. I could work around by booting in gui mode, logging in and manually doing
Any progress on this? It's definitely Vagrant causing trouble here, from my experiments every other machine I've built with VirtualBox (with the same configuration) doesn't show this problem.
To be more clear, something happens when Vagrant builds the box and not when Vagrant launches the box. So booting the box without the help of Vagrant still displays the problem. If someone can point me towards the code where Vagrant does its building I can take a look.
What version of VirtualBox are you all using?
I haven't experienced the problem recently, and I think VirtualBox may have been upgraded on my system at some point after I filed this bug (I'm on Fedora so I have package management...)
My current VirtualBox version is 4.1.2 r73507. Anyone on the same or later and still experiencing this?
It's happening to me on VirtualBox 4.1.2 r73507.
I had the issue with the lucid32 box (http://www.vagrantbox.es/1/). Using the ubuntu 11.04 box (http://www.vagrantbox.es/26/) doesn't show the issue.
Same issue here. (Ubuntu 11.04, VirtualBox 4.1.2, vagrant 0.8.6).
I wanted to try ubuntu 11.04 box (http://www.vagrantbox.es/26/) but after downloading I got:
[vagrant] Extracting box...
[vagrant] Verifying box...
[vagrant] Cleaning up downloaded box...
The box file you're attempting to add is invalid. This can be
commonly attributed to typos in the path given to the box add
command. Another common case of this is invalid packaging of the
box itself.
I can reproduce the same issue with both Mac OS X Snow Leopard and Ubuntu 10.04 LTS as VirtualBox hosts. It happens with various boxes, whether I build the box using VeeWee or download a ready-made one.
Same issue here. After starting in gui mode, logging in and running `sudo /etc/init.d/networking restart`, it'll work from the command line again.
This issue is very annoying as it happens on every new box after installing the first one.
I can confirm this is happening on my OS X Lion box as well, problem is with both Lucid64 and Natty64 boxes. I have tried both VirtualBox from 4.1.0 to 4.1.2 and the problem occurs on virtually every vagrant up command. vagrant is now unusable due to this issue :(
Can we confirm it only happens with Vagrant and NOT with a VirtualBox machine with the same specifications (disk, network, etc.)?
There is a temporary solution until the VirtualBox DHCP/dhclient problem is fixed:
1) run the virtual machine with :gui
2) sudo vi /etc/rc.local
   #!/bin/sh -e
   sh /etc/init.d/networking restart
   exit 0
3) sudo halt
Will try this mikhailov, thanks.
Probably this is better:
sudo vi /etc/network/interfaces
pre-up sleep 10
@mikhailov That doesn't work.
This line is actually included in the VeeWee build scripts: https://github.com/jedi4ever/veewee/blob/master/templates/ubuntu-10.04.3-server-amd64/postinstall.sh#L88
I've used a bigger value and it didn't seem to make a difference.
Wanting to build my own base box while I was having issues with 'vagrant up' I updated to the latest VirtualBox, installed VeeWee and built a new Ubuntu 11.04 box. Since then I haven't had this problem (even with the old boxes).
I did do a gem update after the install of VeeWee - and I did notice that net-ssh was updated as part of this, I'm not sure if it could be related?
@ThePixelDeveloper yes, it seems that doesn't work. So I have to log in with :gui the first time and update /etc/rc.local every time I run a new instance, until this is fixed.
Any progress? Same issue.
Arch Linux 32
Guest Additions Version: 4.1.0
VirtualBox Version: 4.1.2_OSE
Vagrant version 0.8.7
Ruby 1.9.2
lucid32 box
I think I've found the problem but I have no idea how to build a new box, so can't test it.
The DHCP client will wait 60 seconds for replies to its request. If there was no response and there are no old leases to fall back to it will then wait five minutes before retrying.
Hopefully adding shorter timeouts like `timeout 2` and `retry 2` into /etc/dhcp3/dhclient.conf will fix the problems.
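To make the timing argument concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model taken from the description above: one request round that times out with no reply and no old lease, followed by the retry wait. The method name is made up for illustration, and real dhclient behaviour has more moving parts.

```ruby
# Seconds until dhclient starts its second round of DHCP requests, assuming
# the first round gets no reply and there is no old lease to fall back on.
def seconds_until_second_attempt(timeout:, retry_interval:)
  timeout + retry_interval
end

# Values described above: 60 s timeout, then a five-minute wait.
default_wait = seconds_until_second_attempt(timeout: 60, retry_interval: 300)

# Proposed values: "timeout 2" and "retry 2".
tuned_wait = seconds_until_second_attempt(timeout: 2, retry_interval: 2)

puts "defaults: #{default_wait}s, tuned: #{tuned_wait}s"
```

Under this model, a VM whose first request goes unanswered would sit without an address for about six minutes with the default values, which would outlast a typical `vagrant up` wait.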
@leth nope, still the same issue for me with those options in dhclient.conf. Use `vagrant package` to create a new box.
The `sudo dhclient` approach works well for me as a temporary fix. I'd love to get this fixed permanently though, because this will be the setup process for potentially hundreds of developers at my company.
More information exists at Stack Overflow.
Just checked on Fedora 15, VirtualBox 4.1.4 (latest), vagrant 0.8.7 - the issue still exists.
I guess it's time to pull out git bisect and start the arduous journey.
On 4 October 2011 18:48, Marcin Kulik <reply@reply.github.com> wrote:
Just checked on Fedora 15, VirtualBox 4.1.4 (latest), vagrant 0.8.7 - the issue still exists.
+1 for git bisect
I think I've pinned it down to /etc/udev/rules.d/70-persistent-net.rules. We obviously need to keep it empty, but making it into a directory seems to break things.
I tried making it a non-writable file, but that still broke things.
To fix:
sudo rmdir /etc/udev/rules.d/70-persistent-net.rules
sudo touch /etc/udev/rules.d/70-persistent-net.rules
EDIT:
After 3 successful tries, I tried it a fourth, and it's still broken. >_<
It looks like it might be a virtualbox issue: https://www.virtualbox.org/ticket/4038
I was having the same problem:
karel@rolmops:~/vagrant/c57$ vagrant up
[default] Importing base box 'centos-57'...
[default] Preparing host only network...
[default] Matching MAC address for NAT networking...
[default] Clearing any previously set forwarded ports...
[default] Forwarding ports...
[default] -- ssh: 22 => 2222 (adapter 1)
[default] Creating shared folders metadata...
[default] Running any VM customizations...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] Failed to connect to VM!
Failed to connect to VM via SSH. Please verify the VM successfully booted
by looking at the VirtualBox GUI.
My host is ubuntu 11.04 + virtualbox 4.1.4 (vagrant gem 0.8.7). The guest is centos 5.7 + virtualbox 4.1.4. The Vagrantfile has
config.vm.network="33.33.33.10"
If I add `config.ssh.max_tries = 150`, everything works, but a lot of time gets lost waiting for a DHCP lease which can't be obtained on that interface (the request has to time out). I could add some configuration to the box which avoids sending DHCP requests, e.g. adding 'dummy' eth1-9 config files that disable those interfaces on the first boot.
It's the same for me. This bug might take a long time to resolve. Rebuilding every time is eating up my time, and my connection is slow for downloading box after box. It's tiring to rebuild every time.
But as a temporary fix, launching in `gui` mode, running `sudo dhclient`, and then `vg ssh` will work.
I got the same problem, but apparently only with boxes built using Veewee.
I'm using Ubuntu as host, VirtualBox 4.1.4; for Vagrant and Veewee I've tried a lot of combinations.
I don't know the internals of VirtualBox or Vagrant, so I don't know if it's important, but I see some errors in `~/.VirtualBox/VBoxSVC.log`:
ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={c28be65f-1a8f-43b4-81f1-eb60cb516e66} aComponent={VirtualBox} aText={Could not find a registered machine named 'oct15'}, preserve=false
ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={c28be65f-1a8f-43b4-81f1-eb60cb516e66} aComponent={VirtualBox} aText={Could not find an open hard disk with location '/home/enrico/VirtualBox VMs/oct15/box-disk1.vmdk'}, preserve=false
ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={c28be65f-1a8f-43b4-81f1-eb60cb516e66} aComponent={VirtualBox} aText={Could not find a registered machine named 'oct15_1318713570'}, preserve=false
in ~/.VirtualBox/VirtualBox.xml:
<MachineEntry uuid="{3cfe8af8-96da-41d3-ac3e-7266d3bb8f49}" src="/home/enrico/VirtualBox VMs/oct15_1318713570/oct15_1318713570.vbox"/>
doing a 'ps aux | grep virtual' I see the actual command:
/usr/lib/virtualbox/VBoxHeadless --comment oct15_1318713570 --startvm 3cfe8af8-96da-41d3-ac3e-7266d3bb8f49 --vrde config
Is VirtualBox looking for a box registered with a different UUID, or is the aIID in the log unrelated to the UUID in the configuration file and on the command line?
This one is killing me... having the same problem here but none of the workarounds (dhclient, reboot, restart networking) help me. Is there a combo of older vbox/vagrant/base box that can get me back to work? Peace, Mike
Ah OK, things seem fixed here after much flailing about. Fix appears to be to use vagrant HEAD. Maybe I had a different problem with the same symptoms? Hope this is helpful and not just a bunch of noise. -Mike
I can't see any commits since the last release which would fix it. Probably just random luck I suspect.
Hi there,
Sometimes (I mean, fairly often, maybe 30-50% of the time for me) vagrant seems to hang on:
I mean, possibly it would finish eventually, but I have never waited a potentially infinite length of time to see. It certainly seems to take longer than 'usual', compared with when I do manage to boot successfully.
When this happens, the only thing I can do is poweroff the VM through VBoxManage and try again.
Is there any way I can get more output about what it's doing in order to help debug this?
Cheers