hashicorp / vagrant

Vagrant is a tool for building and distributing development environments.
https://www.vagrantup.com

Vagrant CLI on Windows is very slow #11853

Open keilma opened 4 years ago

keilma commented 4 years ago

Hello,

I could not find anything useful to solve this issue, so I will try here. I have a fresh installation of Vagrant (no plugins) on my host, but the CLI responds very slowly. For example, it takes 19 seconds to get the output of "vagrant version":

PS > Measure-Command { vagrant version }

Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 19
Milliseconds      : 460
Ticks             : 194606842
TotalDays         : 0,000225239400462963
TotalHours        : 0,00540574561111111
TotalMinutes      : 0,324344736666667
TotalSeconds      : 19,4606842
TotalMilliseconds : 19460,6842

Environment: Windows 10 2004 Build 19041.450 (32 GB RAM, Core i7), Vagrant 2.2.10, VirtualBox 6.1.12.

Thanks in advance.

Kind regards Marcel

simplytech commented 1 year ago

In case it's helpful, here are my timings:

MacBook Pro 2015:

$ time vagrant help > /dev/null
real 0m1.656s
user 0m1.278s
sys  0m0.214s

$ time ruby -e ''
real 0m0.150s
user 0m0.071s
sys  0m0.044s

Mac Mini 2018:

$ time vagrant help > /dev/null
real 0m1.378s
user 0m1.264s
sys  0m0.107s

$ time ruby -e ''
real 0m0.107s
user 0m0.064s
sys  0m0.032s

Dell Latitude 5410

C:\>cmd /v:on /c "echo !TIME! & vagrant help >NUL: & echo !TIME!"
18:44:08.58
18:44:15.76

C:\>cmd /v:on /c "echo !TIME! & c:\HashiCorp\Vagrant\embedded\mingw32\bin\ruby.exe -e 'puts :hello' & echo !TIME!"
18:44:30.68
hello
18:44:30.84
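
For reference, the two timestamp pairs above can be turned into elapsed seconds with a few lines of Ruby. This is only a sketch: the helper names are made up for illustration, and it assumes both stamps were taken on the same day in the `HH:MM:SS.cc` form shown.

```ruby
# Convert a cmd.exe !TIME! stamp ("HH:MM:SS.cc") to seconds since midnight.
def time_stamp_to_seconds(stamp)
  hours, minutes, seconds = stamp.split(":")
  hours.to_i * 3600 + minutes.to_i * 60 + seconds.to_f
end

# Elapsed seconds between two stamps taken on the same day.
def elapsed(start_stamp, end_stamp)
  (time_stamp_to_seconds(end_stamp) - time_stamp_to_seconds(start_stamp)).round(2)
end

puts elapsed("18:44:08.58", "18:44:15.76") # vagrant help
puts elapsed("18:44:30.68", "18:44:30.84") # bare ruby.exe
```

That works out to roughly 7.18 seconds for vagrant help against 0.16 seconds for the embedded ruby.exe on its own.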

So, I conclude that:

gavenkoa commented 1 year ago

@simplytech You can install ts utility from moreutils (I use Cygwin) and have more fun with:

VAGRANT_LOG=debug vagrant help |& ts -s %.S
VAGRANT_LOG=debug vagrant help |& ts -i %.S

ferchuni commented 1 year ago

It is unacceptable. Vagrant 2.3.2 and Win 11. I got 15 seconds for vagrant help and vagrant ssh.

It is very disappointing. It works great under Linux.

AlexC-Sophos commented 1 year ago

Any update on this?

tuxerrante commented 1 year ago

I've just experienced this and found a solution here: https://www.gitmemory.com/issue/hashicorp/vagrant/10521/492835941

Commenting out those functions and just returning false sped up my CLI (e.g. vagrant --help) by 4x, i.e. it went from 8-9 seconds down to 2.

I'm not 100% certain what these functions are required for, but as two of them are for Hyper-V which I'm not using on this machine I'm assuming it should be fairly safe to disable them

C:\HashiCorp\Vagrant\embedded\gems\2.2.14\gems\vagrant-2.2.14\lib\vagrant\util\platform.rb
  def windows_admin?
    return false
  end

  def windows_hyperv_admin?
    return false
  end

  def windows_hyperv_enabled?
    return false
  end

Did you have to restart something then? It is not working for me. I have Hyper-V enabled since I use it on other VirtualBox VMs, but I don't think it is doing anything here.

Windows 11 vagrant version 2.3.4

Vagrant.configure("2") do |config|
  config.vm.box = "generic/ubuntu2204"
  memory = 6144
  cpus = 4

  config.vm.provider :virtualbox do |v|
    v.memory = memory
    v.cpus = cpus
  end

  config.vm.provider :libvirt do |v|
    v.memory = memory
    v.cpus = cpus
  end
end

gavenkoa commented 1 year ago

Did you have to restart something then? It is not working for me.

I used the VirtualBox provider, not Hyper-V. So the code should be updated for windows_hyperv_enabled?.

The workaround replaces costly PowerShell invocations with constants like true/false. Unfortunately these are Vagrant internals, so results may vary and nothing is guaranteed.
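
The pattern itself can be shown in isolation. The sketch below does not touch Vagrant: `FakePlatform` is a made-up stand-in whose probe just sleeps to imitate the cost of spawning PowerShell, and the module is then reopened so the probe returns a constant.

```ruby
# Measure how long a block takes, in seconds.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Hypothetical stand-in for an expensive platform probe; NOT Vagrant's code.
module FakePlatform
  def self.windows_hyperv_enabled?
    sleep 0.2 # imitates the cost of launching an external process
    false
  end
end

slow_seconds = timed { FakePlatform.windows_hyperv_enabled? }

# The workaround pattern: reopen the module and return a constant instead.
module FakePlatform
  def self.windows_hyperv_enabled?
    false
  end
end

fast_seconds = timed { FakePlatform.windows_hyperv_enabled? }
puts format("probe: %.3fs, constant: %.3fs", slow_seconds, fast_seconds)
```

The constant is only correct if it matches the real state of the machine, which is why the result depends on the provider in use.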

mic345 commented 1 year ago

Hi there,

We're using Vagrant 2.3.4 on Windows 11 and it's still extremely slow, see the attached screenshot. Unfortunately gavenkoa's patch did not work for us.

Any suggestions?

Desktop-screenshot (82)

pcgeek86 commented 1 year ago

Just set up a new Windows 11 system, Ryzen 9 5900X, 64GB DDR4, Samsung 970 PRO NVMe SSD, RTX 2080. Installed Vagrant with scoop package manager.

Vagrant currently takes 4 seconds to run with zero parameters / arguments.

Luxvao commented 1 year ago

Will there ever be a fix?

anzz1 commented 1 year ago

Vagrant is by a wide margin the best option for running VMs on Windows, including Docker containers via boot2docker, which I find much better than the native Docker implementation.

However, it is a pretty major annoyance that the Vagrant Ruby script is painfully slow. It seems that, independent of the command, there is a 2 second "thinking" period before the command actually executes, which itself is fast. For example, vagrant --help takes 2031 ms to complete and vagrant global-status takes 2033 ms on Windows 7 x64.

It looks like it sleeps for 2030ms and then does the work, and the actual work takes just milliseconds. The other commenters having even larger delays can be chalked up to the fact that Windows peaked on 7 and has been a dumpster fire of hot garbage rolling downhill ever since. Unlike the later iterations, 7 doesn't f*ck with your access times, so the 2 second delay is definitely something on Vagrant's end.

Edit: to help narrow it down, the proposed solution of overriding these functions in C:\HashiCorp\Vagrant\embedded\gems\2.3.4\gems\vagrant-2.3.4\lib\vagrant\util\platform.rb:

def windows_admin?
  return true
end
def windows_hyperv_admin?
  return false
end
def windows_hyperv_enabled?
  return false
end

made zero difference, i.e. the change in execution time was ~100 microseconds, which is insignificant.

gavenkoa commented 1 year ago

It looks like it sleeps for 2030ms and then does the work, and the actual work takes just milliseconds

I can confirm it: Ruby VM startup time is about 3-4 s on my PC (due to Sophos antivirus intercepting file reads).

@anzz1 You could reveal finer details of what happens after the initial Ruby VM startup delay with VAGRANT_LOG=debug:

VAGRANT_LOG=debug vagrant help
VAGRANT_LOG=debug vagrant help |& ts -s %.S
VAGRANT_LOG=debug vagrant help |& ts -i %.S

It shows lots of activity
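
Where moreutils is not available, something close to `ts -i` can be approximated in Ruby. This is a rough sketch, not an exact replacement: `annotate_gaps` is a made-up helper, and the commented usage assumes vagrant is on the PATH.

```ruby
# Prefix each line read from `io` with the seconds elapsed since the previous
# line (similar in spirit to `ts -i`). The clock is injectable for testing.
def annotate_gaps(io, out: $stdout,
                  clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
  previous = clock.call
  io.each_line do |line|
    now = clock.call
    out.puts format("%7.3f %s", now - previous, line.chomp)
    previous = now
  end
end

# Usage (assumption: vagrant is on PATH); stderr is merged into stdout
# because the debug log is written to stderr:
#
#   IO.popen({ "VAGRANT_LOG" => "debug" }, ["vagrant", "help"],
#            err: [:child, :out]) do |io|
#     annotate_gaps(io)
#   end
```

Large numbers in the first column point at the step that stalled just before that line was printed.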

anzz1 commented 1 year ago

@anzz1 You could reveal finer details with VAGRANT_LOG=debug of what is happening after initial Ruby VM startup delay

Thanks for this, very useful. Indeed, if I replace C:\HashiCorp\Vagrant\embedded\gems\2.3.4\gems\vagrant-2.3.4\lib\vagrant.rb with just an exit 1, it still takes 300 ms to start up and do nothing at all.

I'm leaning towards Ruby simply being slow / unoptimized as a language, rather than this being caused by any bug.

However, there is definitely some room for optimization on Vagrant's side. Looking at the log, I am loading every single host platform/guest platform/provisioner/provider/protocol plugin under the sun when running any command. Shouldn't those be loaded as needed instead? I don't necessarily need to load a plugin for a Trisquel guest, the heroku protocol or the CFEngine provisioner every time I run a command to launch my Linux VM locally. And for host platforms, why is anything other than the Windows host loaded at all when I'm on Windows?
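
The "load as needed" idea can be sketched as a registry that stores a block per plugin and only runs it on first use. Everything here is hypothetical (`LazyRegistry` and the plugin names are illustrative); it says nothing about how Vagrant's own plugin loading is implemented.

```ruby
# Hypothetical registry: each plugin is registered as a block and is only
# built the first time something asks for it.
class LazyRegistry
  def initialize
    @loaders = {}
    @loaded = {}
  end

  # Remember how to build the plugin without building it yet.
  def register(name, &loader)
    @loaders[name] = loader
  end

  # Build the plugin on first access, then reuse the cached result.
  def [](name)
    @loaded.fetch(name) { @loaded[name] = @loaders.fetch(name).call }
  end

  # Names of the plugins that have actually been built so far.
  def loaded_names
    @loaded.keys
  end
end

guests = LazyRegistry.new
guests.register(:linux)    { :linux_guest_support }
guests.register(:trisquel) { :trisquel_guest_support }

guests[:linux]
p guests.loaded_names
```

Only the :linux entry is built in this run; the :trisquel block is never called, so its cost is never paid.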

luk-sam commented 1 year ago

Vagrant on Windows 10 is a nightmare.

measure-Command { vagrant.exe -h }
Days              : 0
Hours             : 0
Minutes           : 0
Seconds           : 15
Milliseconds      : 298
Ticks             : 152989054
TotalDays         : 0,000177070664351852
TotalHours        : 0,00424969594444444
TotalMinutes      : 0,254981756666667
TotalSeconds      : 15,2989054
TotalMilliseconds : 15298,9054

@hashicorp-cloud Is anyone working on this issue? It seems that MS doesn't get support from HashiCorp; otherwise, how do you explain a problem that has been going on for 3 years?

anzz1 commented 1 year ago

vagrant on windows 10 is a nightmare.

FTFY. When something that takes 2 seconds to run in Windows 7 (which is still too much), takes 15 seconds on Windows 10, there is not much the developers of Ruby, of Vagrant, of Mars Curiosity Rover, or anyone can do. And Windows 11 is even worse.

You could try asking Microsoft to scrap the fail that is current-day Windows, and keep supporting W7 indefinitely to have a fast and stable OS. Good luck with that, though.

mic345 commented 1 year ago

Nonsense, it takes 157 millis to create two random files, 1MB each, and then run a diff between the two on Windows 11…

There shouldn't be any reason whatsoever for basic vagrant commands to take full seconds.

$ time ( ( tr -dc "A-Za-z0-9" </dev/urandom | head -c $(( 1024*1024 )) > /tmp/a ) && ( tr -dc "A-Za-z0-9" </dev/urandom | head -c $(( 1024*1024 )) > /tmp/b ) && diff /tmp/a /tmp/b > /dev/null )

real    0m0.157s
user    0m0.015s
sys     0m0.030s

$ uname -a
CYGWIN_NT-10.0 LAPTOP-44CNT3RE 3.3.4(0.341/5/3) 2022-01-31 19:35 x86_64 Cygwin

anzz1 commented 1 year ago

Nonsense, it takes 157 millis to create two random files, 1MB each, and then run a diff between the two on Windows 11…

There shouldn't be any reason whatsoever for basic vagrant commands to take full seconds.

$ time ( ( tr -dc "A-Za-z0-9" </dev/urandom | head -c $(( 1024*1024 )) > /tmp/a ) && ( tr -dc "A-Za-z0-9" </dev/urandom | head -c $(( 1024*1024 )) > /tmp/b ) && diff /tmp/a /tmp/b > /dev/null )

real    0m0.157s
user    0m0.015s
sys     0m0.030s

$ uname -a
CYGWIN_NT-10.0 LAPTOP-44CNT3RE 3.3.4(0.341/5/3) 2022-01-31 19:35 x86_64 Cygwin

There is so much wrong in your methodology of testing I/O perf under Windows that I do not even know where to start. To even begin to understand why everything about that is wrong, I suggest starting from the basics, firing up a debugger and running your command under it and see how much it takes to actually reach a kernel syscall which actually does file I/O (hint: NtReadFile/ZwReadFile & NtWriteFile/ZwWriteFile)

mic345 commented 1 year ago

The entire point is that even with I/O the operation takes 1/50 of the time it takes to run a simple Vagrant command on the same system...

You can run the same without I/O, see if it takes longer 😉

anzz1 commented 1 year ago

The entire point is that even with I/O the operation takes 1/50 of the time it takes to run a simple Vagrant command on the same system...

You can run the same without I/O, see if it takes longer 😉

And my point is that when you run the same command on Windows 7, it takes 2 seconds instead of 15. I don't understand what is the argument you are trying to make. It seems that you do not either. Yes, sure, there are some words, but they make very little sense. What part is exactly "nonsense" according to you, the expert?

npc203 commented 1 year ago

For me, any CLI command takes around a minute. Running with --debug shows that it's stuck at:

DEBUG meta: validating LANG value for virtualbox cli commands
 INFO subprocess: Starting process: ["C:\\HashiCorp\\Vagrant\\embedded\\usr\\bin/locale.EXE", "-a"]
DEBUG subprocess: Selecting on IO

Sure enough, if I run locale.exe -a separately, it also takes a long while. Any ideas how to fix it?

wiretail commented 1 year ago

I recollect (but can’t test right now) that you can just rename/remove locale.exe; Vagrant will skip it and still work.

npc203 commented 1 year ago

@rgreer4 Yup, that worked like a charm: deleting locale.exe reduced the time taken from a minute or so to 4-7 seconds!

I'd say, still sluggish for a cli, but very much usable for me, thanks!

CrackerJackMack commented 8 months ago

deleting locale.exe sped up my commands. no -- I didn't measure, but I can run commands without getting bored.

JasonD94 commented 4 months ago

I've noticed this same issue on Windows 11 Pro (23H2) but interestingly when I switched to dual booting with Fedora 40 (KDE Plasma version on Linux 6.8.9-300.fc40.x86_64) I noticed significantly faster vagrant CLI speeds. Here's an example with vagrant status:

# Windows 11 Pro
$ time vagrant status
real    0m4.980s
user    0m0.000s
sys     0m0.000s

# Fedora 40 KDE Plasma
jdowning@fedora:tcbase$ time vagrant status
real    0m1.701s
user    0m0.838s
sys     0m0.574s

Under Linux things are about 3x faster for "basic" CLI commands (status, box list, version, help, etc - things that probably only do a handful of checks).

This gets more interesting when I look at bringing up a Rocky Linux VM guest with Vagrant. I have a number of ports forwarded, 3 shared folders mounted, etc so I expect this to take some time but the difference between Windows 11 and Fedora 40 is crazy:

# Windows 11 Pro
$ time vagrant up
real    12m0.201s   # yes, this is 12 MINUTES to boot up a VM...
user    0m0.000s
sys     0m0.015s

# Fedora 40
jdowning@fedora:tcbase$ time vagrant up
real    0m37.070s
user    0m3.896s
sys     0m3.963s

We're talking roughly 19x slower on Windows vs Linux (12m0.201s against 0m37.070s). To add additional notes, I'm using VirtualBox as the guest provider - Version 7.0.14 r161095 (Qt5.15.2) on both Windows 11 and Fedora 40. I am also using Git Bash on Windows 11 and bash on Fedora 40. It seems to me, looking at DEBUG output from the Windows 11 logs I collected, that on Windows PowerShell may be used at times, even when running under Git Bash.
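
The ratios can be worked directly from the `real` values in the timings above. A minimal Ruby sketch, assuming the `NmS.SSSs` format that bash `time` prints (helper names are illustrative):

```ruby
# Parse a bash `time` value such as "12m0.201s" into seconds.
def real_seconds(value)
  match = value.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "unexpected time format: #{value}" unless match

  match[1].to_i * 60 + match[2].to_f
end

# How many times slower the Windows run was, rounded to one decimal place.
def slowdown(windows, linux)
  (real_seconds(windows) / real_seconds(linux)).round(1)
end

puts slowdown("0m4.980s", "0m1.701s")    # vagrant status
puts slowdown("12m0.201s", "0m37.070s")  # vagrant up
```

With the figures quoted in this comment, that gives about 2.9 for vagrant status and about 19.4 for vagrant up.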

I also noticed that mounting three shared folders took about 6 mins in Windows 11 vs about 9 seconds on Fedora. Another slow part seems to be the machine booting process. On Windows, this takes about 5 mins between boot/checking for guest additions. On Linux this is the longest step, but it still only takes 24 seconds.

I've tried a few of the things in this thread to see if I can speed up Windows 11. I used the platform.rb changes from this comment: https://github.com/hashicorp/vagrant/issues/11853#issuecomment-1236185448

I also tried deleting locale.EXE as suggested recently. Nothing seems to have improved things. It seems like the theory that Powershell or Ruby is slowing things down on Windows might be something worth looking into. I've collected some debug logs with --debug-timestamps that I'm analyzing now, but I wanted to share that this is still a problem even in 2024.

JasonD94 commented 4 months ago

I might have found a potential solution to my problem. It appears that on Windows 11 Pro some Hyper-V settings were enabled by default. I found that this was causing Virtualbox to enter "Snail execution mode":

00:00:00.852350 NEM: NEMR3Init: Snail execution mode is active!
00:00:00.852350 NEM: Note! VirtualBox is not able to run at its full potential in this execution mode.
00:00:00.852350 NEM:       To see VirtualBox run at max speed you need to disable all Windows features
00:00:00.852350 NEM:       making use of Hyper-V.  That is a moving target, so google how and carefully
00:00:00.852350 NEM:       consider the consequences of disabling these features.

(these are logs from Virtualbox which you can find by going to the Virtualbox GUI and clicking on Machine -> Show Log or hitting CTRL + L)
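
To check a saved log for the same marker without opening the GUI, a short Ruby sketch like the following would do. `snail_mode_lines` is a made-up helper, the sample text is copied from the log lines above, and where the log file lives on disk is left to the reader.

```ruby
# Return the log lines that report VirtualBox's slow, Hyper-V-backed
# execution mode ("Snail execution mode").
def snail_mode_lines(log_text)
  log_text.each_line
          .select { |line| line.include?("Snail execution mode is active") }
          .map(&:strip)
end

sample = <<~LOG
  00:00:00.852350 NEM: NEMR3Init: Snail execution mode is active!
  00:00:00.852350 NEM: Note! VirtualBox is not able to run at its full potential in this execution mode.
LOG

hits = snail_mode_lines(sample)
puts hits.empty? ? "no snail mode marker found" : hits

# For a real log (path is an assumption; use wherever your VM keeps it):
#   snail_mode_lines(File.read(path_to_vbox_log))
```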

It seems like this was the problem. After following steps listed in this Virtualbox forum post: https://forums.virtualbox.org/viewtopic.php?f=25&t=99390

I find that vagrant up now takes about 2 mins under Windows 11:

$ time vagrant up
Bringing machine 'dev' up with 'virtualbox' provider...
==> dev: Clearing any previously set forwarded ports...
==> dev: Fixed port collision for 8080 => 8080. Now on port 2200.
==> dev: Fixed port collision for 9000 => 9000. Now on port 2201.
==> dev: Clearing any previously set network interfaces...
==> dev: Preparing network interfaces based on configuration...
    dev: Adapter 1: nat
    dev: Adapter 2: intnet
==> dev: Forwarding ports...
    dev: 22 (guest) => 2222 (host) (adapter 1)
    dev: 25 (guest) => 2525 (host) (adapter 1)
    dev: 8080 (guest) => 2200 (host) (adapter 1)
    dev: 6006 (guest) => 6006 (host) (adapter 1)
    dev: 80 (guest) => 8081 (host) (adapter 1)
    dev: 443 (guest) => 8443 (host) (adapter 1)
    dev: 18082 (guest) => 18082 (host) (adapter 1)
    dev: 8983 (guest) => 8983 (host) (adapter 1)
    dev: 9876 (guest) => 9876 (host) (adapter 1)
    dev: 9000 (guest) => 2201 (host) (adapter 1)
    dev: 15672 (guest) => 15672 (host) (adapter 1)
==> dev: Booting VM...
==> dev: Waiting for machine to boot. This may take a few minutes...
    dev: SSH address: 127.0.0.1:2222
    dev: SSH username: vagrant
    dev: SSH auth method: private key
==> dev: Machine booted and ready!
[dev] GuestAdditions 7.0.14 running --- OK.
==> dev: Checking for guest additions in VM...
==> dev: Setting hostname...
==> dev: Configuring and enabling network interfaces...
==> dev: Mounting shared folders...
    dev: # censored this part, but I have 3 shared folders that I bring up for software dev purposes
==> dev: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> dev: flag to force provisioning. Provisioners marked to run always will still run.

real    2m2.227s
user    0m0.000s
sys     0m0.000s

This is still several times slower than Linux, but it's at least several times faster than before.

I highly recommend that folks on Windows using Vagrant with VirtualBox take a look at their Hyper-V settings and disable them if possible. It seems that Snail execution mode causes a lot of slowdown in VirtualBox VMs: Vagrant still handles them, but they perform slower than necessary while the Hyper-V programs / features in Windows 11 are enabled.

colemar commented 3 months ago

Vagrant 2.4.1 fresh installation on Windows 10 x64. Both vagrant help and vagrant version spend about 20 seconds on the last line of this debug log:

DEBUG bundler: resolving solution from available specification set
DEBUG bundler: solution set for configured plugins has been resolved
DEBUG bundler: activating solution set
DEBUG bundler: Activating solution set: []
DEBUG solution_file: plugin file does not exist, not storing solution
DEBUG bundler: solution set stored to - <Vagrant::Bundler::SolutionFile:C:/Users/colem/.vagrant.d/plugins.json:C:/Users/colem/.vagrant.d/bundler/global.sol:invalid>
 INFO manager: Loading plugins...
 INFO loader: Loading configuration in order: [:home, :root]
DEBUG loader: Configuration loaded successfully, finalizing and returning
DEBUG push: finalizing
 INFO subprocess: Starting process: ["c:\\windows\\system32\\windowspowershell\\v1.0\\/powershell.EXE", "-NoLogo", "-NoProfile", "-NonInteractive", "-ExecutionPolicy", "Bypass", "-Command", "Write-Output $PSVersionTable.PSVersion.Major"]
 INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO

In this case PowerShell is the culprit, and I already knew PowerShell (any version) is very slow to start. Precompiling the assemblies with ngen.exe did not help.
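
To measure that startup cost outside Vagrant, the same invocation can be timed on its own. A minimal Ruby sketch: `time_command` is a made-up helper, the arguments are copied from the debug log above, and it assumes powershell.exe can be found on the PATH.

```ruby
require "open3"

# Run a command, discard its output, and return [elapsed_seconds, success].
def time_command(*command)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  _output, status = Open3.capture2e(*command)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [elapsed, status.success?]
end

# Only attempt the PowerShell call on Windows hosts.
if Gem.win_platform?
  seconds, ok = time_command(
    "powershell.exe", "-NoLogo", "-NoProfile", "-NonInteractive",
    "-ExecutionPolicy", "Bypass",
    "-Command", "Write-Output $PSVersionTable.PSVersion.Major"
  )
  puts format("powershell startup: %.2fs (success: %s)", seconds, ok)
end
```

If this alone takes many seconds, the delay is in PowerShell startup on that machine rather than in anything Vagrant does afterwards.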

behniafb commented 2 months ago

Solution: Uninstall Vagrant on Windows & use Vagrant on WSL (namely Linux).