Provided hacky bypass for WinRM usage not working

Dr4s1l commented 1 year ago

Debug output

With the latest version of Vagrant (2.3.4), there is a common error when using WinRM:

An error occurred executing a remote WinRM command.

Shell: Cmd
Command: hostname
Message: Digest initialization failed: initialization error

Already referenced by https://github.com/hashicorp/vagrant/issues/12807, which was closed without a proper fix on the official HashiCorp Vagrant side.

The downgrade to 2.2.19 no longer works with modern versions of VirtualBox (7.0). We need a proper fix.

Expected behavior

Provisioning through WinRM should work on modern VirtualBox and Windows VMs.

Actual behavior

It does not run as expected, which makes Windows provisioning unusable.

Reproduction information

Vagrant version

Taking https://github.com/Orange-Cyberdefense/GOAD as an example deployment

Host operating system

Running on Arch Linux

Steps to reproduce

vagrant up on the provided repository fails

Vagrantfile

Vagrant.configure("2") do |config|

# Uncomment this depending on the provider you want to use
ENV['VAGRANT_DEFAULT_PROVIDER'] = 'virtualbox'
# ENV['VAGRANT_DEFAULT_PROVIDER'] = 'vmware_desktop'

boxes = [
  # windows server 2022 : don't work for now
  #{ :name => "DC01",  :ip => "192.168.56.10", :box => "StefanScherer/windows_2022", :box_version => "2021.08.23", :os => "windows"},
  # windows server 2019
  { :name => "DC01",  :ip => "192.168.56.10", :box => "StefanScherer/windows_2019", :box_version => "2021.05.15", :os => "windows"},
  # windows server 2019
  { :name => "DC02",  :ip => "192.168.56.11", :box => "StefanScherer/windows_2019", :box_version => "2021.05.15", :os => "windows"},
  # windows server 2016
  { :name => "DC03",  :ip => "192.168.56.12", :box => "StefanScherer/windows_2016", :box_version => "2017.12.14", :os => "windows"},
  # windows server 2019
  #{ :name => "SRV01", :ip => "192.168.56.21", :box => "StefanScherer/windows_2019", :box_version => "2020.07.17", :os => "windows"},
  # windows server 2019
  { :name => "SRV02", :ip => "192.168.56.22", :box => "StefanScherer/windows_2019", :box_version => "2020.07.17", :os => "windows"},
  # windows server 2016
  { :name => "SRV03", :ip => "192.168.56.23", :box => "StefanScherer/windows_2016", :box_version => "2019.02.14", :os => "windows"}
  # ELK
# { :name => "elk", :ip => "192.168.56.50", :box => "bento/ubuntu-18.04", :os => "linux",
#   :forwarded_port => [
#     {:guest => 22, :host => 2210, :id => "ssh"}
#   ]
# }
]

# BUILD with a full up to date vm if you don't want version with old vulns 
# ansible versions boxes : https://app.vagrantup.com/jborean93
# boxes = [
#   # windows server 2019
#   { :name => "DC01",  :ip => "192.168.56.10", :box => "jborean93/WindowsServer2019", :os => "windows"},
#   # windows server 2019
#   { :name => "DC02",  :ip => "192.168.56.11", :box => "jborean93/WindowsServer2019", :os => "windows"},
#   # windows server 2016
#   { :name => "DC03",  :ip => "192.168.56.12", :box => "jborean93/WindowsServer2016", :os => "windows"},
#   # windows server 2019
#   { :name => "SRV02", :ip => "192.168.56.22", :box => "jborean93/WindowsServer2019", :os => "windows"},
#   # windows server 2016
#   { :name => "SRV03", :ip => "192.168.56.23", :box => "jborean93/WindowsServer2016", :os => "windows"}
# ]

  config.vm.provider "virtualbox" do |v|
    v.memory = 4000
    v.cpus = 2
  end

  config.vm.provider "vmware_desktop" do |v|
    v.vmx["memsize"] = "4000"
    v.vmx["numvcpus"] = "2"
  end

  # disable rdp forwarded port inherited from StefanScherer box
  config.vm.network :forwarded_port, guest: 3389, host: 3389, id: "rdp", auto_correct: true, disabled: true

  # no autoupdate if vagrant-vbguest is installed
  if Vagrant.has_plugin?("vagrant-vbguest") then
    config.vbguest.auto_update = false
  end

  config.vm.boot_timeout = 600
  config.vm.graceful_halt_timeout = 600
  config.winrm.retry_limit = 30
  config.winrm.retry_delay = 10

  boxes.each do |box|
    config.vm.define box[:name] do |target|
      # BOX
      target.vm.provider "virtualbox" do |v|
        v.name = box[:name]
      end
      # box_download_insecure is a boolean flag; the box name string was only truthy by accident
      target.vm.box_download_insecure = true
      target.vm.box = box[:box]
      if box.has_key?(:box_version)
        target.vm.box_version = box[:box_version]
      end

      # issues/49
      target.vm.synced_folder '.', '/vagrant', disabled: true

      # IP
      target.vm.network :private_network, ip: box[:ip]

      # OS specific
      if box[:os] == "windows"
        target.vm.guest = :windows
        target.vm.communicator = "winrm"
        target.vm.provision :shell, :path => "vagrant/Install-WMF3Hotfix.ps1", privileged: false
        target.vm.provision :shell, :path => "vagrant/ConfigureRemotingForAnsible.ps1", privileged: false

        # fix ip for vmware
        if ENV['VAGRANT_DEFAULT_PROVIDER'] == "vmware_desktop"
          target.vm.provision :shell, :path => "vagrant/fix_ip.ps1", privileged: false, args: box[:ip]
        end

      else
        target.vm.communicator = "ssh"
      end

      if box.has_key?(:forwarded_port)
        # forwarded port explicit
        box[:forwarded_port].each do |forwarded_port|
          target.vm.network :forwarded_port, guest: forwarded_port[:guest], host: forwarded_port[:host], host_ip: "127.0.0.1", id: forwarded_port[:id]
        end
      end

    end
  end
end

soapy1 commented 1 year ago

Hey there, thanks for opening an issue and providing context with links to other related issues. Could I get a little more information about the problem? In particular, it would be helpful to get:

- the smallest Vagrantfile that reproduces the problem
- a debug log

Thanks!

rotorek commented 1 year ago

Hi, I'm getting the same error when running "vagrant up" after a system upgrade on openSUSE Tumbleweed. Vagrant: 2.3.4. Virtualization: libvirt.

I'm attaching the smallest Vagrantfile and a debug log.

Vagrant.configure("2") do |config|
  config.vm.box = "jborean93/WindowsServer2016"
  config.vm.provider "libvirt" do |vb|
    vb.memory = "4096"
    vb.cpus = 2
  end
end

vagrant.log

cwegener commented 1 year ago

Also happens on the current Fedora 37 with the same test box as @rotorek used:

Details

Vagrant version from Fedora 37 repo: 2.2.19
Ruby version: ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]

The relevant vagrant plugins need to be installed as the Fedora vagrant package does not include them:

vagrant plugin install winrm
vagrant plugin install winrm-fs
vagrant plugin install winrm-elevated

Output from vagrant plugin list

vagrant-libvirt (0.11.2, global)
winrm (2.3.6, global)
winrm-elevated (1.2.3, global)
winrm-fs (1.3.5, global)

vagrant up output:

Bringing machine 'default' up with 'libvirt' provider...
==> default: Checking if box 'jborean93/WindowsServer2016' version '1.1.0' is up to date...
==> default: Creating image (snapshot of base box volume).
==> default: Creating domain with the following settings...
==> default:  -- Name:              test_default
==> default:  -- Description:       Source: /home/fedora/test/Vagrantfile
==> default:  -- Domain type:       kvm
==> default:  -- Cpus:              2
==> default:  -- Feature:           acpi
==> default:  -- Feature:           apic
==> default:  -- Feature:           pae
==> default:  -- Feature (HyperV):  name=relaxed, state=on
==> default:  -- Feature (HyperV):  name=spinlocks, state=on, retries=8191
==> default:  -- Feature (HyperV):  name=vapic, state=on
==> default:  -- Clock offset:      localtime
==> default:  -- Clock timer:       name=hypervclock, present=yes
==> default:  -- Memory:            4096M
==> default:  -- Base box:          jborean93/WindowsServer2016
==> default:  -- Storage pool:      default
==> default:  -- Image(vda):        /var/lib/libvirt/images/test_default.img, virtio, 40G
==> default:  -- Disk driver opts:  cache='default'
==> default:  -- Graphics Type:     vnc
==> default:  -- Video Type:        qxl
==> default:  -- Video VRAM:        16384
==> default:  -- Video 3D accel:    false
==> default:  -- Keymap:            en-us
==> default:  -- TPM Backend:       passthrough
==> default:  -- INPUT:             type=tablet, bus=usb
==> default: Creating shared folders metadata...
==> default: Starting domain.
==> default: Domain launching with graphics connection settings...
==> default:  -- Graphics Port:      5901
==> default:  -- Graphics IP:        127.0.0.1
==> default:  -- Graphics Password:  Not defined
==> default:  -- Graphics Websocket: 5701
==> default: Waiting for domain to get an IP address...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: WinRM address: 192.168.121.65:5985
    default: WinRM username: vagrant
    default: WinRM execution_time_limit: PT2H
    default: WinRM transport: negotiate
==> default: Removing domain...
==> default: Deleting the machine folder
An error occurred executing a remote WinRM command.

Shell: Cmd
Command: hostname
Message: Digest initialization failed: initialization error

debug.txt

Since we're using his Vagrant box as a test case, I'll just ping @jborean93 here to see if he can lend his brain power to the issue. :smiley:

jborean93 commented 1 year ago

Without looking too much into it, I would guess it's a problem with NTLM and the underlying Ruby library that Vagrant uses for NTLM authentication. NTLM relies on an older hash algorithm, MD4, which can be disabled on newer OpenSSL versions. For example, on newer hosts this fails in Python because OpenSSL no longer enables it by default:

$ python -c "import hashlib; hashlib.new('md4')"

Traceback (most recent call last):
  File "/home/jborean/.pyenv/versions/3.11.0/lib/python3.11/hashlib.py", line 160, in __hash_new
    return _hashlib.new(name, data, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: [digital envelope routines] unsupported

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/jborean/.pyenv/versions/3.11.0/lib/python3.11/hashlib.py", line 166, in __hash_new
    return __get_builtin_constructor(name)(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jborean/.pyenv/versions/3.11.0/lib/python3.11/hashlib.py", line 123, in __get_builtin_constructor
    raise ValueError('unsupported hash type ' + name)
ValueError: unsupported hash type md4

Unfortunately, unless the underlying Ruby library implements its own MD4 hashing function, it is reliant on what OpenSSL can do. There might be system-wide policies you can set for OpenSSL to allow MD4, but I'm unsure whether that is done at runtime or whether it depends on how an application is compiled against OpenSSL.
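The same check can be sketched in Ruby, which is the runtime Vagrant's WinRM stack uses. This is a minimal sketch, not part of Vagrant or the WinRM gem: `digest_available?` is a hypothetical helper name, and it rescues `StandardError` broadly on the assumption that the exception class may differ between an unknown digest name and a known but disabled one.

```ruby
require "openssl"

# Hypothetical helper: returns true if this Ruby's OpenSSL can construct
# the named digest, false if construction raises for any reason.
def digest_available?(name)
  OpenSSL::Digest.new(name)
  true
rescue StandardError
  false
end

puts "OpenSSL library:  #{OpenSSL::OPENSSL_LIBRARY_VERSION}"
puts "MD4 available:    #{digest_available?('MD4')}"
puts "SHA256 available: #{digest_available?('SHA256')}"
```

If MD4 is reported as unavailable while SHA256 is available, the NTLM failure above is most likely the OpenSSL configuration rather than anything in the Vagrantfile. Run it with the same Ruby that Vagrant uses, since a system Ruby may be linked against a different OpenSSL.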

A workaround is to use basic auth. That is bad, but unfortunately, without NTLM it is the only other authentication option available for local accounts (excluding cert auth).
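In a Vagrantfile, the basic auth workaround might look roughly like the sketch below. It relies on Vagrant's `config.winrm.transport` and `config.winrm.basic_auth_only` settings; `use_winrm_basic_auth` is a hypothetical helper name introduced only for illustration. It also assumes the guest's WinRM service already accepts Basic auth over unencrypted HTTP, which depends on how the box was built.

```ruby
# Hypothetical helper: switches a Vagrant config object from negotiate (NTLM)
# to plaintext Basic auth. Credentials then cross the network unencrypted,
# so this is only reasonable on an isolated lab network.
def use_winrm_basic_auth(config)
  config.vm.communicator = "winrm"
  config.winrm.transport = :plaintext
  config.winrm.basic_auth_only = true
end

# Only runs when Vagrant itself loads this file; skipped in plain Ruby.
if defined?(Vagrant)
  Vagrant.configure("2") do |config|
    use_winrm_basic_auth(config)
  end
end
```

This avoids MD4 entirely because Basic auth does not use NTLM hashing, at the cost of sending the password in an easily decoded form.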

cwegener commented 1 year ago

Thanks Jordan! I think that's the right path to investigate. It seems to be an OpenSSL 3.0 thing, perhaps:

https://github.com/WinRb/WinRM/issues/340

cwegener commented 1 year ago

Well, enabling MD4 in OpenSSL 3.0 via the legacy provider works. I made a custom local copy of openssl.cnf and set the OPENSSL_CONF env var to point to it. That makes MD4 available again, and the winrm transport can use negotiate (NTLM) auth again. Thanks @jborean93 for the assist :100:
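One way to confirm that a custom config file has the intended effect before running vagrant up is to test MD4 in a fresh Ruby process with OPENSSL_CONF set. The sketch below assumes OpenSSL reads its config when the library initializes, so it uses a child process instead of changing the environment of an already running one. `md4_available_with_conf?` is a hypothetical helper name.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: starts a new Ruby interpreter with OPENSSL_CONF pointing
# at conf_path and reports whether that process can construct an MD4 digest.
def md4_available_with_conf?(conf_path)
  env = { "OPENSSL_CONF" => conf_path }
  script = "OpenSSL::Digest.new('MD4')"
  # capture2e returns the combined stdout/stderr text and the exit status.
  _output, status = Open3.capture2e(env, RbConfig.ruby, "-ropenssl", "-e", script)
  status.success?
end

# Usage: ruby check_md4_conf.rb /path/to/custom/openssl.cnf
puts md4_available_with_conf?(ARGV[0]) if ARGV.size == 1
```

The script name in the usage comment is only a placeholder. A result of true suggests the legacy provider was loaded from the given file.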

cwegener commented 1 year ago

On Fedora, the openssl package even ships a convenient template stanza in the system-wide /etc/ssl/openssl.cnf file for enabling the legacy providers. I guess that RHEL 9 and CentOS 9 probably share the same package.

https://src.fedoraproject.org/rpms/openssl/blob/rawhide/f/0024-load-legacy-prov.patch

cwegener commented 1 year ago

@rotorek Here's what I did on Fedora 37. Not sure what the contents of the openssl.cnf file look like on Tumbleweed.

1. Create a copy of the system-wide /etc/ssl/openssl.cnf
2. Modify the copy of the openssl.cnf file as per the patch below (the patch is Fedora/RHEL specific)
3. Set the OPENSSL_CONF variable to load the openssl config from the modified version: export OPENSSL_CONF=/home/fedora/openssl.cnf
4. Run vagrant up

--- /etc/ssl/openssl.cnf        2023-02-09 16:16:42.000000000 +0000
+++ /home/fedora/openssl.cnf    2023-02-22 03:05:49.762547873 +0000
@@ -57,14 +57,14 @@
 # to side-channel attacks and as such have been deprecated.

 [provider_sect]
-##default = default_sect
-##legacy = legacy_sect
-##
-##[default_sect]
-##activate = 1
-##
-##[legacy_sect]
-##activate = 1
+default = default_sect
+legacy = legacy_sect
+
+[default_sect]
+activate = 1
+
+[legacy_sect]
+activate = 1

 [ ssl_module ]
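The copy-and-modify steps above could also be scripted. The following is a minimal sketch that assumes the file has the same layout as in the patch: a `[provider_sect]` header followed by a `##`-commented stanza. `enable_legacy_provider` and `PROVIDER_SECTIONS` are hypothetical names, and the function only strips the leading `##` inside the provider-related sections.

```ruby
# Section names that belong to the provider stanza shown in the patch.
PROVIDER_SECTIONS = %w[provider_sect default_sect legacy_sect].freeze

# Hypothetical helper: returns conf_text with the "##" prefix removed from
# every line inside the provider-related sections; other lines are untouched.
def enable_legacy_provider(conf_text)
  in_provider_block = false
  conf_text.each_line.map do |line|
    # Detect a section header, whether or not it is still commented out with "##".
    section = line.strip[/\A(?:##)?\[\s*(\w+)\s*\]\z/, 1]
    in_provider_block = PROVIDER_SECTIONS.include?(section) if section
    in_provider_block ? line.sub(/\A##/, "") : line
  end.join
end

# Usage: ruby enable_legacy.rb /etc/ssl/openssl.cnf /home/fedora/openssl.cnf
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  source_path, target_path = ARGV
  File.write(target_path, enable_legacy_provider(File.read(source_path)))
end
```

The script name is a placeholder. The system-wide file is only read, never written, so the change stays limited to processes that export OPENSSL_CONF pointing at the copy.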