oracle / vagrant-projects

Vagrant projects for Oracle products and other examples
Universal Permissive License v1.0

OL8U7: fails with firewalld IP masquerading #460

Closed hussam-qasem closed 1 year ago

hussam-qasem commented 1 year ago

My Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "oraclelinux/8"
  config.vm.box_url = "https://oracle.github.io/vagrant-projects/boxes/oraclelinux/8.json"
end

After doing vagrant up & vagrant ssh:

[vagrant@localhost ~]$ sudo systemctl enable --now firewalld
Created symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service → /usr/lib/systemd/system/firewalld.service.
Created symlink /etc/systemd/system/multi-user.target.wants/firewalld.service → /usr/lib/systemd/system/firewalld.service.

[vagrant@localhost ~]$ sudo firewall-cmd --add-masquerade --permanent
success

[vagrant@localhost ~]$ sudo firewall-cmd --query-masquerade
no

[vagrant@localhost ~]$ sudo firewall-cmd --add-masquerade
Error: COMMAND_FAILED: 'python-nftables' failed: internal:0:0-0: Error: Could not process rule: No such file or directory

JSON blob:
{
  "nftables": [
    {
      "metainfo": {
        "json_schema_version": 1
      }
    },
    {
      "add": {
        "rule": {
          "family": "ip",
          "table": "firewalld",
          "chain": "nat_POST_public_allow",
          "expr": [
            {
              "match": {
                "left": {
                  "meta": {
                    "key": "oifname"
                  }
                },
                "op": "!=",
                "right": "lo"
              }
            },
            {
              "masquerade": null
            }
          ]
        }
      }
    },
    {
      "add": {
        "rule": {
          "family": "inet",
          "table": "firewalld",
          "chain": "filter_FWDO_public_allow",
          "expr": [
            {
              "match": {
                "left": {
                  "ct": {
                    "key": "state"
                  }
                },
                "op": "in",
                "right": {
                  "set": [
                    "new",
                    "untracked"
                  ]
                }
              }
            },
            {
              "accept": null
            }
          ]
        }
      }
    }
  ]
}

AmedeeBulle commented 1 year ago

This is an interesting one...

The reason it fails is that the netfilter module nft_masq is not installed.

As of UEK7, we split the kernel into two parts: kernel-uek-core and kernel-uek-modules. kernel-uek-core is supposed to have all the necessary parts to run a cloud instance; kernel-uek-modules has all the rest.

It appears that kernel-uek-core has only a subset of the netfilter modules; to get masquerading you need kernel-uek-modules.

Running

dnf install kernel-uek-modules kernel-uek-modules-$(uname -r)

should solve the issue...

(See Changes to UEK Content Distribution and Packaging)
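
To confirm the diagnosis before installing anything, one could check whether a module file for nft_masq exists for the running kernel. The following is only a rough sketch: the function name module_present and the MODULES_ROOT override are hypothetical (the override exists so the check can be tried outside a VM), and it assumes modules live under /lib/modules/<release> as .ko or compressed .ko.* files. A module compiled into the kernel itself would not be found this way.

```shell
# Sketch: succeed if a module file named <name>.ko* exists for a kernel release.
# module_present and MODULES_ROOT are illustrative names, not part of any tool.
module_present() {
  name="$1"
  release="${2:-$(uname -r)}"          # default to the running kernel
  root="${MODULES_ROOT:-/lib/modules}" # hypothetical override for testing
  # find prints matching paths; grep -q . succeeds only if there was output
  find "$root/$release" -name "${name}.ko*" -print 2>/dev/null | grep -q .
}
```

Inside the VM, `module_present nft_masq || echo "nft_masq missing"` would then report whether the module is absent for the running kernel.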

PaulNeumann commented 1 year ago

@AmedeeBulle

Thank you! This also explains a seemingly unrelated issue. For VMs created from the OL8U7 base box, the VirtualBox Guest Additions break whenever the kernel is updated, and manually rebuilding the GAs also fails. Running dnf install -y kernel-uek-modules-$(uname -r) resolves the problem.

Would it be feasible to include kernel-uek-modules (and the linux-firmware dependency) in future versions of the OL8 base box for the VirtualBox provider? If not, would you consider a PR that adds installing the package to the install.sh scripts for the projects that use the OL8 base box?
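
For illustration only, an install.sh addition along these lines might look like the sketch below. The function names uek_modules_pkg and ensure_uek_modules are hypothetical, and the snippet assumes it runs as root on an OL8 guest where rpm and dnf are available.

```shell
# Sketch: build the package name matching a kernel release
# (defaults to the running kernel). Function names are illustrative.
uek_modules_pkg() {
  echo "kernel-uek-modules-${1:-$(uname -r)}"
}

# Install the package only when rpm does not already report it as installed,
# so re-running the provisioning script is harmless.
ensure_uek_modules() {
  pkg=$(uek_modules_pkg "$1")
  rpm -q "$pkg" >/dev/null 2>&1 || dnf install -y "$pkg"
}
```

Calling `ensure_uek_modules` with no argument targets the kernel that is currently running, which matters because the package is versioned per kernel release.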

AmedeeBulle commented 1 year ago

@PaulNeumann

Can you confirm the rebuild fails for the latest OL8U7 box (v8.7.377)? It passes in our test suite, but I might be missing something.

Rebuilding VB 7.x GA on OL8UEK7 requires GCC toolset 11, but rcvboxadd doesn't have the path properly set up (Orabug 34811820).

UEK7 should make our life easier, as the GA are now part of the kernel... except that the drivers are in kernel-uek-modules as well! We have Orabug 34820755 open to move the GA into kernel-uek-core.

All this is still in flux, but I have a strong preference to keep the base images small by keeping the modules and the firmware out of the box, unless absolutely required. If that doesn't work, we will add the modules to the box; having to install these in every provisioning script defeats the purpose.

PaulNeumann commented 1 year ago

@AmedeeBulle

Yes, the GA rebuild fails for v8.7.377 of the OL8U7 box. I used the following Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "oraclelinux/8"
  config.vm.box_url = "https://oracle.github.io/vagrant-projects/boxes/oraclelinux/8.json"
  config.vm.box_version = "8.7.377"
end

Running vagrant up, vagrant ssh, and (inside the VM) dnf -y upgrade upgrades the kernel from 5.15.0-3.60.5.1.el8uek.x86_64 to 5.15.0-5.76.5.1.el8uek.x86_64.

After exiting the ssh session, vagrant reload results in the following output:

==> default: Attempting graceful shutdown of VM...
==> default: Checking if box 'oraclelinux/8' version '8.7.377' is up to date...
==> default: Clearing any previously set forwarded ports...
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
==> default: Machine booted and ready!
==> default: Checking for guest additions in VM...
    default: No guest additions were detected on the base box for this VM! Guest
    default: additions are required for forwarded ports, shared folders, host only
    default: networking, and more. If SSH fails on this machine, please install
    default: the guest additions and repackage the box to continue.
    default:
    default: This is not an error message; everything may continue to work properly,
    default: in which case you may ignore this message.
==> default: Mounting shared folders...
    default: /vagrant => C:/Users/neumannp/Desktop/test
Vagrant was unable to mount VirtualBox shared folders. This is usually
because the filesystem "vboxsf" is not available. This filesystem is
made available via the VirtualBox Guest Additions and kernel module.
Please verify that these guest additions are properly installed in the
guest. This is not a bug in Vagrant and is usually caused by a faulty
Vagrant box. For context, the command attempted was:

mount -t vboxsf -o uid=1000,gid=1000,_netdev vagrant /vagrant

The error output from the command was:

/sbin/mount.vboxsf: mounting failed with the error: No such device

And running lsmod | grep vbox inside the VM shows no output.
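
The same check can be scripted. lsmod takes its information from /proc/modules, so a minimal sketch (the function name vbox_modules_loaded is hypothetical, and the optional file argument exists only so the check can be tried against sample data) could be:

```shell
# Sketch: succeed if any loaded module name starts with "vbox"
# (for example vboxguest or vboxsf). The optional argument is an
# illustrative override for /proc/modules.
vbox_modules_loaded() {
  grep -q '^vbox' "${1:-/proc/modules}"
}
```

In the failing VM described above, `vbox_modules_loaded || echo "no vbox modules loaded"` would print the message.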

Running dnf -y install kernel-uek-modules, then /sbin/rcvboxadd setup, then reloading the VM fixes the problem.

I understand wanting to keep the box as small as possible. Since a workaround exists, it might be better to wait until the internal issues are resolved before making any changes.

As always, thank you for your time and expertise.

AmedeeBulle commented 1 year ago

@PaulNeumann

TL;DR -- The following will solve the GA rebuild issue:

cp /sbin/rcvboxadd /opt/VBoxGuestAdditions-7.0.4/init/vboxadd

Explanation: I have modified /sbin/rcvboxadd to work around Orabug 34811820, which prevents the GA rebuild (it uses GCC toolset v8, while UEK7 requires v11). However, the vboxadd service, which ensures modules are present at boot time, uses the copy of the script at /opt/VBoxGuestAdditions-7.0.4/init/vboxadd!

But in any case you don't need kernel-uek-modules to rebuild the GA.

(I missed that because in the test suite I only check that the GA can be rebuilt with /sbin/rcvboxadd; at build time I can never upgrade, as boxes are built with the latest kernel...)
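
A slightly more cautious variant of the cp above would copy only when the two scripts differ and keep a backup of the old copy. This is a sketch under assumptions: the function name sync_script and the .orig backup suffix are hypothetical, and it would need to run as root inside the VM.

```shell
# Sketch: copy src over dst only when their contents differ,
# saving the previous dst as dst.orig. Names are illustrative.
sync_script() {
  src="$1"
  dst="$2"
  if cmp -s "$src" "$dst"; then
    echo "unchanged"
  else
    if [ -e "$dst" ]; then
      cp -p "$dst" "$dst.orig"   # keep the old script for reference
    fi
    cp -p "$src" "$dst" && echo "updated"
  fi
}
```

With the paths from this thread, the call would be `sync_script /sbin/rcvboxadd /opt/VBoxGuestAdditions-7.0.4/init/vboxadd`.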

PaulNeumann commented 1 year ago

@AmedeeBulle

This works perfectly. Thank you!

hussam-qasem commented 1 year ago

dnf install kernel-uek-modules kernel-uek-modules-$(uname -r) fixes the problem.