ocp-power-automation / ocp4-upi-powervm

OpenShift on IBM PowerVM servers managed using PowerVC
Apache License 2.0
27 stars 51 forks

Add support for LUKS encryption for OCP4 #259

Closed gauravpbankar closed 1 year ago

gauravpbankar commented 1 year ago

This feature adds LUKS encryption variables and updates the install modules to set them up. The MachineConfig is applied automatically as part of the ocp4-playbooks when LUKS is enabled.

The e2e tests we have completed are: apply, verify LUKS status, and destroy, with and without LUKS enabled.
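For reference, the feature is driven from var.tfvars; a minimal sketch of enabling it, assuming the variable names that appear later in this thread (luks_compliant, luks_config) and placeholder Tang values:

```hcl
# Hedged sketch of enabling LUKS in var.tfvars. Variable names are taken
# from this thread (luks_compliant, luks_config); the thumbprint and URL
# below are placeholders, not real values.
luks_compliant = true
luks_config = [
  {
    thumbprint = "<tang-key-thumbprint>"
    url        = "http://<tang-server>:80"
  }
]
```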

Please find attached a sample result from the execution:

1. luks-enabled

a. apply

[root@stocked1 ocp4-upi-powervmp9]# terraform output
bastion_ip = "*.*.*.*"
bastion_ssh_command = "ssh root@*.*.*.*"
bootstrap_ip = "*.*.*.*"
cluster_id = "ocp-1711-68be"
etc_hosts_entries = <<EOT

*.*.*.* api.ocp-1711-68be.ibm.com console-openshift-console.apps.ocp-1711-68be.ibm.com integrated-oauth-server-openshift-authentication.apps.ocp-1711-68be.ibm.com oauth-openshift.apps.ocp-1711-68be.ibm.com prometheus-k8s-openshift-monitoring.apps.ocp-1711-68be.ibm.com grafana-openshift-monitoring.apps.ocp-1711-68be.ibm.com example.apps.ocp-1711-68be.ibm.com

EOT
install_status = "COMPLETED"
master_ips = [
  "*.*.*.*",
  "*.*.*.*",
  "*.*.*.*",
]
oc_server_url = "https://api.ocp-1711-68be.ibm.com:6443"
storageclass_name = "nfs-storage-provisioner"
web_console_url = "https://console-openshift-console.apps.ocp-1711-68be.ibm.com"
worker_ips = [
  "*.*.*.*",
  "*.*.*.*",
]

b. verify

Last login: Mon Nov 21 06:59:52 2022 from *.*.*.*
[core@master-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdc4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-1 ~]$ sudo clevis luks list -d  /dev/sdc4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'
[core@master-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdd4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-0 ~]$ sudo clevis luks list -d  /dev/sdc4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'
[core@master-0 ~]$ sudo clevis luks list -d  /dev/sdd4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'

[core@master-2 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdd4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-2 ~]$ sudo clevis luks list -d  /dev/sdd4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'

Workers:
[core@worker-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdb4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@worker-1 ~]$ sudo clevis luks list -d  /dev/sdb4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'

[core@worker-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdb4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@worker-0 ~]$ sudo clevis luks list -d  /dev/sdb4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:*"}]}}'
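The per-node checks above can also be scripted; a minimal sketch that inspects a captured `cryptsetup status` report for an active LUKS2 mapping. In practice the report would come from `ssh core@<node> sudo cryptsetup status root`; here a sample is inlined, and the helper name `check_luks_status` is illustrative only.

```shell
# Hedged sketch: verify a captured `cryptsetup status` report shows an
# active LUKS2 root mapping. The sample report is inlined for illustration.
check_luks_status() {
  report="$1"
  # Require both an active mapping and the LUKS2 type line.
  if printf '%s\n' "$report" | grep -q 'is active' &&
     printf '%s\n' "$report" | grep -q 'type: *LUKS2'; then
    echo "LUKS2 active"
  else
    echo "LUKS not active"
  fi
}

sample='/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256'

check_luks_status "$sample"   # prints "LUKS2 active"
```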

c. destroy completed successfully

2. luks-disabled

a. apply

Apply complete! Resources: 4 added, 0 changed, 1 destroyed.

Outputs:

bastion_private_ip = "*.*.*.*"
bastion_public_ip = "*.*.*.*"
bastion_ssh_command = "ssh -i data/id_rsa root@*.*.*.*"
bootstrap_ip = "*.*.*.*"
cluster_authentication_details = "Cluster authentication details are available in *.*.*.* under ~/openstack-upi/auth"
cluster_id = "luks-1611-bcba"
dns_entries = <<EOT

api.****.com.  IN  A  *.*.*.*
*.****.ibm.com.  IN  A  *.*.*.*

EOT
etc_hosts_entries = <<EOT

*.*.*.* api.****.com console-openshift-console.****.ibm.com integrated-oauth-server-openshift-authentication.****.ibm.com oauth-openshift.****.ibm.com prometheus-k8s-openshift-monitoring.****.ibm.com grafana-openshift-monitoring.****.ibm.com example.****.ibm.com

EOT
install_status = "COMPLETED"
master_ips = [
  "*.*.*.*",
  "*.*.*.*",
  "*.*.*.*",
]
name_prefix = "luks-1611-bcba-syd05-"
oc_server_url = "https://api.****.com:6443"
storageclass_name = "nfs-storage-provisioner"
web_console_url = "https://console-openshift-console.****.ibm.com"
worker_ips = [
  "*.*.*.*",
  "*.*.*.*",
]

b. verify

Masters:

[core@syd05-master-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@syd05-master-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@syd05-master-2 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.

Workers:

[core@syd05-worker-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@syd05-worker-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.

c. destroy completed successfully

Note: we discussed updating install_playbook_tag to a new default for testing. I have left it the same as origin. Please let me know if you want me to update the tag.

Signed-off-by: Gaurav Bankar Gaurav.Bankar@ibm.com

ppc64le-cloud-bot commented 1 year ago

Welcome @gauravpbankar! It looks like this is your first PR to ocp-power-automation/ocp4-upi-powervm 🎉

ppc64le-cloud-bot commented 1 year ago

Hi @gauravpbankar. Thanks for your PR.

I'm waiting for an ocp-power-automation member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
gauravpbankar commented 1 year ago

We raised a similar PR for PowerVS and it was merged. Link for the same: https://github.com/ocp-power-automation/ocp4-upi-powervs/pull/449

Prajyot-Parab commented 1 year ago

/cc @yussufsh

yussufsh commented 1 year ago

/ok-to-test

yussufsh commented 1 year ago

@gauravpbankar you say you have executed and given the results. But the code is failing to validate itself. Not sure if I can trust the results given. @gauravpbankar @aditijadhav38 please do a complete E2E install using terraform apply with LUKS enabled and disabled, then only ask for review. /cc @prb112

gauravpbankar commented 1 year ago

After adding the required changes, all tests pass.

@gauravpbankar you say you have executed and given the results. But the code is failing to validate itself. Not sure if I can trust the results given. @gauravpbankar @aditijadhav38 please do a complete E2E install using terraform apply with LUKS enabled and disabled, then only ask for review. /cc @prb112

@yussufsh Yes, we tested with all the required changes, but the PR was raised from a new repo where those changes were missed. We will now take this commit, test again with the required changes, and share the results here for your review. We will update you once the deployment with the required changes is complete.

gauravpbankar commented 1 year ago

We have tested it successfully; can you please look into it:

1. luks-enabled

a. apply

Apply complete! Resources: 32 added, 0 changed, 0 destroyed.

Outputs:

bastion_ip = "*.*.*.*"
bastion_ssh_command = "ssh root@*.*.*.*"
bootstrap_ip = "*.*.*.*"
cluster_id = "ocp-2211-d1e1"
etc_hosts_entries = <<EOT

*.*.*.* api.ocp-2211-d1e1.ibm.com console-openshift-console.apps.ocp-2211-d1e1.ibm.com integrated-oauth-server-openshift-authentication.apps.ocp-2211-d1e1.ibm.com oauth-openshift.apps.ocp-2211-d1e1.ibm.com prometheus-k8s-openshift-monitoring.apps.ocp-2211-d1e1.ibm.com grafana-openshift-monitoring.apps.ocp-2211-d1e1.ibm.com example.apps.ocp-2211-d1e1.ibm.com

EOT
install_status = "COMPLETED"
master_ips = [
  "*.*.*.*",
  "*.*.*.*",
  "*.*.*.*",
]
oc_server_url = "https://api.ocp-2211-d1e1.ibm.com:6443/"
storageclass_name = "nfs-storage-provisioner"
web_console_url = "https://console-openshift-console.apps.ocp-2211-d1e1.ibm.com/"
worker_ips = [
  "*.*.*.*",
  "*.*.*.*",
]

b. verify

[core@master-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sda4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-0 ~]$ sudo clevis luks list -d  /dev/sda4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:80"}]}}'
[core@master-0 ~]$

[core@master-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdb4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-1 ~]$ sudo clevis luks list -d  /dev/sda4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:80"}]}}'
[core@master-1 ~]$

[core@master-2 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdd4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@master-2 ~]$ sudo clevis luks list -d  /dev/sda4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:80"}]}}'
[core@worker-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdb4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@worker-0 ~]$  sudo clevis luks list -d  /dev/sda4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:80"}]}}'

[core@worker-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdb4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write
[core@worker-1 ~]$ sudo clevis luks list -d  /dev/sda4
1: sss '{"t":1,"pins":{"tang":[{"url":"http://*.*.*.*:80"}]}}'

2. luks-disabled

a. apply

[root@stocked1 ocp4-upi-powervm]# date
Tue Nov 22 04:57:55 PST 2022

Apply complete! Resources: 32 added, 0 changed, 0 destroyed.

Outputs:

bastion_ip = "*.*.*.*"
bastion_ssh_command = "ssh root@*.*.*.*"
bootstrap_ip = "*.*.*.*"
cluster_id = "ocp-2211-f9cd"
etc_hosts_entries = <<EOT

*.*.*.* api.ocp-2211-f9cd.ibm.com console-openshift-console.apps.ocp-2211-f9cd.ibm.com integrated-oauth-server-openshift-authentication.apps.ocp-2211-f9cd.ibm.com oauth-openshift.apps.ocp-2211-f9cd.ibm.com prometheus-k8s-openshift-monitoring.apps.ocp-2211-f9cd.ibm.com grafana-openshift-monitoring.apps.ocp-2211-f9cd.ibm.com example.apps.ocp-2211-f9cd.ibm.com

EOT
install_status = "COMPLETED"
master_ips = [
  "*.*.*.*",
  "*.*.*.*",
  "*.*.*.*",
]
oc_server_url = "https://api.ocp-2211-f9cd.ibm.com:6443/"
storageclass_name = "nfs-storage-provisioner"
web_console_url = "https://console-openshift-console.apps.ocp-2211-f9cd.ibm.com/"
worker_ips = [
  "*.*.*.*",
  "*.*.*.*",
]

b. verify

[core@master-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@master-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@master-2 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@worker-0 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
[core@worker-1 ~]$ sudo cryptsetup status root
/dev/mapper/root is inactive.
prb112 commented 1 year ago

@gauravpbankar you say you have executed and given the results. But the code is failing to validate itself. Not sure if I can trust the results given. @gauravpbankar @aditijadhav38 please do a complete E2E install using terraform apply with LUKS enabled and disabled, then only ask for review. /cc @prb112

@yussufsh thank you for alerting me. I've sent you a message on the side and I'll work on improving the software engineering and code hygiene.

prb112 commented 1 year ago

@gauravpbankar For Your Action @yussufsh For Your Information

  1. Check the formatting:
$ terraform fmt -recursive -check
modules/5_install/5_1_installconfig/installconfig.tf

It shows that the file has a formatting violation.

  2. Run the fix: terraform fmt -recursive
diff --git a/modules/5_install/5_1_installconfig/installconfig.tf b/modules/5_install/5_1_installconfig/installconfig.tf
index 5d59d76..d4169a7 100644
--- a/modules/5_install/5_1_installconfig/installconfig.tf
+++ b/modules/5_install/5_1_installconfig/installconfig.tf
@@ -72,7 +72,7 @@ locals {
     service_network            = var.service_network
     # Set CNI network MTU to MTU - 100 for OVNKubernetes and MTU - 50 for OpenShiftSDN(default).
     # Add new conditions here when we have more network providers
-    cni_network_mtu = var.cni_network_provider == "OVNKubernetes" ? var.private_network_mtu - 100 : var.private_network_mtu - 50
+    cni_network_mtu        = var.cni_network_provider == "OVNKubernetes" ? var.private_network_mtu - 100 : var.private_network_mtu - 50
     luks_compliant         = var.luks_compliant
     luks_config            = var.luks_config
     luks_filesystem_device = var.luks_filesystem_device
@@ -83,7 +83,7 @@ locals {
     luks_options           = var.luks_options
     luks_wipeVolume        = var.luks_wipeVolume
     luks_name              = var.luks_name
-}
+  }

Note, I did check tflint and nothing popped up related to the additions.

  3. Run terraform init and confirm it completes successfully; `echo $?` should be run after the command, and it should return zero.

  4. Run terraform validate and confirm there are no warnings and no errors.

$ terraform validate -no-color -json
{
  "format_version": "1.0",
  "valid": true,
  "error_count": 0,
  "warning_count": 0,
  "diagnostics": []
}
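The validate output above can also be gated mechanically; a minimal sketch, with the JSON inlined from the sample rather than captured from a live terraform run, and grep-based field matching used purely for illustration:

```shell
# Hedged sketch: gate on `terraform validate -no-color -json` output.
# The JSON below is inlined from the sample above; in CI it would be
# captured from the real command, and a JSON parser would be more robust
# than grep.
validate_json='{"format_version":"1.0","valid":true,"error_count":0,"warning_count":0,"diagnostics":[]}'

if printf '%s' "$validate_json" | grep -q '"valid":true' &&
   printf '%s' "$validate_json" | grep -q '"error_count":0' &&
   printf '%s' "$validate_json" | grep -q '"warning_count":0'; then
  result=PASS
else
  result=FAIL
fi
echo "$result"   # prints "PASS"
```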

Please execute step 2 above; the file needs proper formatting.

I'm currently running through QA tests.

prb112 commented 1 year ago

@gauravpbankar For Your Action @yussufsh For Your Information

Mea culpa... Gaurav had the changes on two branches. I was on the main branch, and have now verified on the luks_encryption branch.

I am continuing the QE.

prb112 commented 1 year ago

I've run through the QE.

  1. Without LUKS - I confirmed the current behavior is consistent and it exited cleanly with a working cluster. LUKS is not active.

TF_LOG=debug terraform apply -no-color -var-file=data/var.tfvars -auto-approve -parallelism=2 2>&1 | tee t-apply-no-luks.log

I checked the node status and it is correct.

$ sudo cryptsetup status root
/dev/mapper/root is inactive.

The destroy command also worked as expected.

  2. With LUKS - I confirmed that LUKS is enabled and the apply exited cleanly with a working cluster. LUKS is active.

I configured the var.tfvars.

luks_compliant              = true
luks_config                 = [ { thumbprint = "********-******", url = "http://9.****:80" } ]

TF_LOG=debug terraform apply -no-color -var-file=data/var.tfvars -auto-approve -parallelism=2 2>&1 | tee t-apply-no-luks.log

I checked the node crypto status and it is correct.

/dev/mapper/root is active and is in use.
  type:    LUKS2
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: keyring
  device:  /dev/sdc4
  sector size:  512
  offset:  32768 sectors
  size:    250826719 sectors
  mode:    read/write

The destroy command also worked as expected.

Prajyot-Parab commented 1 year ago

Thanks @prb112 /lgtm

ppc64le-cloud-bot commented 1 year ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gauravpbankar, Prajyot-Parab, yussufsh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/ocp-power-automation/ocp4-upi-powervm/blob/main/OWNERS)~~ [yussufsh]

Approvers can indicate their approval by writing `/approve` in a comment. Approvers can cancel approval by writing `/approve cancel` in a comment.