nagajagan closed this issue 2 years ago
As the error itself shows, the ingress is serving the FAKE certificate, which essentially means the certs were not generated by promenade during installation. If the ingress is not provided with valid internal certs generated by the commands below, the ingress FQDN will resolve to the fake cert and the installation will not behave as expected.
mkdir ${NEW_SITE}_certs
sudo tools/airship promenade generate-certs \
  -o /target/${NEW_SITE}_certs /target/${NEW_SITE}_collected/*.yaml

mkdir -p site/${NEW_SITE}/secrets/certificates
sudo cp ${NEW_SITE}_certs/certificates.yaml \
  site/${NEW_SITE}/secrets/certificates/certificates.yaml
In site/xxxxx/secrets/certificates/ingress.yaml, ingress-crt-site needs to have the following content, which should solve the problem.
-----BEGIN CERTIFICATE-----
Ingress Certificates
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
Intermediate Certificate
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
Root certificate
-----END CERTIFICATE-----
Just to clarify for future readers: the full cert chain must be installed in ingress.yaml. If it is not properly installed, calls from the client to the ingress will fail with SSL code 21, which means the certificate could not be verified. Please check the public certs in the ingress definition for the corresponding services.
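As a quick sanity check before redeploying, the number of PEM blocks in the bundle can be counted. This is only a minimal sketch: the function names are hypothetical, the default of three blocks (ingress, intermediate, root) is taken from the example chain in this thread, and it does not validate signatures, ordering, or expiry.

```python
import re
from typing import List

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(bundle: str) -> List[str]:
    """Return each certificate block found in the bundle, in file order."""
    return PEM_BLOCK.findall(bundle)


def has_full_chain(bundle: str, expected_blocks: int = 3) -> bool:
    """Check the block count only; signatures and ordering are not verified."""
    return len(split_pem_bundle(bundle)) >= expected_blocks
```

A bundle containing only the ingress certificate would return False here, which matches the situation where clients cannot verify the certificate.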
Including the certificate chain in ingress.yaml didn't solve the drydock connectivity problem. It only solved the shipyard connectivity problem.
Starting from 2.7, the DNS entry for drydock should resolve to ingress-nc, not ingress-uc. Please correct the DNS entry and that should fix this.
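One way to confirm the DNS entry after changing it is to compare what the hostname resolves to against the address you expect for ingress-nc. A minimal sketch, assuming you already know the ingress-nc address; the function name is hypothetical and the hostname and IP in the usage comment are placeholders, not values from this environment.

```python
import socket
from typing import Callable


def resolves_to(hostname: str, expected_ip: str,
                resolver: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if hostname resolves (IPv4) to expected_ip."""
    try:
        return resolver(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError; a failed lookup counts as a mismatch.
        return False


# Usage (placeholder values, substitute your drydock FQDN and ingress-nc address):
#   resolves_to("drydock-nc.example.com", "192.0.2.10")
```

The resolver is passed in as a parameter so the check can be exercised without touching real DNS.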
After pointing drydock-nc to ingress-nc, this is the issue we observe on the controller iDRAC consoles while PXE booting. It is not caused by a firewall. Which default routes do you suggest changing?
Logs from the MaaS GUI
Stdout: start: cmd-install/stage-late/drydock_02/cmd-in-target: curtin command in-target
Running command ['mount', '--bind', '/dev', '/tmp/tmpdi62xy0q/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpdi62xy0q/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpdi62xy0q/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpdi62xy0q/target/sys'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpdi62xy0q/target', 'wget', '--no-proxy', '--no-check-certificate', '--header=X-Bootaction-Key: ae631ad31b0bdbe53601f4da35375040bac0bc446a245858f2b33d759ae101df', 'https://drydock-nc.att-5gcore.bete.ericy.com/api/v1.0/bootactions/nodes/att5gc18/files', '-O', '/tmp/bootaction-files.tar.gz'] with allowed return codes [0] (capture=False)
--2022-05-10 17:10:52-- https://drydock-nc.att-5gcore.bete.ericy.com/api/v1.0/bootactions/nodes/att5gc18/files
Resolving drydock-nc.att-5gcore.bete.ericy.com (drydock-nc.att-5gcore.bete.ericy.com)... 10.109.84.189
Connecting to drydock-nc.att-5gcore.bete.ericy.com (drydock-nc.att-5gcore.bete.ericy.com)|10.109.84.189|:443... connected.
HTTP request sent, awaiting response... 500 Internal Server Error
2022-05-10 17:12:29 ERROR 500: Internal Server Error.
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.010
Running command ['umount', '/tmp/tmpdi62xy0q/target/sys'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpdi62xy0q/target/run'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpdi62xy0q/target/proc'] with allowed return codes [0] (capture=False)
Running command ['umount', '/tmp/tmpdi62xy0q/target/dev'] with allowed return codes [0] (capture=False)
finish: cmd-install/stage-late/drydock_02/cmd-in-target: FAIL: curtin command in-target
Stderr: ''
Same service called from curl
root@att5gc20:~# curl --header "X-Bootaction-Key: ae631ad31b0bdbe53601f4da35375040bac0bc446a245858f2b33d759ae101df" https://drydock-nc.att-5gcore.bete.ericy.com/api/v1.0/bootactions/nodes/att5gc18/files
{"title": "Error when running bootaction pipeline segment utf8_decode: AttributeError - 'NoneType' object has no attribute 'decode'"}
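Since the drydock pods log nothing useful here, the JSON body of the 500 response is the main clue. The sketch below pulls the failing pipeline segment and exception out of the `title` field. The function name is hypothetical, and the regular expression is inferred only from the single message shown above, not from any documented Drydock error format.

```python
import json
import re
from typing import Dict, Optional

# Pattern inferred from the one error title seen in this thread (an assumption).
SEGMENT_ERROR = re.compile(
    r"pipeline segment (?P<segment>\w+): (?P<exception>\w+) - (?P<message>.*)"
)


def parse_bootaction_error(body: str) -> Optional[Dict[str, str]]:
    """Extract segment, exception, and message from an error response body."""
    title = json.loads(body).get("title", "")
    match = SEGMENT_ERROR.search(title)
    return match.groupdict() if match else None
```

For the response above this yields the segment `utf8_decode` and the exception `AttributeError`, which points at a bootaction asset whose data was `None` when the pipeline tried to decode it.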
We don't see any logging information within drydock pods to find the root cause of this issue.
The initial issue was fixed by adding proper routes in the environment.
https://github.com/airshipit/treasuremap/issues/212#issuecomment-1126219216 is addressed by using the right version of the promenade image, and the fix was tested by the reporter.
 promenade:
   location: https://opendev.org/airship/promenade
-  reference: 27f181a9d30294030d695b747b2e4560ffbd29be
+  reference: d161528ae8de0dcb0dd9d39bc370f85f2aa1c462
   subpath: charts/promenade
   type: git
Describe the bug: Installing Drydock Boot Actions.start is failing.
Steps To Reproduce: Use treasuremap at commit https://github.com/airshipit/treasuremap/commit/2227df4a8d60581974f49501265c0b8230fbf414 and follow the steps to bring up the genesis node.
Expected behavior: Drydock should complete deployment of nodes.
Environment
Detailed logs within drydock
`Installing Drydock Boot Actions.start: cmd-install/stage-late/drydock_01/cmd-in-target: curtin command in-target
Running command ['mount', '--bind', '/dev', '/tmp/tmpt3f8gvqn/target/dev'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/proc', '/tmp/tmpt3f8gvqn/target/proc'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/run', '/tmp/tmpt3f8gvqn/target/run'] with allowed return codes [0] (capture=False)
Running command ['mount', '--bind', '/sys', '/tmp/tmpt3f8gvqn/target/sys'] with allowed return codes [0] (capture=False)
Running command ['unshare', '--help'] with allowed return codes [0] (capture=True)
Running command ['unshare', '--fork', '--pid', '--', 'chroot', '/tmp/tmpt3f8gvqn/target', 'wget', '--no-proxy', '--no-check-certificate', '--header=X-Bootaction-Key: e27bba27178686a0112252ab215042a4a85a3aa76978be5b2d3cba845c770491', 'https://drydock-nc.att-5gcore.bete.ericy.com/api/v1.0/bootactions/nodes/att5gc19/units', '-O', '/tmp/bootaction-units.tar.gz'] with allowed return codes [0] (capture=False)
--2022-04-07 14:47:04-- https://drydock-nc.att-5gcore.bete.ericy.com/api/v1.0/bootactions/nodes/att5gc19/units
Resolving drydock-nc.att-5gcore.bete.ericy.com (drydock-nc.att-5gcore.bete.ericy.com)... 10.109.82.10
Connecting to drydock-nc.att-5gcore.bete.ericy.com (drydock-nc.att-5gcore.bete.ericy.com)|10.109.82.10|:443... connected.
WARNING: cannot verify drydock-nc.att-5gcore.bete.ericy.com's certificate, issued by ‘CN=Kubernetes Ingress Controller Fake Certificate,O=Acme Co’:
Unable to locally verify the issuer's authority.
WARNING: no certificate subject alternative name matches requested host name ‘drydock-nc.att-5gcore.bete.ericy.com’.
HTTP request sent, awaiting response... 404 Not Found
2022-04-07 14:47:04 ERROR 404: Not Found.
Running command ['udevadm', 'settle'] with allowed return codes [0] (capture=False)
TIMED subp(['udevadm', 'settle']): 0.010`
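The two curtin logs in this thread fail differently: this one shows the ingress fake certificate plus a 404, while the later one shows a 500 with no certificate warning. A minimal sketch for triaging such a log is below. The function name and returned keys are hypothetical, and the matching relies only on the wget output strings visible in this thread.

```python
import re
from typing import Dict, Optional, Union

# Issuer string printed by wget when the ingress serves its default certificate.
FAKE_ISSUER = "Kubernetes Ingress Controller Fake Certificate"
# wget's final error line, e.g. "ERROR 404: Not Found."
HTTP_ERROR = re.compile(r"ERROR (\d{3}):")


def triage_wget_log(log: str) -> Dict[str, Union[bool, Optional[int]]]:
    """Summarize whether the fake cert was served and which HTTP error occurred."""
    match = HTTP_ERROR.search(log)
    return {
        "fake_certificate": FAKE_ISSUER in log,
        "http_status": int(match.group(1)) if match else None,
    }
```

A result with `fake_certificate` set points at the certificate or DNS configuration of the ingress, while a 500 without it suggests the request reached drydock and failed inside the bootaction pipeline.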