OpenFabrics / fsdp_docs


IPA cert is not being returned during install #117

Closed dledford closed 1 year ago

dledford commented 1 year ago

Normally, an IPA CA cert can be found on the IPA server at https:///ipa/ca.crt. You can download this cert to allow systems to update their root trust and recognize the CA as trusted. This is needed in order to get some versions of the distros to install cleanly. Right now, attempts to access this URL return a 404 Not Found error.

dledford commented 1 year ago

The system is trying to access the CA cert via: https://beaker.ofa.iol.unh.edu/ipa/ca.crt

dledford commented 1 year ago

More information:

The correct link is http://beaker.ofa.iol.unh.edu/ipa/config/ca.crt

However, that link is returning a 404 error. In fact, all of the items that are supposed to be in that directory and available for download are missing. Normally, this link is redirected by /etc/httpd/conf.d/ipa.conf to /usr/share/ipa/html and the files we need are in that directory. These files are considered the "windows kerberos compatibility files" or something like that, so it's possible that they might not have been installed if you didn't select windows compatibility when installing the IPA server. However, they are needed in order for beaker clients that haven't undergone an ipa-client-install to be able to download the ca.crt for the cluster and update their trust anchors.

Here's what I've done so far:

1) Updated beaker_snippets/per_labcontroller/system_pre/ to put the right link in the fetch command as well as use the -k option to curl to download insecurely over https

2) Updated beaker_snippets/per_osmajor/system/rawhide to define ca_cert to get it to run the snippet in the lab controller directory (not sure about the syntax for this so it might be wrong)

What still needs to be done:

1) Update the beaker snippets on the lab controller

2) Update the IPA server to properly redirect /ipa/config to /usr/share/ipa/html and make sure that the right files are there. They should be:

[root@arwen html]# ls
ca.crt krb5.ini krb.con krbrealm.con ssbrowser.html unauthorized.html
[root@arwen html]#
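A quick way to confirm which of those files are absent is a small check like the following. This is only a sketch: the function name `missing_files` is made up for illustration, and the file list is copied from the listing above rather than from any IPA documentation.

```python
from pathlib import Path

# File names copied from the directory listing in this comment.
EXPECTED_FILES = (
    "ca.crt",
    "krb5.ini",
    "krb.con",
    "krbrealm.con",
    "ssbrowser.html",
    "unauthorized.html",
)


def missing_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `directory`, sorted."""
    root = Path(directory)
    # A nonexistent directory simply reports every expected file as missing.
    return sorted(name for name in expected if not (root / name).is_file())
```

Running `missing_files("/usr/share/ipa/html")` on the IPA server should return an empty list if the directory is complete.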

Then we need to test a rawhide install and see if it's working again. The current rawhide installs all fail because attempts to download the repomd files from the beaker lab controller are refused due to having an unknown entity in the CA certificate chain (the IPA server). This appears to be a change to the default SSL policies: the old default was to complain with a warning but allow the transfer to complete, while the new default is to generate an error and refuse to transfer the data without a command line switch to enable insecure data transfers.
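The difference between "trust the IPA CA" and "skip verification" (what curl -k does) can be sketched with Python's standard ssl module. This is an illustration of the two policies, not what anaconda or dnf do internally; `make_ssl_context` and its parameters are hypothetical names.

```python
import ssl


def make_ssl_context(ca_file=None, insecure=False):
    """Build an SSL context for HTTPS downloads.

    ca_file:  path to an extra CA certificate in PEM format to trust,
              for example a downloaded IPA ca.crt.
    insecure: if True, skip verification entirely (the analogue of curl -k).
    """
    context = ssl.create_default_context()
    if insecure:
        # Order matters: check_hostname must be off before CERT_NONE is set.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    elif ca_file is not None:
        # Trust the given CA in addition to the system defaults.
        context.load_verify_locations(cafile=ca_file)
    return context
```

A context built with `ca_file` set can be passed as `context=` to `urllib.request.urlopen`, which keeps verification on while accepting certificates signed by the IPA CA. The insecure variant only hides the problem.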

lylavoie commented 1 year ago

There are two places to pull the IPA CA certificate from. Both are working.

  1. https://beaker.ofa.iol.unh.edu/ca.crt
  2. http://vpn.ofa.iol.unh.edu/ipa/config/ca.crt

The first is a static copy of the CA certificate we pull onto the beaker server and link to in the github instructions pages, etc. The second is from the IPA server directly.

dledford commented 1 year ago

Ah, OK, so the beaker server is not the IPA server. I didn't realize it is the vpn server instead. I'll update the snippet.

dledford commented 1 year ago

The necessary change has been pushed to the beaker snippets (assuming that %ca_cert is defined anyway)

lylavoie commented 1 year ago

Correct, VPN / IPA server are co-located. However, as previously discussed, after the OFA workshop, I would like to migrate the OFA cluster towards our shared / common VPN and IPA infrastructure used for open source projects. That will change OFA to be supported by a pair of IPA servers (i.e. replicas). Probably best to make the reference to the CA location as an easily maintained variable.
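One way to keep the CA location easy to maintain is a single list of candidate URLs that are tried in order, which would also cover a pair of replicas. The sketch below assumes the served ca.crt is PEM encoded; `CA_CERT_URLS`, `fetch_ca_cert`, and the injectable `fetch` parameter are names invented for illustration, and the URLs are the two locations listed earlier in this thread.

```python
import urllib.request

# The single place to edit when the CA location changes.
CA_CERT_URLS = [
    "https://beaker.ofa.iol.unh.edu/ca.crt",
    "http://vpn.ofa.iol.unh.edu/ipa/config/ca.crt",
]


def _http_get(url, timeout=10):
    """Default fetcher: return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_ca_cert(urls, fetch=_http_get):
    """Try each URL in order and return (url, pem_bytes) for the first hit.

    Raises RuntimeError if no candidate yields a PEM certificate.
    """
    errors = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError are OSError subclasses
            errors.append(f"{url}: {exc}")
            continue
        if b"-----BEGIN CERTIFICATE-----" in data:
            return url, data
        errors.append(f"{url}: response is not a PEM certificate")
    raise RuntimeError("no CA certificate found; " + "; ".join(errors))
```

Passing a different `fetch` callable makes it possible to exercise the fallback logic without touching the network.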

dledford commented 1 year ago

We need to update the frequency of git pull requests of the beaker snippets repo until I get Rawhide working properly. Can we change it to once every 5 minutes for now?

lylavoie commented 1 year ago

Ok, I've updated the cron job to pull once every 5 minutes. Let me know when you're done and we can turn it back down.

dledford commented 1 year ago

@lylavoie I think maybe the cron job is both pulling from the git repo and then copying the data to the beaker location? I say this because a file I removed in git is clearly still in the repo and the existence of that file is preventing things from working. The copy job would probably need to be an rsync job with the --delete option or else the actual git repo would need to be the location that beaker points at in order to make sure file deletes make it across the git pulls/copies.
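To make the copy-versus-mirror distinction concrete, here is a small Python stand-in for the effect of rsync with --delete. It is not the cron job itself, and `mirror_tree` is a hypothetical name. The first step is what a plain recursive copy already does; the second step is the part that removes files deleted upstream.

```python
import shutil
from pathlib import Path


def mirror_tree(src, dst):
    """Make `dst` match `src`: copy files over, then delete extras from `dst`.

    Returns the sorted relative paths that were deleted from `dst`.
    """
    src, dst = Path(src), Path(dst)
    # Copy step: a plain recursive copy (requires Python 3.8+ for dirs_exist_ok).
    shutil.copytree(src, dst, dirs_exist_ok=True)

    # Delete step: remove anything in dst that no longer exists in src.
    removed = []
    # Deepest paths first, so files go before their parent directories.
    for path in sorted(dst.rglob("*"), key=lambda p: len(p.parts), reverse=True):
        rel = path.relative_to(dst)
        if (src / rel).exists():
            continue
        if path.is_dir() and not path.is_symlink():
            path.rmdir()
        else:
            path.unlink()
        removed.append(rel.as_posix())
    return sorted(removed)
```

With only the copy step, a file removed from the git checkout stays in the destination forever, which matches the symptom described here.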

lylavoie commented 1 year ago

Ok, changed to an rsync job.

dledford commented 1 year ago

@lylavoie I need you to double check things. The kickstart file is still broken, even after all the changes that have been made. So something is going wrong. The problem is that the Fedora Rawhide server install kickstart includes these lines:

# Define ca_cert to trigger the snippet that will update the system ca
# trust anchor database
%define ca_cert

Those lines originally came from beaker_snippets/per_osmajor/system_pre/FedoraRawhide. I removed that file and that directory entirely from the git repo, but the contents are still making it into the kickstart used during the install.

In addition, I changed the contents of the file beaker_snippets/per_lab/system_pre/beaker.ofa.iol.unh.edu to remove the {% if .. } construct and make the code in that file unconditional, but it's not showing up in the kickstart that is being produced.

JSpewock commented 1 year ago

The file beaker_snippets/per_osmajor/system_pre/FedoraRawhide is no longer present in the snippets directory, so I'm not sure why it would still be showing up. The cron job wasn't running properly, and I fixed that as well. However, after running the rsync again manually, the "if" statement in the per_lab system_pre file is still present, and I'm not sure why that would be.

JSpewock commented 1 year ago

I fixed the rsync and it should now properly be copying the files over every 5 minutes

dledford commented 1 year ago

Does beaker remake the kickstart on each provision or only on updates to the templates?

dledford commented 1 year ago

OK, the beaker snippets updated this time, but now we're back to square one: the Rawhide installs don't work because the SSL certificate for the beaker server has a self-signed certificate in the path.

JSpewock commented 1 year ago

I can look into that some more. I also noticed conserver had fallen over so the console logs weren't showing. I restarted it and it seems to be working now so hopefully that will also help with debugging

dledford commented 1 year ago

The failure is happening here:

17:51:59,538 DBG packaging: Add the 'beaker-Fedora-Everything' repository (RepoConfigurationData(cost=100, enabled=True, excluded_packages=[], included_packages=[], installation_enabled=False, name='beaker-Fedora-Everything', origin='USER', proxy='', ssl_configuration=SSLConfigurationData(), ssl_verification_enabled=True, type='BASEURL', url='http://beaker.ofa.iol.unh.edu/../Everything/x86_64/os')).

Note that the url is a bare http and not https. This is followed by:

17:51:59,540 DBG dnf: repo: downloading from remote: beaker-Fedora-Everything
17:51:59,595 DBG dnf: error: Curl error (60): SSL peer certificate or SSH remote key was not OK for https://beaker.ofa.iol.unh.edu/Everything/x86_64/os/repodata/repomd.xml [SSL certificate problem: self-signed certificate in certificate chain] (https://beaker.ofa.iol.unh.edu/Everything/x86_64/os/repodata/repomd.xml).

Note that curl is using https instead of http. I don't know if anaconda is changing that due to the ssl_configuration setting, or if the beaker server has a rewrite rule to force http to https. If it's the latter, then turning that off for everything that is a download URL (so probably anything not prefixed with /bkr/) would likely solve the problem. If it's anaconda rewriting the URL, then we need to turn off ssl_verification in the repo configuration for this (and probably other) OSes.
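One way to tell the two cases apart is to request the http URL once without following redirects and look at what the server answers. The sketch below uses only the Python standard library; `probe_redirect` and `server_forces_https` are names made up for this example.

```python
import http.client
from urllib.parse import urlsplit


def probe_redirect(url, timeout=10):
    """Send one GET over plain HTTP without following redirects.

    Returns (status, location); location is None if the server sent no
    Location header.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("expected an http:// URL")
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def server_forces_https(url):
    """True if the server answers the http:// URL with a redirect to https://."""
    status, location = probe_redirect(url)
    return status in (301, 302, 303, 307, 308) and (location or "").startswith("https://")
```

If `server_forces_https("http://beaker.ofa.iol.unh.edu/Everything/x86_64/os/repodata/repomd.xml")` returns True when run from a cluster machine, the rewrite is happening on the server. If it returns False, the switch to https is more likely happening on the client side.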

dledford commented 1 year ago

FWIW, the Fedora 35 beaker download works, but the ISO image is under the directory distros/ so maybe that directory has https redirect turned off

JSpewock commented 1 year ago

I don't think the beaker server has a directory called Everything. I believe the Rawhide repomd.xml is at http://beaker.ofa.iol.unh.edu/distros/Fedora-Server-dvd-x86_64-Rawhide-20230406.n.0/repodata/repomd.xml, unless this isn't what you're looking for.

JSpewock commented 1 year ago

Or it looks like in the config on the beaker GUI there is a Fedora-Everything repository listed at the path ../../../Everything/x86_64/os

dledford commented 1 year ago

In the beaker snippets, there is a Fedora-Everything repo listed for all the releases up to Fedora 35, plus Rawhide. 36 and 37 don't have it. That lists a metalink that should be valid. But, at least on F35, it's not working.

dledford commented 1 year ago

Probably want that Everything repo in beaker removed and the Everything metalink in the snippets to be used instead

dledford commented 1 year ago

I've pushed updated files to beaker_snippets for the Everything metalink on the extra releases. They should be better now.

JSpewock commented 1 year ago

Fedora 35 doesn't have an everything repo in beaker so I would think that it would only be using the metalink in the snippets

dledford commented 1 year ago

I updated all the snippets to get a metalink to the upstream everything repo as well as the updates repo.

dledford commented 1 year ago

Need to go ahead and remove the Everything repo configured on the web interface for rawhide

I don't see a way to do so from the web interface

dledford commented 1 year ago

BTW, vpn.ofa.iol.unh.edu does not resolve from machines in the cluster

lylavoie commented 1 year ago

vpn.ofa.iol.unh.edu does not exist. The actual VPN service you "hit" before dropping into the cluster is ofa-vpn.iol.unh.edu. I will add a CNAME reference in the ofa.iol.unh.edu domain for this, so it will at least resolve.
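Whether a name resolves from a given machine can be checked with a few lines of Python. This is a minimal sketch; `resolves` is a hypothetical helper name.

```python
import socket


def resolves(hostname):
    """Return True if `hostname` resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False
```

Calling `resolves("vpn.ofa.iol.unh.edu")` from a cluster machine would show whether the CNAME has taken effect there.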

dledford commented 1 year ago

I was just going by the comment you made about the CA certs ;-).

Also, we need to get the beaker-Fedora-Everything repo removed from the Rawhide distro tree as it is the repo that is failing to resolve due to ssl issues. The regular repo does fine because it's under the /distros directory and is served over http instead of https.

JSpewock commented 1 year ago

I've added a kickstart option that should ignore this repo for now, which will allow you to provision Rawhide. However, I haven't found a way to completely remove it from the repo tree. I'll have to look more into how to do this.

dledford commented 1 year ago

Still didn't work.

lylavoie commented 1 year ago

This additional repo is coming from the .repo file contained in the ISO file being mounted and "does exist" on the Fedora proper repos. I think the issue is that using the ISO file "as the repo" content doesn't work, because the ISO and the real repo content make assumptions about the available content at the repo URI / path. One solution might be to actually host / mirror the real repos for Fedora instead of mounting the ISO.

JSpewock commented 1 year ago

Previously, we had beaker pointing to an existing mirror and that was how we maintained the latest version. This is why Rawhide stopped working originally: that mirror was down, and we had to switch to an ISO. I thought this wouldn't be an issue because we also included an ISO as a fallback, but I guess it didn't switch to said fallback when the mirror stopped working.

dledford commented 1 year ago

Going back to a mirror would probably be the best solution. It will become more and more necessary once the CKI is in full swing and expecting latest rawhide all the time.



JSpewock commented 1 year ago

I pointed the installation to http://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Server/x86_64/os/ and I removed the kickstart option for ignoring the Fedora-Everything Repo and it seems to provision. My only concern with this is there is the Fedora-Everything Repo that is being added in the beaker UI and then another that we add through snippets and I wonder if those would conflict.

dledford commented 1 year ago

Probably. We should likely remove the one in the snippets and just use the one in the beaker UI.

JSpewock commented 1 year ago

I removed it from snippets

dledford commented 1 year ago

I'll run a test, thanks

lylavoie commented 1 year ago

@dledford I think this issue is all set now and can be closed, correct?