uyuni-project / uyuni

Source code for Uyuni
https://www.uyuni-project.org/
GNU General Public License v2.0

Refreshing Repositories/Channels Fails #5565

Closed kaipons closed 8 months ago

kaipons commented 2 years ago

Problem description

mgr-sync refresh --refresh-channels fails while refreshing repositories/channels.

Version of Uyuni Server and Proxy (if used)

Information for package Uyuni-Server-release:
---------------------------------------------
Repository     : Uyuni Server Stable
Name           : Uyuni-Server-release
Version        : 2022.05-180.1.uyuni1
Arch           : x86_64
Vendor         : obs://build.opensuse.org/systemsmanagement:Uyuni
Support Level  : Level 3
Installed Size : 1.4 KiB
Installed      : Yes
Status         : up-to-date
Source package : Uyuni-Server-release-2022.05-180.1.uyuni1.src
Summary        : Uyuni Server
Description    : 
    Uyuni lets you efficiently manage physical, virtual,
    and cloud-based Linux systems. It provides automated and cost-effective
    configuration and software management, asset management, and system
    provisioning.

Details about the issue

Upon trying to refresh the installed channels on the Uyuni server mgr-sync fails with the following error message:

# mgr-sync refresh --refresh-channels
Refreshing Channel families                    [DONE]
Refreshing SUSE products                       [DONE]
Refreshing SUSE repositories                   [FAIL]
        Error: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.redhat.rhn.domain.scc.SCCRepositoryNoAuth#5264]'>

The same happens on clicking the Button "Refresh" in "Admin" -> "Setup Wizard" -> "Products" in the Uyuni Web GUI.

/var/log/rhn/mgr-sync.log just contains the same error message:

2022/06/14 11:43:36 +02:00 32988 0.0.0.0: mgr_sync/logger.error("Error: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.redhat.rhn.domain.scc.SCCRepositoryNoAuth#5264]'>",)
2022/06/14 11:43:36 +02:00 32988 0.0.0.0: mgr_sync/logger.error('Refreshing SUSE repositories failed',)
2022/06/14 11:43:36 +02:00 32988 0.0.0.0: mgr_sync/logger.error("Error: <Fault -1: 'redstone.xmlrpc.XmlRpcFault: unhandled internal exception: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.redhat.rhn.domain.scc.SCCRepositoryNoAuth#5264]'>",)

What throws me off is the reference to com.redhat.rhn.domain.scc.SCCRepositoryNoAuth#5264 in the error message. Since I am unable to find the corresponding source code anywhere, I am suspecting that this might be, at least in part, a problem on the remote end (i.e. Suse SCC).

aaannz commented 2 years ago

Can you validate that your installation is correct? Particularly check rpm -qV susemanager-tools.

raulillo82 commented 2 years ago

That was already requested in some email before opening the issue, I found this in the history:

$ rpm -qV susemanager-tools

S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/__pycache__/__init__.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/__pycache__/authenticator.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/__pycache__/helpers.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/__init__.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/channel.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/cli.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/config.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/logger.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/mgr_sync.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/product.cpython-36.pyc
S.5....T.    /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/__pycache__/version.cpython-36.pyc
S.5....T.    /usr/share/susemanager/__pycache__/mgr_bootstrap_data.cpython-36.pyc
mbussolotto commented 2 years ago

@kaipons mgr-sync refresh --refresh-channels is used for refreshing SUSE channels (if you have a subscription); otherwise you should use the spacewalk-repo-sync command. Can you please check if that one is working fine? Thanks!

raulillo82 commented 2 years ago

The customer does have valid SUSE subscriptions. Let's wait for their feedback. In any case, "mgr-sync refresh" should work, and it does more than the spacewalk-repo-sync command (imagine the customer gets new SUSE subscriptions and wants SUSE Manager to know about them).

mbussolotto commented 1 year ago

@kaipons , @raulillo82 is the issue still present? Can you please try to run:

mgr-sync -d3 -v refresh --refresh-channels

and provide the output and /var/log/rhn/mgr-sync.log?

Thanks

mbussolotto commented 1 year ago

@kaipons @raulillo82 , any news about it?

sasahodzic commented 1 year ago

Hi to all. I have a similar problem on version 2022.10. @mbussolotto here is the output from the mgr-sync command:

General error: <ProtocolError for server.local.domain.tld:80/rpc/api: 404 Not Found>
Traceback (most recent call last):
  File "/usr/sbin/mgr-sync", line 27, in <module>
    sys.exit(MgrSync().run(options))
  File "/usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/mgr_sync.py", line 69, in run
    if self.conn.sync.master.hasMaster() and 'refresh' not in vars(options):
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
    verbose=self.__verbose
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1187, in single_request
    dict(resp.getheaders())
xmlrpc.client.ProtocolError: <ProtocolError for server.local.domain.tld:80/rpc/api: 404 Not Found>

When I do the refresh from the GUI everything is OK. Thanks!

mbussolotto commented 1 year ago

/var/log/rhn/mgr-sync.log

@sasahodzic please check this comment https://github.com/uyuni-project/uyuni/issues/5565#issuecomment-1289036840 thanks!

sasahodzic commented 1 year ago

@mbussolotto output:

2022/11/09 13:23:22 +02:00 6804 0.0.0.0: mgr_sync/logger.info("Executing mgr-sync Namespace(debug='3', mirror='', refresh=True, refresh_channels=True, schedule=False, store_credentials=False, subcommands='refresh', verbose=True)",)

Thanks!

sasahodzic commented 1 year ago

@mbussolotto Can we guess what it is? Thanks!

mbussolotto commented 1 year ago

@sasahodzic is server.local.domain.tld the correct FQDN of your Uyuni server? If not, you can set it in /root/.mgr-sync

sasahodzic commented 1 year ago

@mbussolotto In the file it is correct. I can only run mgr-sync --version (output is 0.1) and that's all. For everything else I get the same issue as I wrote before: xmlrpc.client.ProtocolError: <ProtocolError for server.local.domain.tld:80/rpc/api: 404 Not Found>. What more can I check? Thanks!

mbussolotto commented 1 year ago

@sasahodzic can you please share /var/log/rhn/rhn_web_ui.log and /var/log/rhn/rhn_taskomatic_daemon.log ?

sasahodzic commented 1 year ago

@mbussolotto

rhn_web_ui.log:

2022-11-09 21:40:30,631 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-9] INFO com.redhat.rhn.manager.content.ContentSyncManager - Server not registered at SCC: /etc/zypp/credentials.d/SCCcredentials (No such file or directory)
2022-11-09 21:40:55,938 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-3] INFO com.redhat.rhn.manager.content.ContentSyncManager - getAvailableCHannels took 1.84891541 seconds.
2022-11-09 22:20:29,264 [salt-event-thread-5] WARN com.suse.manager.utils.SaltUtils - No product match found for: sle-module-packagehub-subpackages 15.2 0 x86_64
2022-11-09 22:25:29,798 [salt-event-thread-8] WARN com.suse.manager.utils.SaltUtils - No product match found for: sle-module-packagehub-subpackages 15.2 0 x86_64
2022-11-09 22:25:37,098 [salt-event-thread-8] WARN com.suse.manager.utils.SaltUtils - No product match found for: sle-module-packagehub-subpackages 15.2 0 x86_64
2022-11-10 08:08:56,926 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-10] INFO com.suse.manager.webui.controllers.login.LoginController - LOCAL AUTH SUCCESS: [uyuni]
2022-11-10 08:08:58,936 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-3] INFO com.redhat.rhn.manager.content.ContentSyncManager - getAvailableCHannels took 1.092503901 seconds.
2022-11-10 10:22:00,424 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-10] INFO com.suse.manager.webui.controllers.login.LoginController - LOCAL AUTH SUCCESS: [uyuni]
2022-11-10 10:22:29,542 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-1] INFO com.redhat.rhn.manager.content.ContentSyncManager - getAvailableCHannels took 0.998967834 seconds.
2022-11-10 10:22:29,729 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-9] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateChannelFamilies called
2022-11-10 10:22:30,597 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-9] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateChannelFamilies finished
2022-11-10 10:22:37,764 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateSUSEProducts called
2022-11-10 10:23:30,622 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateSUSEProducts finished
2022-11-10 10:23:30,861 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateRepository called
2022-11-10 10:23:45,052 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.updateRepository finished
2022-11-10 10:23:45,112 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.getSubscriptions called
2022-11-10 10:23:46,456 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - ContentSyncManager.getSubscriptions finished
2022-11-10 10:23:47,536 [ajp-nio-0:0:0:0:0:0:0:1-8009-exec-4] INFO com.redhat.rhn.manager.content.ContentSyncManager - getAvailableCHannels took 0.923771062 seconds.

rhn_taskomatic_daemon.log (just a part, but everything looks OK as far as I can see):

2022-11-10 01:19:08,448 [DefaultQuartzScheduler_Worker-77] INFO com.redhat.rhn.taskomatic.task.RepoSyncTask - Syncing repos for channel: SLE-Module-Desktop-Applications15-SP2-Pool for x86_64 HPC
2022-11-10 01:20:02,291 [Thread-2711] INFO com.redhat.rhn.taskomatic.task.RepoSyncTask -
01:19:10 ======================================
01:19:10 | Channel: sle-module-desktop-applications15-sp2-pool-x86_64-hpc
01:19:10 ======================================
01:19:10 Sync of channel started.
Retrieving repository 'sle-module-desktop-applications15-sp2-pool-x86_64-hpc' metadata [..done]
Building repository 'sle-module-desktop-applications15-sp2-pool-x86_64-hpc' cache [....done]
All repositories have been refreshed.
01:19:14 Repo URL: https://updates.suse.com/SUSE/Products/SLE-Module-Desktop-Applications/15-SP2/x86_64/product/?
01:19:14 Packages in repo: 4943
01:19:47 No new packages to sync.
01:19:47
01:19:47 Patches in repo: 0.
01:19:47
01:19:47 Importing mediaproducts file products.
01:19:47 *** NOTE: Importing mediaproducts file for the channel 'sle-module-desktop-applications15-sp2-pool-x86_64-hpc'. Previous mediaproducts will be discarded.
01:20:02 Sync completed.
01:20:02 Total time: 0:00:52

2022-11-10 01:20:02,354 [DefaultQuartzScheduler_Worker-77] INFO com.redhat.rhn.taskomatic.task.RepoSyncTask - Syncing repos for channel: SLE-Module-DevTools15-SP2-Updates for x86_64
2022-11-10 01:20:18,053 [Thread-2721] INFO com.redhat.rhn.taskomatic.task.RepoSyncTask -
01:20:03 ======================================
01:20:03 | Channel: sle-module-devtools15-sp2-updates-x86_64
01:20:03 ======================================
01:20:03 Sync of channel started.
Retrieving repository 'sle-module-devtools15-sp2-updates-x86_64' metadata [.done]
Building repository 'sle-module-devtools15-sp2-updates-x86_64' cache [....done]
All repositories have been refreshed.
01:20:07 Repo URL: https://updates.suse.com/SUSE/Updates/SLE-Module-Development-Tools/15-SP2/x86_64/update/?
01:20:07 Packages in repo: 1132
01:20:12 No new packages to sync.
01:20:12
01:20:12 Patches in repo: 268.
01:20:15 No new patch to sync.
01:20:17 Sync completed.
01:20:17 Total time: 0:00:13

Something else? Thanks!

mbussolotto commented 1 year ago

@sasahodzic I'd like also to see the output of hostname -f command and the content of /etc/hosts, thanks :)

sasahodzic commented 1 year ago

@mbussolotto

Here is everything from the hosts file:

# IP-Address Full-Qualified-Hostname Short-Hostname

127.0.0.1 localhost

# special IPv6 addresses
::1 localhost ipv6-localhost ipv6-loopback

fe00::0 ipv6-localnet

ff00::0 ipv6-mcastprefix
ff02::1 ipv6-allnodes
ff02::2 ipv6-allrouters
ff02::3 ipv6-allhosts

uyuni.kbanka.hr uyuni

mbussolotto commented 1 year ago

I don't see a server.local.domain.tld entry; this should be the FQDN of the Uyuni server (which is why I wanted to check hostname -f).
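For a quick check from the server itself, the Python standard library can report what name resolution considers the machine's FQDN (a convenience sketch, not something mgr-sync itself calls):

```python
import socket

# The FQDN that name resolution yields for this host; on a healthy Uyuni
# server this should match hostname -f, /etc/hosts, and /root/.mgr-sync.
fqdn = socket.getfqdn()
print(fqdn)
```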

sasahodzic commented 1 year ago

@mbussolotto I just changed it in the output from that comment. The FQDN is set correctly, the same as in the hosts file. Something else is broken on the CLI (some combination of Python and Apache), because in the GUI the refresh of products, channels, etc. works. spacecmd and spacewalk-remove-channel work, and with the GUI I can add a channel. Thanks!

mbussolotto commented 1 year ago

Hi @sasahodzic, can you please try to upgrade Uyuni? I'm still not able to reproduce the issue, so I think something is misconfigured in your environment, but an upgrade might help. Thanks

sasahodzic commented 1 year ago

@mbussolotto I did it. The same issue still exists. In 2022.2 I didn't have this issue. Went from 2 to 10, now 11. Reinstallation of python3-3.6.15-150300.10.30.1.x86_64 is not an option because of dependencies.
Thanks! BR

mbussolotto commented 1 year ago

@sasahodzic, if there's no progress yet, can you please provide a supportconfig? Just run supportconfig on the CLI, then send the log tarball it creates. Thanks!

sasahodzic commented 1 year ago

@mbussolotto sorry, but there is too much private data in the tar file. What would be interesting to check that I can extract? I don't know if supportconfig with the -m option is sufficient. Thanks!

mbussolotto commented 1 year ago

yes please, let's at least try with -m

mbussolotto commented 1 year ago

@sasahodzic please let me know if you can provide the supportconfig (also with -m option). Otherwise there's no way to get what's happening in your system (and the issue is not reproducible). Thanks

sasahodzic commented 1 year ago

@mbussolotto Here is the file: scc_uyuni_230124_1216.zip

Is it possible to help you out with some Uyuni issues?

Thanks!

mbussolotto commented 1 year ago

@mbussolotto Here is the file: scc_uyuni_230124_1216.zip

Unfortunately it does not contain any useful information, so I don't know how we can move forward. I would try to reproduce the issue, then check rhn_web_ui.log and rhn_taskomatic_daemon.log for any error... if there's no error, I'd guess some networking misconfiguration (firewall?).

Is it possible to help you out with some Uyuni issues?

Thanks!

Of course :) You can start by having a look here: https://www.uyuni-project.org/pages/contact.html and https://github.com/uyuni-project/uyuni/wiki/Contributing, or at the list of issues tagged as good first issue: https://github.com/uyuni-project/uyuni/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22

mbussolotto commented 1 year ago

Unfortunately it does not contain any useful information, so I don't know how we can move forward. I would try to reproduce the issue, then check rhn_web_ui.log and rhn_taskomatic_daemon.log for any error... if there's no error, I'd guess some networking misconfiguration (firewall?).

@sasahodzic any update?

sasahodzic commented 1 year ago

@mbussolotto I didn't find any issue with the network. In the Apache access_log I have this error: "POST /rpc/api HTTP/1.1" 404 - "-" "Python-xmlrpc/3.6". I created an rpc/api directory in the DocumentRoot and then I get error 301. Maybe there is some misconfiguration in the Apache config files. I found something similar: https://access.redhat.com/solutions/363374 but I don't have an active subscription.

mbussolotto commented 1 year ago

@sasahodzic can you please provide printenv output?

cbosdo commented 1 year ago

In access_log of apache I have this error: "POST /rpc/api HTTP/1.1" 404 - "-" "Python-xmlrpc/3.6"

/rpc/api/ is defined as a RewriteRule in /etc/apache2/conf.d/zz-spacewalk-www.conf and points to /rhn/rpc/api which is served by tomcat. Could you check that you have a line like RewriteRule ^/rpc/api /rhn/rpc/api in your apache config?

sasahodzic commented 1 year ago

In access_log of apache I have this error: "POST /rpc/api HTTP/1.1" 404 - "-" "Python-xmlrpc/3.6"

/rpc/api/ is defined as a RewriteRule in /etc/apache2/conf.d/zz-spacewalk-www.conf and points to /rhn/rpc/api which is served by tomcat. Could you check that you have a line like RewriteRule ^/rpc/api /rhn/rpc/api in your apache config?

Yes, it's in. Nevertheless, mgr-sync is the only command that is not working from the shell. Everything else is OK, so I can live without it (spacecmd and spacewalk-remove-channel work, and with the GUI I can add/sync all channels).

cbosdo commented 1 year ago

Could you add LogLevel alert rewrite:trace6 to /etc/apache2/conf.d/zz-spacewalk-www.conf, restart apache, and run mgr-sync again? You should then see more traces in /var/log/apache2/error_log to check whether the rewrite rule works correctly on your machine. And by the way, do not add a folder for rpc/api in the DocumentRoot, as this would probably mess up the rewrite rule too.

sasahodzic commented 1 year ago

Rewrite rule works correctly but issue still exists. There is nothing more in the logs, just 404 in access_log.

ppanon2022 commented 11 months ago

I'm now also having this problem since upgrading to 2023.09 and openSUSE 15.5. I'm actually trying to run /usr/bin/spacewalk-common-channels to set up openSUSE Leap 15.5 channels, but running mgr-sync refresh --refresh-channels gives the same error as described in this ticket. As above, there is only the "POST /rpc/api HTTP/1.1" 404 - "-" "Python-xmlrpc/3.6" in the Apache access log.

I've found that there seems to have been a similar issue with a completely different product at https://github.com/kiwitcms/tcms-api/issues/9 and they just closed the issue with no solution. So kudos for leaving this issue open and unresolved ;-)

The only other recent change I can think of in terms of system change is running a spacewalk-data-fsck --remove to try to recover space from deleted Ubuntu 18.04 repos, which actually only seemed to clean up packages from CentOS 7 arm channels that had mistakenly been set up during the original server set up. However it seems doubtful that would result in this error.

That said, I think Taskomatic-initiated repo syncs still seem to complete fine, as I see packages that were added in the last week since the upgrade, supported by entries in the sync log.

Could it be some sort of missing or broken python module causing a failure with a misleading error message? While I see references to a python xmlrpclib module for Linux generally in google searches, I can't seem to find what package provides this for openSUSE specifically. Or could it have been installed via pip by an Uyuni install script at some point but a python update broke that install? What should actually be providing the python3 xmlrpc module?

avshiliaev commented 11 months ago

Hey @ppanon2022, thanks for your report! Could you please try to restart the spacewalk services with spacewalk-service stop and spacewalk-service start to check if it helps? In the meantime, @vzhestkov do you think there was something similar recently? :smile:

ppanon2022 commented 11 months ago

Actually I have restarted the whole server a couple of times since, with no improvement.

Running spacewalk-common-channels with -vvvv gives some additional information on what is sent and received. What's not clear is whether the error is due to the auth.login method not being found, the xmlrpc server handler, or some other component it depends on. Not sure how to test that.

spacewalk-common-channels -vvvv -u [myadminaccount] -p '[myadminpassword]' -s carmd-nv-uyuni1 'opensuse_leap15_5*'
Connecting to http://carmd-nv-uyuni1/rpc/api
send: b'POST /rpc/api HTTP/1.1\r\nHost: carmd-nv-uyuni1\r\nAccept-Encoding: gzip\r\nContent-Type: text/xml\r\nUser-Agent: Python-xmlrpc/3.6\r\nContent-Length: 229\r\n\r\n'
send: b"<?xml version='1.0'?>\n<methodCall>\n<methodName>auth.login</methodName>\n<params>\n<param>\n<value><string>[myadminaccount]</string></value>\n</param>\n<param>\n<value><string>[myadminpassword]</string></value>\n</param>\n</params>\n</methodCall>\n"
reply: 'HTTP/1.1 404 Not Found\r\n'
header: Date: Fri, 13 Oct 2023 19:44:27 GMT
header: Server: Apache
header: X-Frame-Options: SAMEORIGIN
header: Content-Length: 315
header: Content-Type: text/html; charset=iso-8859-1
Traceback (most recent call last):
  File "/usr/bin/spacewalk-common-channels", line 278, in <module>
    client, key = connect(options.user, options.password, options.server)
  File "/usr/bin/spacewalk-common-channels", line 120, in connect
    key = xmlrpc_login(client, options.user, password)
  File "/usr/lib/python3.6/site-packages/uyuni/common/cli.py", line 74, in xmlrpc_login
    sessionkey = client.auth.login(username, password)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
    verbose=self.__verbose
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1187, in single_request
    dict(resp.getheaders())
xmlrpc.client.ProtocolError: <ProtocolError for carmd-nv-uyuni1/rpc/api: 404 Not Found>

vzhestkov commented 11 months ago

@avshiliaev @ppanon2022 it reminds me of this: https://www.suse.com/support/kb/doc/?id=000021103

ppanon2022 commented 11 months ago

Tomcat seems to be running fine and authenticating for the GUI. It's just the spacewalk-common-channels and mgr-sync commands that are failing as far as I can see. spacecmd works fine.

 # systemctl status tomcat
● tomcat.service - Apache Tomcat Web Application Container
     Loaded: loaded (/usr/lib/systemd/system/tomcat.service; enabled; vendor preset: disabled)
    Drop-In: /usr/lib/systemd/system/tomcat.service.d
             └─override.conf
     Active: active (running) since Wed 2023-10-11 16:34:18 PDT; 1 day 21h ago
   Main PID: 1943 (java)
      Tasks: 127 (limit: 576)
     CGroup: /system.slice/tomcat.service
             └─ 1943 /usr/lib64/jvm/java/bin/java -ea -Xms256m -Xmx1G -Djava.awt.headless=true -Dorg.xml.sax.driver=com.redhat.rhn.frontend.xmlrpc.ut>

Oct 12 19:11:58 carmd-nv-uyuni1 unix2_chkpwd[26639]: pam_unix(uyuni:auth): authentication failure; logname= uid=466 euid=0 tty= ruser= rhost=  user=>
Oct 12 19:11:58 carmd-nv-uyuni1 unix2_chkpwd[26639]: pam_sss(uyuni:auth): authentication success; logname= uid=466 euid=0 tty= ruser= rhost= user=>
Oct 13 11:11:02 carmd-nv-uyuni1 unix2_chkpwd[31280]: pam_unix(uyuni:auth): authentication failure; logname= uid=466 euid=0 tty= ruser= rhost=  user=>
Oct 13 11:11:02 carmd-nv-uyuni1 unix2_chkpwd[31280]: pam_sss(uyuni:auth): authentication success; logname= uid=466 euid=0 tty= ruser= rhost= user=>

There are truncated values for user= in that status output but I've edited them out

vzhestkov commented 11 months ago

@ppanon2022 have you tried to do the steps from https://www.suse.com/support/kb/doc/?id=000021103 ?

ppanon2022 commented 11 months ago

I hadn't but, unless you accidentally provided the wrong link, that page indicates that it's for when tomcat doesn't start. tomcat was very clearly starting with no errors. However, I have now checked what the commands in the kb article resulted in and there was no change in the server.xml. So no point in running the commands.

# cp /etc/tomcat/server.xml /etc/tomcat/server.xml.backup
# xsltproc /usr/share/spacewalk/setup/server.xml.xsl /etc/tomcat/server.xml.backup > /etc/tomcat/server.xml.new
# diff -u /etc/tomcat/server.xml.new /etc/tomcat/server.xml
#
ppanon2022 commented 11 months ago

OK, I had a moment of clarity and found the problem, at least for spacewalk-common-channels: it was going to the http:// interface.

def connect(user, password, server):
    server_url = "http://%s/rpc/api" % server

I changed that to https:// and spacewalk-common-channels now works.

Presumably, the http:// interface was deprecated and disabled for security reasons, but the command was never retested and updated to match. That looks like a gap in test coverage.
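The one-line change described above can be sketched as follows (a minimal sketch; connect_url is a hypothetical helper, not the actual spacewalk-common-channels code):

```python
import xmlrpc.client

def connect_url(server, use_tls=True):
    # Build the XML-RPC endpoint URL; with use_tls=False this reproduces
    # the old, broken http:// behaviour described in this thread.
    scheme = "https" if use_tls else "http"
    return "%s://%s/rpc/api" % (scheme, server)

def connect(user, password, server):
    # Log in over HTTPS; over plain HTTP the server in this thread
    # answered 404 instead of reaching the API.
    client = xmlrpc.client.ServerProxy(connect_url(server))
    key = client.auth.login(user, password)
    return client, key
```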

As for mgr-sync, it also appears to be using http, perhaps on a different port used by taskomatic. I am not sure whether what is being used in /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/mgr_sync.py is TASKOMATIC_XMLRPC_URL = 'http://localhost:2829/RPC2' or

    def __init__(self):
        self.config = Config()
        url = "http://{0}:{1}{2}".format(self.config.host,
                                         self.config.port,
                                         self.config.uri)

but in both cases it's http and likely the same problem. I tried to change the latter to use https:// but that doesn't seem to be sufficient. The port config presumably needs to be changed as well because I then get an SSL error.

ppanon2022 commented 11 months ago

OK, that's definitely the problem. Change mgr_sync.py to use url = "https://{0}:{1}{2}".format(self.config.host, ...) and then edit ~/.mgr-sync to set mgrsync.port = 443. Ideally a PR would also change /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/config.py in the __init__ method to self._config[Config.PORT] = 443.

We're using signed certs for the Uyuni server, so maybe that would create another problem where you may get an exception that needs to be handled if the server is using a default self-signed cert.
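Put together, the configuration-side change amounts to something like this (a hedged sketch; the real Config class in mgr_sync/config.py reads /root/.mgr-sync and has more fields than modeled here):

```python
class Config:
    """Hypothetical stand-in for the mgr_sync Config class."""
    PORT = "port"

    def __init__(self, host="localhost"):
        self.host = host
        self.uri = "/rpc/api"
        # The fix discussed above: default to 443 instead of the old HTTP port.
        self._config = {Config.PORT: 443}

    @property
    def port(self):
        return self._config[Config.PORT]


def api_url(config):
    # Mirrors the url construction quoted from mgr_sync.py, with https.
    return "https://{0}:{1}{2}".format(config.host, config.port, config.uri)
```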

cbosdo commented 11 months ago

Thanks a lot for your investigations! I probably have broken that by forcing HTTPS. I'm currently traveling, I'll try to come up with a PR ASAP.

sasahodzic commented 11 months ago

OK, that's definitely the problem. Change mgr-sync.py to use url = "https://{0}:{1}{2}".format(self.config.host, and then edit ~/.mgr-sync to set mgrsync.port = 443 Ideally a PR would also change /usr/lib/python3.6/site-packages/spacewalk/susemanager/mgr_sync/config.py in the __init__ method to self._config[Config.PORT] = 443

We're using signed certs for the Uyuni server, so maybe that would create another problem where you may get an exception that needs to be handled if the server is using a default self-signed cert.

@ppanon2022 It's working. Thanks a lot!

We are waiting for PR.

cbosdo commented 10 months ago

OK, so it's a bit more complex than just forcing HTTPS: we need to use HTTP for localhost, since HTTPS would not work in a Kubernetes container.

cbosdo commented 10 months ago

I still find it strange that you have to use https to access /rhn/rpc/api: this is one of the few URLs that have no forced SSL redirection: https://github.com/uyuni-project/uyuni/blob/master/java/code/src/com/redhat/rhn/frontend/servlets/EnvironmentFilter.java#L48

cbosdo commented 10 months ago

@sasahodzic @ppanon2022 could you check whether the PR does the job for you too? Are you blocking port 80 on your server?

ppanon2022 commented 10 months ago

I don't think we're blocking port 80. I think there's a client redirect for http://server.fqdn/ to https://server.fqdn/rhn/YourRhn.do, but http://server.fqdn/somepaththatworksonhttps just returns a not found error (including when that path is /rhn/rpc/api).

Not Found The requested URL was not found on this server.

Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

And yes, that's including when using localhost on the server itself

$ curl -Sks http://localhost/rhn/rpc/api
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
<p>Additionally, a 404 Not Found
error was encountered while trying to use an ErrorDocument to handle the request.</p>
</body></html>

and it seems to be using the loopback address, although it's not clear whether that is IPv4 or IPv6: nslookup returns 127.0.0.1, but ping does

$ ping localhost
PING localhost(localhost (::1)) 56 data bytes
64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.085 ms
64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.068 ms
64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.068 ms
64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.068 ms

There's a host conf file for ssl but not http.

# ls /etc/apache2/vhosts.d/
cobbler.conf  vhost-ssl.conf  vhost-ssl.conf-swsave  vhost-ssl.template  vhost.template

The template is there (for listening on *:80) but it was never turned into a conf file. We don't really want an external port 80 service with users accidentally sending auth creds in plain text, so if you want port 80 on localhost, at minimum you would need to change vhost.template to <VirtualHost 127.0.0.1:80> before turning it into a real vhost file. But that doesn't appear to be the standard deployment config at all. If you've changed that in the container config so it can use port 80, that's OK, but it still has to work on standard OS installs, so those installations would need to match on localhost without unnecessarily exposing port 80 on the public NIC.

Why wouldn't port 443 work on K8s containers?

ppanon2022 commented 10 months ago

The PR is a definite improvement, because when the host is set to the FQDN it will work with https. But if someone doesn't override the new default host value of localhost, it wouldn't work in a non-containerized install, because that doesn't appear to have a port 80 config. So it should work for our existing environment, but I don't know how it would work for a new install.

cbosdo commented 10 months ago

I don't think we're blocking port 80. I think there's a client redirect for http://server.fqdn/ to https://server.fqdn/rhn/YourRhn.do, but http://server.fqdn/somepaththatworksonhttps just returns a not found error (including when that path is /rhn/rpc/api).

There are redirects implemented in the Java code, but not for all URLs and /rhn/rpc/api is one of those not redirected.

$ curl -Sks http://localhost/rhn/rpc/api
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
<p>Additionally, a 404 Not Found
error was encountered while trying to use an ErrorDocument to handle the request.</p>
</body></html>

Getting a 404 means that at least it is listening on port 80 and answering, I'd say. Also, calling curl on the XML-RPC API endpoint will likely not behave like the tool, as the endpoint expects a POST with an XML body, so it may not be the best way to test what happens. For more accurate testing, you could run this command:

curl -vL -X POST -H "Content-Type: text/xml" \
    --data "<?xml version='1.0' encoding='UTF-8'?><methodCall><methodName>api.getVersion</methodName><params></params></methodCall>" \
   http://localhost/rhn/rpc/api

You should get the XML response at the end of the output. The -L will let you see if any redirection to HTTPS is followed. This request should work for both http and https, but in the latter case you will need to add the -k parameter, since localhost usually does not match the DNS names in the certificate.
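The same probe can be built with nothing but the Python standard library: xmlrpc.client.dumps produces the same kind of methodCall body the curl command above posts (the commented-out URL is an assumption to adapt to your server):

```python
import xmlrpc.client

# Serialize an api.getVersion call the way the XML-RPC client tools do.
body = xmlrpc.client.dumps((), methodname="api.getVersion")
print(body)

# Posting it is one call; ServerProxy wraps the POST shown with curl above.
# client = xmlrpc.client.ServerProxy("http://localhost/rhn/rpc/api")
# print(client.api.getVersion())
```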

You can also see what is going by adding this to the <Loggers> section in /srv/tomcat/webapps/rhn/WEB-INF/classes/log4j2.xml:

<Logger name="com.redhat.rhn.frontend.servlets.EnvironmentFilter" level="debug" />

Then restart tomcat and run tail -f /var/log/rhn/rhn_web_ui.log

If the request is redirected to HTTPS by the Java code you will see a line like this one:

[...] DEBUG com.redhat.rhn.frontend.servlets.EnvironmentFilter - redirecting to secure: [...]

You should not have such a line for /rhn/rpc/api. If you do then the first thing to do to fix this issue would be to understand why you have that redirection.

The template is there (for listening on *:80) but it was never turned into a conf file. We don't really want an external port 80 service and having users accidentally send auth creds in plain text, so if you want port 80 on localhost, at minimum you would need to change the vhost.template to <VirtualHost 127.0.0.1:80> before making it into a real vhost file. But that doesn't appear to be the standard deployment config at all. If you've changed that for the container config so it can use port 80, then that's OK, but it still has to work on standard O/S installs so you have to change those installations to match on localhost without unnecessarily exposing port 80 on the public NIC.

You should have the following in zz-spacewalk-server.conf that redirects everything on port 80 to tomcat:

<VirtualHost *>

<IfModule mod_jk.c>
    # Inherit the mod_jk settings defined in zz-spacewalk-www.conf
    JkMountCopy On
</IfModule>

<Directory "/var/www/html/*">
        AllowOverride all
</Directory>

RewriteEngine on
RewriteOptions inherit
</VirtualHost>

Running ss -nlpt | grep ":80 " here shows apache listening on that port.

Why wouldn't port 443 work on K8s containers?

Because on Kubernetes the SSL connection is terminated at the ingress level; all communications inside the cluster use HTTP. This could help leverage tools like NeuVector for zero-trust features in the future.