uyuni-project / uyuni

Source code for Uyuni
https://www.uyuni-project.org/
GNU General Public License v2.0
430 stars 180 forks source link

mgr-sync-refresh-default task fails to schedule #7326

Open hbrown-uiowa opened 1 year ago

hbrown-uiowa commented 1 year ago

Problem description

The mgr-sync-refresh-default task fails to schedule and no repos are being synced unless we run spacewalk-repo-sync manually.

Unfortunately, it is working fine on our test instance.

Steps to reproduce

  1. I'm not sure how to reproduce it.
  2. ...

Uyuni version

Information for package Uyuni-Server-release:
---------------------------------------------
Repository     : Uyuni Server Stable
Name           : Uyuni-Server-release
Version        : 2023.04-220400.204.2.uyuni2
Arch           : x86_64
Vendor         : obs://build.opensuse.org/systemsmanagement:Uyuni
Support Level  : Level 3
Installed Size : 1.4 KiB
Installed      : Yes
Status         : out-of-date (version 2023.03-220400.199.1.uyuni2 installed)
Source package : Uyuni-Server-release-2023.04-220400.204.2.uyuni2.src
Summary        : Uyuni Server
Description    : 
    Uyuni lets you efficiently manage physical, virtual,
    and cloud-based Linux systems. It provides automated and cost-effective
    configuration and software management, asset management, and system
    provisioning.

Uyuni proxy version (if used)

No response

Useful logs

From rhn_taskomatic_daemon.log

2023-07-28 03:58:01,289 [DefaultQuartzScheduler_Worker-18] ERROR com.redhat.rhn.taskomatic.task.MgrSyncRefresh - Error during mgr-sync refresh
com.redhat.rhn.manager.content.ContentSyncException: No SCC organization credentials found.
        at com.redhat.rhn.manager.content.ContentSyncManager.filterCredentials(ContentSyncManager.java:238) ~[rhn.jar:?]
        at com.redhat.rhn.manager.content.ContentSyncManager.getProducts(ContentSyncManager.java:252) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.lambda$execute$0(MgrSyncRefresh.java:106) ~[rhn.jar:?]
        at com.redhat.rhn.common.util.FileLocks.lambda$withFileLock$1(FileLocks.java:79) ~[rhn.jar:?]
        at com.redhat.rhn.common.util.FileLocks.withFileLock(FileLocks.java:112) ~[rhn.jar:?]
        at com.redhat.rhn.common.util.FileLocks.withFileLock(FileLocks.java:78) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:100) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.RhnJavaJob.execute(RhnJavaJob.java:56) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.doExecute(TaskoJob.java:214) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.runTask(TaskoJob.java:180) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.execute(TaskoJob.java:133) ~[rhn.jar:?]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.0.jar:?]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.0.jar:?]
2023-07-28 03:58:01,382 [DefaultQuartzScheduler_Worker-18] DEBUG com.redhat.rhn.taskomatic.task.MgrSyncRefresh - Scheduling synchronization of all vendor channels
2023-07-28 03:58:01,496 [Thread-12] WARN  com.redhat.rhn.taskomatic.TaskoQuartzHelper - Job single-repo-sync-bunch-1 failed to schedule.
2023-07-28 03:58:01,701 [DefaultQuartzScheduler_Worker-18] ERROR com.redhat.rhn.taskomatic.task.MgrSyncRefresh - Executing a task threw an exception: org.quartz.JobExecutionException
org.quartz.JobExecutionException: com.redhat.rhn.taskomatic.TaskomaticApiException: redstone.xmlrpc.XmlRpcException: The response could not be parsed.
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:136) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.RhnJavaJob.execute(RhnJavaJob.java:56) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.doExecute(TaskoJob.java:214) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.runTask(TaskoJob.java:180) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.execute(TaskoJob.java:133) ~[rhn.jar:?]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.0.jar:?]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.0.jar:?]
Caused by: com.redhat.rhn.taskomatic.TaskomaticApiException: redstone.xmlrpc.XmlRpcException: The response could not be parsed.
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:92) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more
Caused by: redstone.xmlrpc.XmlRpcException: The response could not be parsed.
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:435) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:209) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:89) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more
Caused by: redstone.xmlrpc.XmlRpcException: A problem occured during parsing
        at redstone.xmlrpc.XmlRpcParser.parse(XmlRpcParser.java:163) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:420) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:209) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:89) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more
Caused by: org.xml.sax.SAXParseException: Premature end of file.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source) ~[xerces-j2.jar:?]
        at redstone.xmlrpc.XmlRpcParser.parse(XmlRpcParser.java:158) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:420) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:209) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:89) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more
2023-07-28 03:58:01,706 [DefaultQuartzScheduler_Worker-18] DEBUG com.redhat.rhn.taskomatic.TaskoJob - mgr-sync-refresh (mgr-sync-refresh-default) ... failed
2023-07-28 03:58:01,713 [DefaultQuartzScheduler_Worker-18] INFO  com.redhat.rhn.taskomatic.TaskoJob - Sending e-mail ... mgr-sync-refresh
2023-07-28 03:58:02,088 [DefaultQuartzScheduler_Worker-18] INFO  com.redhat.rhn.taskomatic.TaskoJob - mgr-sync-refresh-default: bunch mgr-sync-refresh-bunch FINISHED
2023-07-28 03:58:02,088 [DefaultQuartzScheduler_Worker-18] DEBUG com.redhat.rhn.taskomatic.TaskoJob - mgr-sync-refresh-default: bunch mgr-sync-refresh-bunch START: 03:58:00 END: 03:58:02

The only calls in the rhn_web_api.log from that time stamp are:
[2023-07-28 03:58:01,205] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoSyncCronExpression(uyuni-admin, ubuntu-2204-amd64-universe-uyuni) CALLER: (uyuni-admin) TIME: 0.213 seconds
[2023-07-28 03:58:01,225] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getDetails(uyuni-admin, ubuntu-2204-amd64-universe-updates-uyuni) CALLER: (uyuni-admin) TIME: 0.009 seconds
[2023-07-28 03:58:01,245] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.listChannelRepos(uyuni-admin, ubuntu-2204-amd64-universe-updates-uyuni) CALLER: (uyuni-admin) TIME: 0.012 seconds
[2023-07-28 03:58:01,268] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.access.getOrgSharing(uyuni-admin, ubuntu-2204-amd64-universe-updates-uyuni) CALLER: (uyuni-admin) TIME: 0.019 seconds
[2023-07-28 03:58:01,507] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoSyncCronExpression(uyuni-admin, ubuntu-2204-amd64-universe-updates-uyuni) CALLER: (uyuni-admin) TIME: 0.235 seconds
[2023-07-28 03:58:01,580] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getDetails(uyuni-admin, ubuntu-2204-amd64-uyuni-client) CALLER: (uyuni-admin) TIME: 0.056 seconds
[2023-07-28 03:58:01,597] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.listChannelRepos(uyuni-admin, ubuntu-2204-amd64-uyuni-client) CALLER: (uyuni-admin) TIME: 0.009 seconds
[2023-07-28 03:58:01,625] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.access.getOrgSharing(uyuni-admin, ubuntu-2204-amd64-uyuni-client) CALLER: (uyuni-admin) TIME: 0.022 seconds
[2023-07-28 03:58:01,906] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoSyncCronExpression(uyuni-admin, ubuntu-2204-amd64-uyuni-client) CALLER: (uyuni-admin) TIME: 0.265 seconds
[2023-07-28 03:58:01,928] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.listUserRepos(uyuni-admin) CALLER: (uyuni-admin) TIME: 0.012 seconds
[2023-07-28 03:58:01,966] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoDetails(uyuni-admin, External - RockyLinux 8 (x86_64)) CALLER: (uyuni-admin) TIME: 0.004 seconds
[2023-07-28 03:58:01,973] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoDetails(uyuni-admin, External - Rocky Linux 8 - AppStream (x86_64)) CALLER: (uyuni-admin) TIME: 0.002 seconds
[2023-07-28 03:58:01,984] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoDetails(uyuni-admin, External - Rocky Linux 8 - Extras (x86_64)) CALLER: (uyuni-admin) TIME: 0.007 seconds
[2023-07-28 03:58:01,990] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoDetails(uyuni-admin, External - Rocky Linux 8 - High Availability (x86_64)) CALLER: (uyuni-admin) TIME: 0.002 seconds
[2023-07-28 03:58:01,998] INFO  - REQUESTED FROM: 127.0.0.1 CALL: channel.software.getRepoDetails(uyuni-admin, External - Rocky Linux 8 - PowerTools (x86_64)) CALLER: (uyuni-admin) TIME: 0.002 seconds

/var/log/tomcat/localhost_access_log.txt
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 708
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 141
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 135
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 2654
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 731
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 141
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 135
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 2494
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 687
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 141
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 135
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 2483
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 676
127.0.0.1 - - [23/Jul/2023:03:58:01 -0500] "POST /rhn/rpc/api HTTP/1.1" 200 141

Additional information

There were two changes that happened during this time. We patched openSUSE Leap 15.4 to the latest OS patches, had issues with salt and downgraded.

We also tried adding a new channel and setting a schedule, the api and spacecmd refused to let us set a schedule for it and some quick web searching made it seem like all repo sync was intended to be done via the mgr-sync-refresh-default task instead. As a result, I stripped all schedules from channels.

I can't seem to find a way to see what the failing api call is or what the response is that isn't getting parsed.

Rebooting the server, restarting all services doesn't seem to make a difference.

hbrown-uiowa commented 1 year ago

With debug turned up for com.redhat.rhn, I get this additional bit:

2023-07-28 11:32:00,639 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.taskomatic.task.MgrSyncRefresh - Scheduling synchronization of all vendor channels
2023-07-28 11:32:00,643 [Thread-3] DEBUG com.redhat.rhn.common.hibernate.DefaultConnectionManager - YYY Opening Hibernate Session
2023-07-28 11:32:00,644 [Thread-3] DEBUG com.redhat.rhn.common.hibernate.DefaultConnectionManager - YYY Opened Hibernate session SessionImpl(56574899<open>)
2023-07-28 11:32:00,646 [Thread-3] DEBUG com.redhat.rhn.common.hibernate.DefaultConnectionManager - YYY Closing Hibernate Session: SessionImpl(56574899<open>)
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() called with: tasko_server.host
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> Getting property: host
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> result: null
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> returning: null
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() - returning default value
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() called with: tasko_server.port
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> Getting property: port
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> result: null
2023-07-28 11:32:00,648 [DefaultQuartzScheduler_Worker-16] DEBUG com.redhat.rhn.common.conf.Config - getString() -     getString() -> returning: null
2023-07-28 11:32:00,676 [Thread-4] DEBUG com.redhat.rhn.common.hibernate.DefaultConnectionManager - YYY Opening Hibernate Session
2023-07-28 11:32:00,676 [Thread-4] DEBUG com.redhat.rhn.common.hibernate.DefaultConnectionManager - YYY Opened Hibernate session SessionImpl(1632100143<open>)
2023-07-28 11:32:00,707 [Thread-4] WARN  com.redhat.rhn.taskomatic.TaskoQuartzHelper - Job single-repo-sync-bunch-1 failed to schedule.
avshiliaev commented 1 year ago

@mackdk I'm linking you here cause it seems like an issue already reported and you might be familiar with it :smiley:

hbrown-uiowa commented 10 months ago

This continues to be a problem on 2023.10.

The error message has changed somewhat (from rhn_taskomatic_daemon.log)

2023-11-28 11:32:00,521 [DefaultQuartzScheduler_Worker-70] ERROR com.redhat.rhn.taskomatic.task.MgrSyncRefresh - Executing a task threw an exception: org.quartz.JobExecutionException
org.quartz.JobExecutionException: com.redhat.rhn.taskomatic.TaskomaticApiException: redstone.xmlrpc.XmlRpcFault: org.quartz.ObjectAlreadyExistsException: Unable to store Job : '1.single-repo-sync-bunch-1', because one already exists with this identification.
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:136) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.RhnJavaJob.execute(RhnJavaJob.java:56) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.doExecute(TaskoJob.java:240) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.runTemplate(TaskoJob.java:193) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskoJob.execute(TaskoJob.java:145) ~[rhn.jar:?]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.3.0.jar:?]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) ~[quartz-2.3.0.jar:?]
Caused by: com.redhat.rhn.taskomatic.TaskomaticApiException: redstone.xmlrpc.XmlRpcFault: org.quartz.ObjectAlreadyExistsException: Unable to store Job : '1.single-repo-sync-bunch-1', because one already exists with this identification.
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:92) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more
Caused by: redstone.xmlrpc.XmlRpcFault: org.quartz.ObjectAlreadyExistsException: Unable to store Job : '1.single-repo-sync-bunch-1', because one already exists with this identification.
        at redstone.xmlrpc.XmlRpcClient.handleResponse(XmlRpcClient.java:444) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.endCall(XmlRpcClient.java:376) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at redstone.xmlrpc.XmlRpcClient.invoke(XmlRpcClient.java:209) ~[redstone-xmlrpc-client-1.1_20071120.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.invoke(TaskomaticApi.java:89) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.TaskomaticApi.scheduleSingleRepoSync(TaskomaticApi.java:172) ~[rhn.jar:?]
        at com.redhat.rhn.taskomatic.task.MgrSyncRefresh.execute(MgrSyncRefresh.java:127) ~[rhn.jar:?]
        ... 6 more

Is there db cleanup required?

We currently have java.unify_custom_channel_management = 0 set in rhn.conf but I can't set schedules either via the web interface or via spacecmd.