uyuni-project / uyuni

Source code for Uyuni
https://www.uyuni-project.org/
GNU General Public License v2.0
428 stars 179 forks source link

Database schema upgrade fails (schema 4.3.6 to 4.3.7) - Uyuni 2022.01 #4858

Open mwilkinson59 opened 2 years ago

mwilkinson59 commented 2 years ago

Question

I've recently ran patches on my Uyuni server and the postgres DB fails to start after rebooting. The service uyuni-check-database.service shows the following:

systemd[1]: Starting Uyuni check database... spacewalk-startup-helper[2070]: Found unfinished schema migration spacewalk-startup-helper[2070]: Try to continue schema migration spacewalk-startup-helper[2070]: Planning to run schema upgrade with dir '/var/log/spacewalk/schema-upgrade/schema-from-20220204-085523' spacewalk-startup-helper[2070]: Executing spacewalk-sql, the log is in [/var/log/spacewalk/schema-upgrade/schema-from-20220204-085523-to-susemanager-schema> spacewalk-startup-helper[2070]: (1/2) apply upgrade [susemanager-schema-4.3.6-to-susemanager-schema-4.3.7/001-centos8-eol.sql.postgresql] spacewalk-startup-helper[2070]: Upgrade failed, please see log [/var/log/spacewalk/schema-upgrade/schema-from-20220204-085523-to-susemanager-schema-4.3.7.l> spacewalk-startup-helper[2070]: You can fix the failed migration in [/var/log/spacewalk/schema-upgrade/schema-from-20220204-085523] spacewalk-startup-helper[2070]: and run spacewalk-schema-upgrade again spacewalk-startup-helper[1993]: Database schema upgrade failed. Please check the logs. systemd[1]: uyuni-check-database.service: Main process exited, code=exited, status=1/FAILURE systemd[1]: uyuni-check-database.service: Failed with result 'exit-code'. systemd[1]: Failed to start Uyuni check database.

In the log, schema-from-20220204-085523-to-susemanager-schema-4.3.7.log, the error is repeated:

psql::42: ERROR: null value in column "label" of relation "rhncontentsource" violates not-null constraint

Is this actually a bug or did something just get messed up during the patching? Either way, wondering how I can either fix the db schema upgrade so that it completes or otherwise get the db back up and running.

Any help is appreciated! Thank you.

mwilkinson59 commented 2 years ago

I logged into the db and selected the data from the 'rhncontentsource' table. Looks like there are 10 rows in that table currently:

uyuni=# SELECT id, label FROM rhncontentsource; id | label -----+--------------------------------------------------------- 500 | Internal - Oracle Linux 8 x86_64 501 | Internal - Oracle Linux 8 AppStream x86_64 503 | Internal - Oracle Linux 7 x86_64 505 | Internal - Addons for Oracle Linux 7 x86_64 502 | External - Uyuni Client Tools for Oracle Linux 8 x86_64 504 | External - Uyuni Client Tools for Oracle Linux 7 x86_64 509 | External - Uyuni Client Tools for Oracle Linux 6 x86_64 510 | Internal - Addons for Oracle Linux 6 x86_64 508 | Internal - Oracle Linux 6 x86_64 506 | Internal - Oracle Linux 6 ELS x86_64 (10 rows)

In the log for this schema upgrade (schema-from-20220204-085523-to-susemanager-schema-4.3.7.log) it seems like it's trying to insert new rows with NULL data into the label column which is why it's throwing the error because the column is set to not-null?

susemanager-schema-4.3.6-to-susemanager-schema-4.3.7/001-centos8-eol.sql.postgresql psql::42: ERROR: null value in column "label" of relation "rhncontentsource" violates not-null constraint DETAIL: Failing row contains (511, null, 500, https://download.opensuse.org/repositories/systemsmanagement:/Uy..., null, N, 2022-02-04 08:55:42.520211-06, 2022-02-04 08:55:42.520211-06). ...

Anyone have an idea how I could edit the schema upgrade sql to not try to insert this NULL data? Or is it confused that I'm using Oracle Linux and not CentOS?

mwilkinson59 commented 2 years ago

As a stop-gap I altered the 'label' column of the 'rhncontentsource' table and set it to drop the "not null" setting. After that, the sql script could process and I was able to get Uyuni services to start up again. Though I have no idea if that's "OK" or not...

juliogonzalez commented 2 years ago

Removing the not null is dangerous, I'd not recommend doing it at all.

That said, we'd need a reproducer to understand what's going on, meaning:

It could be https://github.com/uyuni-project/uyuni/pull/4723 introduced a problem. But since as you can see, I did extensive testing and no failures we found, so this could be a corner case that we can't reproduce without the information requested above (specially because this was not reported by any other users).

mwilkinson59 commented 2 years ago

Thank you for the response! I ended up setting the "not null" back to the column once the sql script processed and I had inserted some data. I'll show you here what Uyuni version and list of channels, repos:

channels/repos:

08:14:51 ======================================
08:14:51 |   Channel Label   |   Repository   |
08:14:51 ======================================
08:14:51 oraclelinux8-x86_64 | https://yumserver.localdomain.com/yum/OracleLinux/OL8/baseos/latest/x86_64/
08:14:51 oraclelinux6-x86_64 | https://yumserver.localdomain.com/yum/OracleLinux/OL6/latest/x86_64/
08:14:51 oraclelinux8-uyuni-client-x86_64 | https://provo-mirror.opensuse.org/repositories/systemsmanagement:/Uyuni:/Stable:/CentOS8-Uyuni-Client-Tools/CentOS_8/
08:14:51 oraclelinux8-appstream-x86_64 | https://yumserver.localdomain.com/yum/OracleLinux/OL8/appstream/x86_64/
08:14:51 oraclelinux6-uyuni-client-x86_64 | https://provo-mirror.opensuse.org/repositories/systemsmanagement:/Uyuni:/Stable:/CentOS6-Uyuni-Client-Tools/CentOS_6/
08:14:51 oraclelinux7-uyuni-client-x86_64 | https://provo-mirror.opensuse.org/repositories/systemsmanagement:/Uyuni:/Stable:/CentOS7-Uyuni-Client-Tools/CentOS_7/
08:14:51 oraclelinux7-x86_64-addons | http://yumserver.localdomain.com/yum/OracleLinux/OL7/addons/x86_64/
08:14:51 oraclelinux7-x86_64 | http://yumserver.localdomain.com/yum/OracleLinux/OL7/latest/x86_64/
08:14:51 oraclelinux6-x86_64-addons | https://yumserver.localdomain.com/yum/OracleLinux/OL6/addons/x86_64/
08:14:51 ol6_x86_64_els | https://yumserver.localdomain.com/yum/OracleLinux/OL6/ELS/x86_64/
08:14:51 ol8-streams-ol8-test-env-oraclelinux8-appstream-x86_64 | No repository set
08:14:51 ol8-streams-ol8-test-env-oraclelinux8-x86_64 | No repository set
08:14:51 ol8-streams-ol8-prod-env-oraclelinux8-x86_64 | No repository set
08:14:51 ol8-streams-ol8-prod-env-oraclelinux8-appstream-x86_64 | No repository set
08:14:51 ol8-streams-ol8-test-env-oraclelinux8-uyuni-client-x86_64 | No repository set
08:14:51 ol8-streams-ol8-prod-env-oraclelinux8-uyuni-client-x86_64 | No repository set

showing I set the not null back:

uyuni=# \d rhncontentsource;
                            Table "public.rhncontentsource"
     Column      |           Type           | Collation | Nullable |      Default
-----------------+--------------------------+-----------+----------+-------------------
 id              | numeric                  |           | not null |
 org_id          | numeric                  |           |          |
 type_id         | numeric                  |           | not null |
 source_url      | character varying(2048)  |           | not null |
 label           | character varying(128)   |           | not null |

Note previously (before the schema upgrade) there were 10 rows that were configured for the rhncontentsource. After the schema upgrade, there's 11 (but I inserted the data into the label for the last row so it wouldn't be null):

uyuni=# SELECT id, label FROM rhncontentsource;
 id  |                          label
-----+---------------------------------------------------------
 500 | Internal - Oracle Linux 8 x86_64
 501 | Internal - Oracle Linux 8 AppStream x86_64
 503 | Internal - Oracle Linux 7 x86_64
 505 | Internal - Addons for Oracle Linux 7 x86_64
 502 | External - Uyuni Client Tools for Oracle Linux 8 x86_64
 504 | External - Uyuni Client Tools for Oracle Linux 7 x86_64
 509 | External - Uyuni Client Tools for Oracle Linux 6 x86_64
 510 | Internal - Addons for Oracle Linux 6 x86_64
 508 | Internal - Oracle Linux 6 x86_64
 506 | Internal - Oracle Linux 6 ELS x86_64
 521 | EL8-Uyuni-Client-Tools
(11 rows)
juliogonzalez commented 2 years ago

@mwilkinson59 I need the Uyuni version before the update.

I understand you updated to 2022.01 from a previous version. That's the one we need.

As for the repositories: one problem I see is that the repositories are not using download.opensuse.org but https://provo-mirror.opensuse.org. Why? Did you add such repositories with spacewalk-common-channels? If not, the migration could fail if things are not according to what spacewalk-common-channels would do.

mwilkinson59 commented 2 years ago

Oh ok. I'm not sure how to determine what the version was before the update. Is there a way I can find out? On the repo side, we have an enterprise internet proxy and download.opensuse.org wasn't working with it. Basically I had to have firewall rules to one particular mirror in order to get the proxy working, so I chose provo-mirror to pin downloads from one mirror.

juliogonzalez commented 2 years ago

Something tells me that's the issue, as https://github.com/uyuni-project/uyuni/pull/4723/files expects download.opensuse.org to handle all the replacements.

Try changing the repository oraclelinux8-uyuni-client-x86_64 for the client tools to use download.opensuse.org, then apply the SQL migration again. After that, change back to that specific mirror.

Needless to say, you should have a backup before doing anything... just in case :-)

mwilkinson59 commented 2 years ago

Yeah that's interesting. I'll keep that in mind. Since I already completed the SQL migration and manually adjusted the DB to make the system work, can I re-run the SQL migration script?

juliogonzalez commented 2 years ago

Since I already completed the SQL migration and manually adjusted the DB to make the system work This is problematic, as I don't have a clue of what's the status of the DB now

If after your manual migration:

Then I think you are in a consistent state, and already with the fix applied to use the EL8 client tools (which are now used for all RHEL8 compatible OS)

mwilkinson59 commented 2 years ago

OK thank you for all your insight. This is a Test server so I may just reinstall it just to be on the safe side. I'll certainly be snapshotting the system in the future before patching :)

Last thing, since you believe having changed the repo from download.opensuse.org to provo-mirror.opensuse.com caused the sql migration to fail, would it be a problem for me to create a feature request to have the DB schema support opensuse mirrors instead of only download.opensuse.org?

juliogonzalez commented 2 years ago

Opening feature requests is free (in both meanings of the word), and I encourage you to open it, so it can be evaluated :-)

However, I think this is not trivial and way bigger than changing the DB schema. It will also involve spacewalk-common-channels (and I can't see how we can easily add this feature keeping mind how it works), which we intend to retire at some point, so mgr-sync and the Setup Wizard can be used and all channels are treated equally for all users.

On the repo side, we have an enterprise internet proxy and download.opensuse.org wasn't working with it.

If I can ask: Why? Because the proxy denied redirections, as download.opensuse.org does? Or because it needs to allow all opensuse mirror hostnames and that's not wanted?

Because if that's the case, it can break other things. And using specific mirrors is always problematic, as they can go up and down at any moment, or can be removed. That's why all distributions encourage users to use the mirrorlists/mirrordirectors.

Not sure if @mcalmer can provide a piece of advice here, as I remember we handle some similar situation in the past, but can't recall what the outcome was.

mwilkinson59 commented 2 years ago

OK thank you for the info, that's very helpful. We also have SUSE Manager on SLES and have experienced an issue where we aren't able to direct or configure the Oracle Linux channels that come from the Admin > Products section to use our own internal repositories.

For example, we have an internal yum server which mirrors all the Oracle ULN (paid subscription) repos/channels every night. This is setup through our FW, it syncs all the packages every night. All good.

Now, when we setup SUSE Manager and needed to use the client "Manager Tools" child channels, we had to use these Oracle Linux products in Admin > Products, however they default to going out to yum.oracle.com. For us, that means we have to configure the SUMA server to also go about to the internet, through the FW and get packages that we already have on-site. So I wish there could be something done there and I think I'll definitely open a feature request for that just in case.

mwilkinson59 commented 2 years ago

If I can ask: Why? Because the proxy denied redirections, as download.opensuse.org does? Or because it needs to allow all opensuse mirror hostnames and that's not wanted?

Yeah basically when going through our proxy, the connections to download.opensuse.org was so slow because the proxy has to scan each and every file being downloaded. For small kb RPMs that's no big deal, but when it comes to the kernel or java or other RPMs that are maybe many 10's of MB in size, the proxy takes a long time to scan all these. It was simply faster and more reliable to define a particular mirror (provo-mirror) and use that through the FW than to have the proxy try to scan everything.

mcalmer commented 2 years ago

Because if that's the case, it can break other things. And using specific mirrors is always problematic, as they can go up and down at any moment, or can be removed. That's why all distributions encourage users to use the mirrorlists/mirrordirectors.

Not sure if @mcalmer can provide a piece of advice here, as I remember we handle some similar situation in the past, but can't recall what the outcome was.

This is not really easy to solv. As said, we need to think about all users in the world and we have to use a kind of mirror redirector. CentOS uses mirrorlist which return you a list of URLs in best order for your location. So you cannot know really know where you will really download from. OpenSUSE send redirects to new URLs. Similar problem.

The only chance is an own mirror server which you configure and take care that all updates are available. Then you export the disk via NFS (or similar) and mount it on the Uyuni/SUSE Manager Server.

You can place also SCC data there to make the Product List work. See https://www.uyuni-project.org/uyuni-docs/en/uyuni/administration/disconnected-setup.html#_synchronize_a_disconnected_server

The only requirement is a structured directory structure. SMT/RMT produce it already for SUSE Products. For other OSes you need to configure it on your own. For Debian like products apt-mirror could be used.

juliogonzalez commented 2 years ago

@mwilkinson59, do we still need this issue?

mwilkinson59 commented 2 years ago

@juliogonzalez Yes, I still believe so but I will read @mcalmer 's response and then i'll respond yet today.

mwilkinson59 commented 2 years ago

So a couple diff questions:

  1. On the RMT front, I'm very interested in setting one of these up. We do this already with a local yum mirror for our Oracle Linux stuff. In particular, we would need one of these for some isolated environments. Is it necessary to use a SLES server for an rmt-server or can an OpenSUSE server work so long as we have SCC credentials.
  2. We switched back to using download.opensuse.org on the uyuni tools channels and were able to get it working. I still believe it'd be useful for users to be able to point the Uyuni repos anywhere they like, but I understand the complexity of it. Is your suggestion to ideally setup an RMT for the Uyuni client tools?
mwilkinson59 commented 2 years ago

Then you export the disk via NFS (or similar) and mount it on the Uyuni/SUSE Manager Server.

Can I not utilize the RMT server as a drop-in replacement for SCC? I.e. do I have to export a disk and mount it on the SUMA server?