vitabaks / postgresql_cluster

PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
MIT License

Python 3.11 use on RedHat 8 and above #573

Closed weisscorp closed 2 months ago

weisscorp commented 2 months ago

Hi there!

We've encountered the need to upgrade Python to version 3.11 in our company. I've made some minor code adjustments to ensure compatibility with Python 3.11 by default. Since we don't have Debian servers, I wasn't able to test this. The primary constraint for using Python 3.11 is RedHat 8 and above. Kindly review my commit.

Additionally:

vitabaks commented 2 months ago

@weisscorp Thank you for your contribution.

Could you describe in more detail why the Python version needs to be pinned to 3.11? Why should this be redefined at the postgresql_cluster project level (and therefore for everyone)?

I ask because variables can (and some need to) be redefined to suit your requirements; this doesn't always need to be changed at the project level.

weisscorp commented 2 months ago

The request may not seem significant, but it is primarily for security reasons. First, Patroni's API exposes the version of Python and the version of Patroni itself. Second, it keeps Python current: versions 3.6 and 3.7 are no longer supported, and only security patches are released for 3.8-3.10.

vitabaks commented 2 months ago

@weisscorp I made a minor correction: 5a2c6ac5c6c121ad4e391037e1c56371e8932bfd, 64b6afa84e5a2e56adbf1573490b171bfc2d8d65

weisscorp commented 2 months ago

    - python{{ python_version }}-libselinux
    - python{{ python_version }}-libsemanage
    - python{{ python_version }}-policycoreutils

I didn’t find python3.11-* versions of these packages in the system, so I left them as python3-*.

vitabaks commented 2 months ago

@weisscorp if you have time, try to figure out the cause of the "No module named 'dnf'" error on RHEL 9.

As I understand it, we need either the full path to the dnf tool or the PATH variable added to the tasks, but it is not clear why this became necessary only now, with Python 3.11.

weisscorp commented 2 months ago

@vitabaks Sure, I'll take a look

weisscorp commented 2 months ago

I couldn't reproduce the problem in the image, so I specified the absolute path. It's worth running the tests again.

weisscorp commented 2 months ago

What is this for? I think there is no need to touch the system DNS servers. They are in no way related to patroni.

Add DNS server(s) into /etc/resolv.conf

vitabaks commented 2 months ago

"What is this for? I think there is no need to touch the system DNS servers. They are in no way related to patroni."

Without DNS, it will not be possible to download packages from the Internet because there will be an error "could not resolve host".

In this case, the problem is with the task "Get epel-release-latest rpm package".

 fatal: [10.172.0.20]: FAILED! => {"changed": false, "dest": "/tmp/", "elapsed": 0, "gid": 0, "group": "root", "mode": "01777", "msg": "Request failed", "owner": "root", "response": "HTTP Error 404: Not Found", "size": 4096, "state": "directory", "status_code": 404, "uid": 0, "url": "https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm"}

  TASK [add-repository : Get epel-release-latest rpm package] ********************
  changed: [10.172.0.21]
  changed: [10.172.0.22]

  NO MORE HOSTS LEFT *************************************************************

It's not related to this PR. I'll check it out.

weisscorp commented 2 months ago

Just try restarting the tests; it seems to have been a momentary problem related to the update. To avoid this, you should add a retry. For example:

    - name: Get epel-release-latest rpm package
      ansible.builtin.get_url:
        url: "https://dl.fedoraproject.org/pub/epel/epel-release-latest-{{ ansible_distribution_major_version }}.noarch.rpm"
        dest: /tmp/
        timeout: 30
        validate_certs: false
      when: install_epel_repo | bool
      # Retry up to 5 times, 10 seconds apart, to ride out transient download errors
      register: get_epel_package_result
      until: get_epel_package_result is succeeded
      retries: 5
      delay: 10
      tags: install_epel_repo

vitabaks commented 2 months ago

"To avoid this, you should add a retry"

Feel free to suggest a new PR.

weisscorp commented 2 months ago

I don’t like this error: it looks as if the default Python for the entire system had changed to the new version, when it only needed to change for Patroni. Give me some time to review the PR and the code.
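
One way to check whether the system-wide default really changed is to see which binary a script's shebang ends up resolving to. The sketch below is only an illustration: `shebang_target` is a hypothetical helper that is not part of postgresql_cluster, it assumes GNU coreutils (`readlink -f`), and the demo runs against a scratch directory that mimics a symlink chain rather than the real /usr/bin.

```shell
# shebang_target is a hypothetical helper, not part of postgresql_cluster:
# print the real binary that a script's shebang resolves to.
shebang_target() {
  first_line="$(head -n 1 "$1")"
  case "$first_line" in
    '#!'*) ;;
    *) echo "no shebang in $1" >&2; return 1 ;;
  esac
  # Drop "#!", optional spaces after it, and any interpreter arguments
  interp="$(printf '%s\n' "$first_line" | sed -e 's/^#! *//' -e 's/ .*$//')"
  # readlink -f follows the whole symlink chain (GNU coreutils)
  readlink -f "$interp"
}

# Demo in a scratch directory that mimics a python3 -> alternatives -> python3.11 chain
tmp="$(mktemp -d)"
touch "$tmp/python3.11"
ln -s "$tmp/python3.11" "$tmp/alt-python3"
ln -s "$tmp/alt-python3" "$tmp/python3"
printf '#!%s\nfrom dnf.cli import main\n' "$tmp/python3" > "$tmp/dnf"
shebang_target "$tmp/dnf"   # prints the path of the scratch python3.11
```

On a real host the call would presumably be `shebang_target /usr/bin/dnf`. If it prints a newer interpreter than the one the dnf Python module was installed for, an import error like the one discussed here would be the expected result.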

vitabaks commented 2 months ago

cc @weisscorp

I had similar problem on Rocky Linux 9 Docker image (with pre-installed Python 3.9) when installing Python 3.11 from packages. ... Solution: for me helped to replace default #!/usr/bin/python3 shebang in /usr/bin/dnf Python script with a specific one #!/usr/bin/python3.9:

sed -i 's|#!/usr/bin/python3|#!/usr/bin/python3.9|g' /usr/bin/dnf

https://stackoverflow.com/questions/53894712/modulenotfounderror-no-module-named-dnf-when-running-yum-or-dnf

dnf works with python3.9

[root@pgnode01 /]# dnf --version
Traceback (most recent call last):
  File "/usr/bin/dnf", line 61, in <module>
    from dnf.cli import main
ModuleNotFoundError: No module named 'dnf'
[root@pgnode01 /]# 
[root@pgnode01 /]# ls -l /usr/bin/python*
lrwxrwxrwx 1 root root    25 Feb  8 11:50 /usr/bin/python3 -> /etc/alternatives/python3
-rwxr-xr-x 1 root root 15640 Jan 11 22:10 /usr/bin/python3.11
-rwxr-xr-x 1 root root    62 Jan 11 22:10 /usr/bin/python3.11-config
-rwxr-xr-x 1 root root  3584 Jan 11 22:00 /usr/bin/python3.11-x86_64-config
-rwxr-xr-x 1 root root 15448 Dec 12  2022 /usr/bin/python3.9
[root@pgnode01 /]# 
[root@pgnode01 /]# ls -l /etc/alternatives/python3
lrwxrwxrwx 1 root root 19 Feb  8 11:50 /etc/alternatives/python3 -> /usr/bin/python3.11
[root@pgnode01 /]# vim /usr/bin/dnf
[root@pgnode01 /]#  head -1 /usr/bin/dnf
#!/usr/bin/python3
[root@pgnode01 /]# sed -i 's|#!/usr/bin/python3|#!/usr/bin/python3.9|g' /usr/bin/dnf
[root@pgnode01 /]#  head -1 /usr/bin/dnf
#!/usr/bin/python3.9
[root@pgnode01 /]# dnf --version
4.14.0
  Installed: dnf-0:4.14.0-6.el9.noarch at Tue 30 May 2023 01:55:22 PM GMT
  Built    : builder@centos.org at Thu 11 May 2023 12:24:52 PM GMT

  Installed: rpm-0:4.16.1.3-23.el9.x86_64 at Tue 30 May 2023 01:55:20 PM GMT
  Built    : builder@centos.org at Thu 04 May 2023 08:08:32 AM GMT
[root@pgnode01 /]# 
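
A caveat on the sed one-liner above: because it matches `#!/usr/bin/python3` anywhere on a line, running it a second time would turn `#!/usr/bin/python3.9` into `#!/usr/bin/python3.9.9`. A more defensive variant might look like the sketch below. `pin_shebang` is a hypothetical helper name, GNU sed's `-i` is assumed, and the demo uses a throwaway file instead of the real /usr/bin/dnf.

```shell
# pin_shebang is a hypothetical helper, not part of postgresql_cluster.
# It pins a script's generic python3 shebang to a specific interpreter.
pin_shebang() {
  script="$1"       # e.g. /usr/bin/dnf
  interpreter="$2"  # e.g. /usr/bin/python3.9
  # Only line 1 is touched, and only if it is exactly the generic shebang,
  # so a second run cannot produce "#!/usr/bin/python3.9.9".
  sed -i "1s|^#!/usr/bin/python3\$|#!${interpreter}|" "$script"
}

# Demo on a throwaway file instead of the real /usr/bin/dnf
demo="$(mktemp)"
printf '#!/usr/bin/python3\nfrom dnf.cli import main\n' > "$demo"
pin_shebang "$demo" /usr/bin/python3.9
pin_shebang "$demo" /usr/bin/python3.9   # second run changes nothing
head -n 1 "$demo"
```

Anchoring the pattern to the first line and to the end of the line is what makes the command safe to repeat, which matters if it ever ends up in a task that runs on every playbook execution.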

weisscorp commented 2 months ago

@vitabaks I'm not sure exactly which container you're using, but in the default oraclelinux:9 image the dnf shebang is already written correctly:

#!/usr/bin/python3.9

vitabaks commented 2 months ago

"I'm not sure exactly what container you're using"

glillico/docker-centosstream9-ansible:latest

UPD: it may be worth switching to other images, since the glillico images have not been updated for more than 6 months (this may be the cause).

gitpod@weisscorp-postgresqlclu-5n9oqhcrhuy:/workspace/postgresql_cluster$ docker images
REPOSITORY                              TAG       IMAGE ID       CREATED        SIZE
glillico/docker-centosstream9-ansible   latest    0e6977556087   8 months ago   576MB

UPD2: I will add system updates before the testing - https://github.com/vitabaks/postgresql_cluster/pull/575

[root@pgnode01 /]# cat /usr/bin/dnf | head -n 1
#!/usr/bin/python3
[root@pgnode01 /]# dnf update
...
Complete!
[root@pgnode01 /]# 
[root@pgnode01 /]# cat /usr/bin/dnf | head -n 1
#!/usr/bin/python3.9