TOSIT-IO / tdp-collection-extras

Ansible roles to deploy the extra components of TDP
Apache License 2.0
5 stars 13 forks source link

postgres: Connection to master-03:5432 refused #70

Closed jusstol closed 2 years ago

jusstol commented 2 years ago

Hi, I tried to use the most recent revision of this repo to make a fresh new install and now my PostgreSQL clients (like Hive or Ranger) cannot connect to PostgreSQL server. Here is what happens when installing Hive :

TASK [tosit.tdp.common : Hive Metastore initSchema] ************************************************************************************************************************************
fatal: [master-03]: FAILED! => changed=true 
  cmd: |-
    /opt/tdp/hive/bin/hive  --config /etc/hive/conf.ms  --service schemaTool  -dbType postgres  -initSchema
  delta: '0:00:03.546868'
  end: '2022-09-08 12:09:09.077704'
  msg: non-zero return code
  rc: 1
  start: '2022-09-08 12:09:05.530836'
  stderr: |-
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/tdp/apache-hive-3.1.3-TDP-0.1.0-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/tdp/hadoop-3.1.1-TDP-0.1.0-SNAPSHOT/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
    Underlying cause: org.postgresql.util.PSQLException : Connection to master-03.applispfref.sipfref.local:5432 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
    SQL Error code: 0
    Use --verbose for detailed stacktrace.
    *** schemaTool failed ***
  stderr_lines: <omitted>
  stdout: |-
    Metastore connection URL:        jdbc:postgresql://master-03.applispfref.sipfref.local:5432/hive
    Metastore Connection Driver :    org.postgresql.Driver
    Metastore connection User:       hive
  stdout_lines: <omitted>

The role is played on master-03 and tries to reach postgres on the same node which leads to a "Connection refused". If I manually test this (sorry it's in french) I have the same error :

[root@PF9SOBDP047 data]# psql -h master-03.applispfref.sipfref.local -U rangeradmin -d ranger
psql: n'a pas pu se connecter au serveur : Connexion refusée
        Le serveur est-il actif sur l'hôte « master-03.applispfref.sipfref.local » (10.132.149.104)
        et accepte-t-il les connexionsTCP/IP sur le port 5432 ?

But when I try with localhost it works fine :

[root@PF9SOBDP047 data]# psql -h localhost -U rangeradmin -d ranger
Mot de passe pour l'utilisateur rangeradmin : 
psql (9.2.24)
Saisissez « help » pour l'aide.

ranger=> 

pinging works as intended :

[root@PF9SOBDP047 data]# ping master-03.applispfref.sipfref.local
PING master-03.applispfref.sipfref.local (10.132.149.104) 56(84) bytes of data.
64 bytes from master-03.applispfref.sipfref.local (10.132.149.104): icmp_seq=1 ttl=64 time=0.017 ms
64 bytes from master-03.applispfref.sipfref.local (10.132.149.104): icmp_seq=2 ttl=64 time=0.038 ms
64 bytes from master-03.applispfref.sipfref.local (10.132.149.104): icmp_seq=3 ttl=64 time=0.027 ms
^C
--- master-03.applispfref.sipfref.local ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.017/0.027/0.038/0.009 ms

Here is my pg_hba.conf file (replaced IPs with xx) :

#
# Ansible managed
#
# PostgreSQL Client Authentication Configuration File
# ===================================================
#
# See: https://www.postgresql.org/docs/current/static/auth-pg-hba-conf.html

local all postgres    peer 
local all all    peer 
host all all 127.0.0.1/32   md5 
host all all ::1/128   md5 
# BEGIN ANSIBLE MANAGED BLOCK master-02
host all all xx.xx.xx.xx/32 trust
host all all master-02.applispfref.sipfref.local trust
# END ANSIBLE MANAGED BLOCK master-02
# BEGIN ANSIBLE MANAGED BLOCK master-03
host all all xx.xx.xx.xx/32 trust
host all all master-03.applispfref.sipfref.local trust
# END ANSIBLE MANAGED BLOCK master-03

Why can I not interact with PostgreSQL using master-03.applispfref.sipfref.local but only with localhost (when on the same node) ?

jusstol commented 2 years ago

Well I've been struggling with this for the past 2 hours and had to submit this issue to find the fix on my own! There is this new boolean that fixed the issue when set to true : config_tdp_getting_started=true.