Altinity / clickhouse-backup

Tool for easy backup and restore for ClickHouse® using object storage for backup files.
https://altinity.com

API: /backup/list #902

Closed BoweFlex closed 6 months ago

BoweFlex commented 6 months ago

Trying to understand how /backup/list works. I have a cluster with three ClickHouse servers, each of which is running clickhouse-backup as a systemd service with the REST API server. They are configured with the same service file and environment variables pointing them to S3. I took a remote backup on the first replica (clickhouse101), and when I list the backups there, whether by querying system.backup_list, running clickhouse-backup list, or calling the API endpoint, I see the new backup. However, on the other two nodes (clickhouse102 and 103) I do not see it. Should the second and third nodes be seeing that backup in S3, or am I misunderstanding?

Slach commented 6 months ago

Which command exactly did you run to create the backup? Look at the "location" column on the first replica.

create <backup_name> creates a backup locally
upload <backup_name> copies the local backup to remote storage

create_remote --delete-source <backup-name> creates the backup, uploads it to remote storage, and removes the local backup files during the upload
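
A minimal sketch of the two workflows described above, assuming the clickhouse-backup CLI is on PATH. The DRY_RUN switch, the helper function names, and the backup name are illustrative assumptions, not part of clickhouse-backup. By default the helpers only print the commands, so they can be reviewed before anything runs.

```shell
#!/bin/sh
# Sketch only: print (or run) the two ways of getting a backup into remote storage.
# DRY_RUN and the function names are hypothetical helpers for this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Dry run: show the command instead of executing it.
        echo "$*"
    else
        "$@"
    fi
}

# Two-step workflow: create the backup locally, then upload it to remote storage.
backup_two_step() {
    run clickhouse-backup create "$1" &&
    run clickhouse-backup upload "$1"
}

# One-step workflow: create, upload, and remove the local copy during the upload.
backup_one_step() {
    run clickhouse-backup create_remote --delete-source "$1"
}
```

After the two-step workflow the backup should be listed with both a local and a remote location on the node that created it, until the local copy is deleted. Other nodes would only ever be expected to show the remote entry.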

BoweFlex commented 6 months ago

It was created and then uploaded using the following script:

CLICKHOUSE_SERVICES=$(echo "$CLICKHOUSE_SERVICES" | tr "," " ");
BACKUP_DATE=$(date +%Y-%m-%d-%H-%M-%S);
declare -A BACKUP_NAMES;
declare -A DIFF_FROM;
if [[ "" != "$BACKUP_PASSWORD" ]]; then
        BACKUP_PASSWORD="--password=$BACKUP_PASSWORD";
fi;
for SERVER in $CLICKHOUSE_SERVICES; do
    if [[ "1" == "$MAKE_INCREMENT_BACKUP" ]]; then
        LAST_FULL_BACKUP=$(clickhouse-client -q "SELECT name FROM system.backup_list WHERE location='remote' AND name LIKE '%${SERVER}%' AND name LIKE '%full%' AND desc NOT LIKE 'broken%' ORDER BY created DESC LIMIT 1 FORMAT TabSeparatedRaw" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD");
        TODAY_FULL_BACKUP=$(clickhouse-client -q "SELECT name FROM system.backup_list WHERE location='remote' AND name LIKE '%${SERVER}%' AND name LIKE '%full%' AND desc NOT LIKE 'broken%' AND toDate(created) = today() ORDER BY created DESC LIMIT 1 FORMAT TabSeparatedRaw" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD")
        PREV_BACKUP_NAME=$(clickhouse-client -q "SELECT name FROM system.backup_list WHERE location='remote' AND desc NOT LIKE 'broken%' ORDER BY created DESC LIMIT 1 FORMAT TabSeparatedRaw" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD");
        DIFF_FROM[$SERVER]="";
        if [[ ("$FULL_BACKUP_WEEKDAY" == "$(date +%u)" && "" == "$TODAY_FULL_BACKUP") || "" == "$PREV_BACKUP_NAME" || "" == "$LAST_FULL_BACKUP" ]]; then
            BACKUP_NAMES[$SERVER]="full-$BACKUP_DATE";
        else
            BACKUP_NAMES[$SERVER]="increment-$BACKUP_DATE";
            DIFF_FROM[$SERVER]="--diff-from-remote=$PREV_BACKUP_NAME";
        fi
    else
        BACKUP_NAMES[$SERVER]="full-$BACKUP_DATE";
    fi;
    echo "set backup name on $SERVER = ${BACKUP_NAMES[$SERVER]}";
done;
for SERVER in $CLICKHOUSE_SERVICES; do
    echo "create ${BACKUP_NAMES[$SERVER]} on $SERVER";
    clickhouse-client --echo -mn -q "INSERT INTO system.backup_actions(command) VALUES('create ${SERVER}-${BACKUP_NAMES[$SERVER]}')" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD";
done;
for SERVER in $CLICKHOUSE_SERVICES; do
    while [[ "in progress" == $(clickhouse-client -mn -q "SELECT status FROM system.backup_actions WHERE command='create ${SERVER}-${BACKUP_NAMES[$SERVER]}' FORMAT TabSeparatedRaw" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD") ]]; do
        echo "still in progress ${BACKUP_NAMES[$SERVER]} on $SERVER";
        sleep 1;
    done;
    if [[ "success" != $(clickhouse-client -mn -q "SELECT status FROM system.backup_actions WHERE command='create ${SERVER}-${BACKUP_NAMES[$SERVER]}' FORMAT TabSeparatedRaw" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD") ]]; then
        echo "error create ${BACKUP_NAMES[$SERVER]} on $SERVER";
        clickhouse-client -mn --echo -q "SELECT status,error FROM system.backup_actions WHERE command='create ${SERVER}-${BACKUP_NAMES[$SERVER]}'" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD";
        exit 1;
    fi;
done;
for SERVER in $CLICKHOUSE_SERVICES; do
    echo "upload ${DIFF_FROM[$SERVER]} ${BACKUP_NAMES[$SERVER]} on $SERVER";
    clickhouse-client --echo -mn -q "INSERT INTO system.backup_actions(command) VALUES('upload ${DIFF_FROM[$SERVER]} ${SERVER}-${BACKUP_NAMES[$SERVER]}')" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD";
done;
for SERVER in $CLICKHOUSE_SERVICES; do
    while [[ "in progress" == $(clickhouse-client -mn -q "SELECT status FROM system.backup_actions WHERE command='upload ${DIFF_FROM[$SERVER]} ${SERVER}-${BACKUP_NAMES[$SERVER]}'" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD") ]]; do
        echo "upload still in progress ${BACKUP_NAMES[$SERVER]} on $SERVER";
        sleep 5;
    done;
    if [[ "success" != $(clickhouse-client -mn -q "SELECT status FROM system.backup_actions WHERE command='upload ${DIFF_FROM[$SERVER]} ${SERVER}-${BACKUP_NAMES[$SERVER]}'" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD") ]]; then
        echo "error ${BACKUP_NAMES[$SERVER]} on $SERVER";
        clickhouse-client -mn --echo -q "SELECT status,error FROM system.backup_actions WHERE command='upload ${DIFF_FROM[$SERVER]} ${SERVER}-${BACKUP_NAMES[$SERVER]}'" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD";
        exit 1;
    fi;
    clickhouse-client --echo -mn -q "INSERT INTO system.backup_actions(command) VALUES('delete local ${SERVER}-${BACKUP_NAMES[$SERVER]}')" --host="$SERVER" --port="$CLICKHOUSE_PORT" --user="$BACKUP_USER" "$BACKUP_PASSWORD";
done;
echo "BACKUP CREATED"

And I've confirmed I can see that backup in our s3 storage.

Slach commented 6 months ago

Could you share the output of SELECT hostName(), * FROM clusterAllReplicas('your-cluster-name',system.backup_actions)?

BoweFlex commented 6 months ago

I'm assuming this error is related:

clickhouse :) SELECT hostName(), * FROM clusterAllReplicas('featbit_ch_cluster',system.backup_actions)

SELECT
    hostName(),
    *
FROM clusterAllReplicas('featbit_ch_cluster', system.backup_actions)

Query id: 61fbe28a-912a-4797-a836-d1aa766387c3

    ┌─hostName()──────────────────────┬─command───────────────────────────────────────────────────────────────┬───────────────start─┬──────────────finish─┬─status──┬─error─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
 1. │ l-clickhouse101.wdn.clarkinc.io │ list local                                                            │ 2024-04-23 13:36:45 │ 2024-04-23 13:36:45 │ success │                                                                                                                                                                                                                                                   │
 2. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-23 14:29:05 │ 2024-04-23 14:29:05 │ success │                                                                                                                                                                                                                                                   │
 3. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-23 15:06:42 │ 2024-04-23 15:06:42 │ success │                                                                                                                                                                                                                                                   │
 4. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-23 15:23:25 │ 2024-04-23 15:23:25 │ success │                                                                                                                                                                                                                                                   │
 5. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-23 15:49:32 │ 2024-04-23 15:49:33 │ success │                                                                                                                                                                                                                                                   │
 6. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-23 16:20:31 │ 2024-04-23 16:20:32 │ success │                                                                                                                                                                                                                                                   │
 7. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-24 14:51:46 │ 2024-04-24 14:51:46 │ success │                                                                                                                                                                                                                                                   │
 8. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 09:12:15 │ 2024-04-26 09:12:15 │ success │                                                                                                                                                                                                                                                   │
 9. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 11:33:12 │ 2024-04-26 11:33:12 │ success │                                                                                                                                                                                                                                                   │
10. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 11:33:12 │ 2024-04-26 11:33:12 │ success │                                                                                                                                                                                                                                                   │
11. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 11:33:13 │ 2024-04-26 11:33:13 │ success │                                                                                                                                                                                                                                                   │
12. │ l-clickhouse101.wdn.clarkinc.io │ create l-clickhouse101.wdn.clarkinc.io-full-2024-04-26-15-33-11       │ 2024-04-26 11:33:13 │ 2024-04-26 11:33:14 │ error   │ one of createBackupLocal go-routine return error: can't freeze table: code: 497, message: backup: Not enough privileges. To execute this query, it's necessary to have the grant ALTER FREEZE PARTITION ON featbit.infi_clickhouse_orm_migrations │
13. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 12:11:33 │ 2024-04-26 12:11:34 │ success │                                                                                                                                                                                                                                                   │
14. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 12:11:34 │ 2024-04-26 12:11:34 │ success │                                                                                                                                                                                                                                                   │
15. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 12:11:34 │ 2024-04-26 12:11:34 │ success │                                                                                                                                                                                                                                                   │
16. │ l-clickhouse101.wdn.clarkinc.io │ create l-clickhouse101.wdn.clarkinc.io-full-2024-04-26-16-11-33       │ 2024-04-26 12:11:34 │ 2024-04-26 12:11:34 │ success │                                                                                                                                                                                                                                                   │
17. │ l-clickhouse101.wdn.clarkinc.io │ upload  l-clickhouse101.wdn.clarkinc.io-full-2024-04-26-16-11-33      │ 2024-04-26 12:11:36 │ 2024-04-26 12:11:36 │ success │                                                                                                                                                                                                                                                   │
18. │ l-clickhouse101.wdn.clarkinc.io │ delete local l-clickhouse101.wdn.clarkinc.io-full-2024-04-26-16-11-33 │ 2024-04-26 12:11:42 │ 2024-04-26 12:11:42 │ success │                                                                                                                                                                                                                                                   │
19. │ l-clickhouse101.wdn.clarkinc.io │ list                                                                  │ 2024-04-26 12:13:19 │ 2024-04-26 12:13:19 │ success │                                                                                                                                                                                                                                                   │
20. │ l-clickhouse101.wdn.clarkinc.io │ create                                                                │ 2024-04-26 13:06:01 │ 2024-04-26 13:06:01 │ success │                                                                                                                                                                                                                                                   │
    └─────────────────────────────────┴───────────────────────────────────────────────────────────────────────┴─────────────────────┴─────────────────────┴─────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
→ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.)
20 rows in set. Elapsed: 0.020 sec.

Received exception from server (version 24.3.2):
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: Received from l-clickhouse102.wdn.clarkinc.io:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

BoweFlex commented 6 months ago

I have assigned the default user to be readonly, would that cause this?

Slach commented 6 months ago

Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: Received from l-clickhouse102.wdn.clarkinc.io:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

It means your ClickHouse nodes can't connect to each other for distributed queries. It should not affect backups, but it is better to fix it.

do you use inside ?

Slach commented 6 months ago

Does SELECT * FROM system.backup_list on clickhouse102 contain l-clickhouse101.wdn.clarkinc.io-full-2024-04-26-16-11-33?
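
One way to run this check on every node at once, as a rough sketch: it asks each host's system.backup_list whether a given remote backup is visible. The CH_CLIENT override, the function names, and the host arguments are assumptions for illustration. On a secured cluster, connection flags such as --port, --user, and --password would need to be added to the client call.

```shell
# Sketch: check from each node whether a remote backup with this name is listed.
# CH_CLIENT can be overridden; it defaults to the regular clickhouse-client binary.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

remote_backup_count() {
    # $1 = host, $2 = backup name
    "$CH_CLIENT" --host="$1" -q "SELECT count() FROM system.backup_list WHERE location='remote' AND name='$2' FORMAT TabSeparatedRaw"
}

report_backup_visibility() {
    # $1 = backup name, remaining arguments = hosts to check
    name="$1"; shift
    for host in "$@"; do
        count="$(remote_backup_count "$host" "$name")"
        # An empty result (for example a failed connection) is reported as missing.
        if [ -n "$count" ] && [ "$count" != "0" ]; then
            echo "$host: visible"
        else
            echo "$host: missing"
        fi
    done
}
```

For example, report_backup_visibility with the backup name followed by the three host names prints one line per host, which makes it easy to spot a node that does not see the remote entry.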

BoweFlex commented 6 months ago

I get a similar DB::Exception from running SELECT * FROM system.backup_list. I still get that error after updating our users.yml to not modify the default user to be readonly, so it should have standard permissions.

Our remote_servers is defined in yaml, and is:

remote_servers:
  featbit_ch_cluster:
    shard:
      replica:
        host: l-clickhouse101.wdn.clarkinc.io
        port: 9000
    shard:
      replica:
        host: l-clickhouse102.wdn.clarkinc.io
        port: 9000
    shard:
      replica:
        host: l-clickhouse103.wdn.clarkinc.io
        port: 9000

Slach commented 6 months ago

The configuration fragment you shared means distributed queries connect as the default user without a password.

Try changing your config:

remote_servers:
  featbit_ch_cluster:
    secret: just_string 

In this case, distributed queries pass the security context of the initial user, i.e. the user who ran the query on the initiating node.

BoweFlex commented 6 months ago

I'm not sure if I'm doing something wrong - I added that secret: line to the config for all three servers and restarted all of them, but am still receiving that exception when I query system.backup_list

Slach commented 6 months ago

How did you run clickhouse-client? Just clickhouse-client, without --user and --password?

BoweFlex commented 6 months ago

No, I ran it with a user and password I've defined in users.d/users.yml. I am unable to sign in with just clickhouse-client for some reason:

[jbowe@l-clickhouse102 ~](DEV)$ clickhouse-client
ClickHouse client version 24.3.2.23 (official build).
Connecting to localhost:9000 as user default.
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED)

But I have confirmed that the default user should currently be set up as normal, and I have no default-password.xml defined.

Slach commented 6 months ago

ok. looks like you hardened your default user

could you share from clickhouse102

grep default -C 20 /var/lib/clickhouse/preprocessed_configs/users.xml

without sensitive data?

BoweFlex commented 6 months ago

Here are the results from that command, minus the password hashes:

<!-- This file was generated automatically.
     Do not edit it: it is likely to be discarded and generated again before it's read next time.
     Files used to generate this file:
       /etc/clickhouse-server/users.xml
       /etc/clickhouse-server/users.d/roles.yml
       /etc/clickhouse-server/users.d/users.yml      -->

<clickhouse>
    <!-- See also the files in users.d directory where the settings can be overridden. -->

    <!-- Profiles of settings. -->
    <profiles>
        <!-- Default settings. -->
        <default>
        </default>

        <!-- Profile that allows only read queries. -->
        <readonly>
            <readonly>1</readonly>
        </readonly>
    </profiles>

    <!-- Users and ACL. -->
    <users>
        <!-- If user name was not specified, 'default' user is used. -->
        <default>
            <!-- See also the files in users.d directory where the password can be overridden.

                 Password could be specified in plaintext or in SHA256 (in hex format).

                 If you want to specify password in plaintext (not recommended), place it in 'password' element.
                 Example: <password>qwerty</password>.
                 Password could be empty.

                 If you want to specify SHA256, place it in 'password_sha256_hex' element.
                 Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
                 Restrictions of SHA256: impossibility to connect to ClickHouse using MySQL JS client (as of July 2019).

                 If you want to specify double SHA1, place it in 'password_double_sha1_hex' element.
                 Example: <password_double_sha1_hex>e395796d6546b1b65db9d665cd43f0e858dd4303</password_double_sha1_hex>

                 If you want to specify a previously defined LDAP server (see 'ldap_servers' in the main config) for authentication,
                  place its name in 'server' element inside 'ldap' element.
                 Example: <ldap><server>my_ldap_server</server></ldap>

                 If you want to authenticate the user via Kerberos (assuming Kerberos is enabled, see 'kerberos' in the main config),
--
                 To open access only from localhost, specify:
                    <ip>::1</ip>
                    <ip>127.0.0.1</ip>

                 Each element of list has one of the following forms:
                 <ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 10.0.0.1/255.255.255.0
                     2a02:6b8::3 or 2a02:6b8::3/64 or 2a02:6b8::3/ffff:ffff:ffff:ffff::.
                 <host> Hostname. Example: server01.clickhouse.com.
                     To check access, DNS query is performed, and all received addresses compared to peer address.
                 <host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.clickhouse\.com$
                     To check access, DNS PTR query is performed for peer address and then regexp is applied.
                     Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
                     Strongly recommended that regexp is ends with $
                 All results of DNS requests are cached till server restart.
            -->
            <networks>
                <ip>::/0</ip>
            </networks>

            <!-- Settings profile for user. -->
            <profile>default</profile>

            <!-- Quota for user. -->
            <quota>default</quota>

            <!-- User can create other users and grant rights to them. -->
            <access_management>1</access_management>

            <!-- User can manipulate named collections. -->
            <named_collection_control>1</named_collection_control>

            <!-- User permissions can be granted here -->
            <!--
            <grants>
                <query>GRANT ALL ON *.*</query>
            </grants>
            -->
        </default>
    <clickhouse-admin>
            <password_sha256_hex></password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <quota>default</quota>
            <grants>
                <query>grant sysadmin</query>
            </grants>
        </clickhouse-admin>
        <featbit>
            <password_sha256_hex></password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <quota>default</quota>
            <grants>
                <query>grant featbit_admin</query>
            </grants>
        </featbit>
        <backup>
            <password_sha256_hex></password_sha256_hex>
            <networks>
                <ip>::/0</ip>
            </networks>
            <quota>default</quota>
            <grants>
                <query>grant backup</query>
            </grants>
        </backup>
    </users>

    <!-- Quotas. -->
    <quotas>
        <!-- Name of quota. -->
        <default>
            <!-- Limits for time interval. You could specify many intervals with different limits. -->
            <interval>
                <!-- Length of interval. -->
                <duration>3600</duration>

                <!-- No limits. Just calculate resource usage for time interval. -->
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
<roles>
        <sysadmin>
            <profile>default</profile>
            <grants>
                <query>GRANT ALL ON *.*</query>
            </grants>
        </sysadmin>
        <backup>
            <profile>default</profile>
            <grants>
                <query>GRANT CREATE TABLE, CREATE DATABASE, INSERT, BACKUP, URL, ALTER FREEZE PARTITION ON *.*</query>
                <query>GRANT ALTER, CREATE, DROP, SELECT ON system.*</query>
            </grants>
        </backup>
        <featbit_admin>
            <grants>
                <query>GRANT SELECT, INSERT, ALTER, CREATE, DROP, TRUNCATE, OPTIMIZE, SHOW, dictGet ON featbit.*</query>
                <query>GRANT SOURCES ON *.*</query>
            </grants>
        </featbit_admin>
        <featbit_readwrite>
            <grants>
                <query>GRANT SELECT, INSERT, DROP, SHOW, dictGet ON featbit.*</query>
            </grants>
        </featbit_readwrite>
        <featbit_readonly>
            <grants>
                <query>GRANT SELECT, SHOW, dictGet ON featbit.*</query>
            </grants>

Slach commented 6 months ago

::/0

Do you have an IPv6 cluster?

nslookup l-clickhouse101.wdn.clarkinc.io
nslookup l-clickhouse102.wdn.clarkinc.io
nslookup l-clickhouse103.wdn.clarkinc.io

Do these DNS names resolve to public or private IPs?

BoweFlex commented 6 months ago

These DNS names correspond with private IPs.

IPv6 is probably enabled but unused as far as I know

Slach commented 6 months ago

Let's add /etc/clickhouse-server/users.d/default_ip_private.xml

with the following content:

<clickhouse>
<users><default>
  <networks>
    <!-- your private IP range in CIDR format -->
    <ip>X.X.X.X/X</ip>
  </networks>
</default></users>
</clickhouse>

BoweFlex commented 6 months ago

I added that file, and still receive the same AUTHENTICATION_FAILED message

Slach commented 6 months ago

did you add this config on all 3 hosts?

Slach commented 6 months ago

could you clarify, yes or no?

BoweFlex commented 6 months ago

Yes, that file was added on all three hosts and then the clickhouse-server service was restarted

Slach commented 6 months ago

could you share result of? grep -C 10 "<ip>" -r /var/lib/clickhouse/prerpocessed_configs/

without sensitive data?

BoweFlex commented 6 months ago

I tried running that command on each of the three servers three times, once for each of their IP addresses. I didn't receive any results from any of the commands.

Slach commented 6 months ago

sorry typo preprocessed instead of prerpocessed

grep -C 10 "<ip>" -r /var/lib/clickhouse/preprocessed_configs/

BoweFlex commented 6 months ago

I did notice that and fixed the typo in the command I ran

Slach commented 6 months ago

And what is the result? Empty? Are you sure you ran the command in the right place?

Could you share the result of SELECT * FROM system.disks?

BoweFlex commented 6 months ago

Yes I get empty results:

[jbowe@l-clickhouse102 ~](DEV)$ sudo grep -C 10 "<103 ip>" -r /var/lib/clickhouse/preprocessed_configs/
[jbowe@l-clickhouse102 ~](DEV)$ sudo grep -C 10 "<102 ip>" -r /var/lib/clickhouse/preprocessed_configs/
[jbowe@l-clickhouse102 ~](DEV)$ sudo grep -C 10 "<101 ip>" -r /var/lib/clickhouse/preprocessed_configs/
[jbowe@l-clickhouse102 ~](DEV)$

The results from SELECT * FROM system.disks are pretty much identical from each server:

SELECT *
FROM system.disks
FORMAT vertical

Query id: a0a1e1e8-e651-4b2c-be94-2be325a48443

Row 1:
──────
name:                default
path:                /var/lib/clickhouse/
free_space:          90396631040
total_space:         105752821760
unreserved_space:    90396631040
keep_free_space:     0
type:                Local
object_storage_type: None
metadata_type:       None
is_encrypted:        0
is_read_only:        0
is_write_once:       0
is_remote:           0
is_broken:           0
cache_path:

1 row in set. Elapsed: 0.002 sec.

Slach commented 6 months ago

=)))

I just want to see your XML tags named <ip>. The <ip> in the command is literal, not a placeholder to substitute.

So run it as is: grep -C 10 "<ip>" -r /var/lib/clickhouse/preprocessed_configs/

and compare results from all 3 nodes

and share SELECT * FROM system.clusters
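
To compare the output from all three nodes without reading it by eye, something like the following sketch could help. It assumes the grep output from each node has already been saved to a file and copied to one machine. The function name and the file names are hypothetical.

```shell
# Sketch: report whether all given files are byte-identical to the first one.
all_identical() {
    first="$1"; shift
    for f in "$@"; do
        # cmp -s is silent and returns non-zero when the files differ.
        if ! cmp -s "$first" "$f"; then
            echo "differs: $f"
            return 1
        fi
    done
    echo "identical"
}
```

For example, after redirecting the grep output on each node to a file named after the host, all_identical 101.txt 102.txt 103.txt prints "identical" or names the first file that differs from the first one.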

BoweFlex commented 6 months ago

Sorry about that. Results from all three servers are identical, and are as follows:

/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                 How to generate double SHA1:
/var/lib/clickhouse/preprocessed_configs/users.xml-                 Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha1sum | tr -d '-' | xxd -r -p | sha1sum | tr -d '-'
/var/lib/clickhouse/preprocessed_configs/users.xml-                 In first line will be password and in second - corresponding double SHA1.
/var/lib/clickhouse/preprocessed_configs/users.xml-            -->
/var/lib/clickhouse/preprocessed_configs/users.xml-            <password/>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-            <!-- List of networks with open access.
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                 To open access from everywhere, specify:
/var/lib/clickhouse/preprocessed_configs/users.xml:                    <ip>::/0</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                 To open access only from localhost, specify:
/var/lib/clickhouse/preprocessed_configs/users.xml:                    <ip>::1</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml:                    <ip>127.0.0.1</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                 Each element of list has one of the following forms:
/var/lib/clickhouse/preprocessed_configs/users.xml:                 <ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 10.0.0.1/255.255.255.0
/var/lib/clickhouse/preprocessed_configs/users.xml-                     2a02:6b8::3 or 2a02:6b8::3/64 or 2a02:6b8::3/ffff:ffff:ffff:ffff::.
/var/lib/clickhouse/preprocessed_configs/users.xml-                 <host> Hostname. Example: server01.clickhouse.com.
/var/lib/clickhouse/preprocessed_configs/users.xml-                     To check access, DNS query is performed, and all received addresses compared to peer address.
/var/lib/clickhouse/preprocessed_configs/users.xml-                 <host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.clickhouse\.com$
/var/lib/clickhouse/preprocessed_configs/users.xml-                     To check access, DNS PTR query is performed for peer address and then regexp is applied.
/var/lib/clickhouse/preprocessed_configs/users.xml-                     Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
/var/lib/clickhouse/preprocessed_configs/users.xml-                     Strongly recommended that regexp is ends with $
/var/lib/clickhouse/preprocessed_configs/users.xml-                 All results of DNS requests are cached till server restart.
/var/lib/clickhouse/preprocessed_configs/users.xml-            -->
/var/lib/clickhouse/preprocessed_configs/users.xml-            <networks>
/var/lib/clickhouse/preprocessed_configs/users.xml:                <ip>X.X.X.0/24</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                           <!-- your private IP range in CIDR format -->
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-                             </networks>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-            <!-- Settings profile for user. -->
/var/lib/clickhouse/preprocessed_configs/users.xml-            <profile>default</profile>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-            <!-- Quota for user. -->
/var/lib/clickhouse/preprocessed_configs/users.xml-            <quota>default</quota>
--
/var/lib/clickhouse/preprocessed_configs/users.xml-            <grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-                <query>GRANT ALL ON *.*</query>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-            -->
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-             </default>
/var/lib/clickhouse/preprocessed_configs/users.xml-    <clickhouse-admin>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <password_sha256_hex></password_sha256_hex>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <networks>
/var/lib/clickhouse/preprocessed_configs/users.xml:                <ip>::/0</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </networks>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <quota>default</quota>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-                <query>grant sysadmin</query>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-        </clickhouse-admin>
/var/lib/clickhouse/preprocessed_configs/users.xml-        <featbit>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <password_sha256_hex></password_sha256_hex>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <networks>
/var/lib/clickhouse/preprocessed_configs/users.xml:                <ip>::/0</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </networks>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <quota>default</quota>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-                <query>grant featbit_admin</query>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-        </featbit>
/var/lib/clickhouse/preprocessed_configs/users.xml-        <backup>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <password_sha256_hex></password_sha256_hex>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <networks>
/var/lib/clickhouse/preprocessed_configs/users.xml:                <ip>::/0</ip>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </networks>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <quota>default</quota>
/var/lib/clickhouse/preprocessed_configs/users.xml-            <grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-                <query>grant backup</query>
/var/lib/clickhouse/preprocessed_configs/users.xml-            </grants>
/var/lib/clickhouse/preprocessed_configs/users.xml-        </backup>
/var/lib/clickhouse/preprocessed_configs/users.xml-    </users>
/var/lib/clickhouse/preprocessed_configs/users.xml-
/var/lib/clickhouse/preprocessed_configs/users.xml-    <!-- Quotas. -->
/var/lib/clickhouse/preprocessed_configs/users.xml-    <quotas>

Results from SELECT * FROM system.clusters:

SELECT *
FROM system.clusters
FORMAT vertical

Query id: 44287716-be2c-4f3f-84ed-6b3d4759c11d

Row 1:
──────
cluster:                 default
shard_num:               1
shard_weight:            1
internal_replication:    0
replica_num:             1
host_name:               localhost
host_address:            ::1
port:                    9000
is_local:                1
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ

Row 2:
──────
cluster:                 featbit_ch_cluster
shard_num:               1
shard_weight:            1
internal_replication:    0
replica_num:             1
host_name:               l-clickhouse101.wdn.clarkinc.io
host_address:            X.X.X.91
port:                    9000
is_local:                1
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ

Row 3:
──────
cluster:                 featbit_ch_cluster
shard_num:               2
shard_weight:            1
internal_replication:    0
replica_num:             1
host_name:               l-clickhouse102.wdn.clarkinc.io
host_address:            X.X.X.92
port:                    9000
is_local:                0
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ

Row 4:
──────
cluster:                 featbit_ch_cluster
shard_num:               3
shard_weight:            1
internal_replication:    0
replica_num:             1
host_name:               l-clickhouse103.wdn.clarkinc.io
host_address:            X.X.X.93
port:                    9000
is_local:                0
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ
Slach commented 6 months ago

/var/lib/clickhouse/preprocessed_configs/users.xml-            <networks>
/var/lib/clickhouse/preprocessed_configs/users.xml:                <ip>X.X.X.0/24</ip>

Did you run clickhouse-client on one of the hosts l-clickhouse101, l-clickhouse102, or l-clickhouse103?

Could you run the following and share the results?

ssh your-user-for-connect@l-clickhouse101.wdn.clarkinc.io "nslookup l-clickhouse101.wdn.clarkinc.io; nslookup l-clickhouse102.wdn.clarkinc.io; nslookup l-clickhouse103.wdn.clarkinc.io"

ssh your-user-for-connect@l-clickhouse103.wdn.clarkinc.io "nslookup l-clickhouse101.wdn.clarkinc.io; nslookup l-clickhouse102.wdn.clarkinc.io; nslookup l-clickhouse103.wdn.clarkinc.io"

ssh your-user-for-connect@l-clickhouse102.wdn.clarkinc.io "nslookup l-clickhouse101.wdn.clarkinc.io; nslookup l-clickhouse102.wdn.clarkinc.io; nslookup l-clickhouse103.wdn.clarkinc.io"

and

ssh your-user-for-connect@l-clickhouse101.wdn.clarkinc.io "clickhouse-client -h l-clickhouse102.wdn.clarkinc.io -q 'SELECT version()'"

ssh your-user-for-connect@l-clickhouse101.wdn.clarkinc.io "clickhouse-client -h l-clickhouse103.wdn.clarkinc.io -q 'SELECT version()'"
Slach commented 6 months ago
user:                    default
is_active:               ᴺᵁᴸᴸ

looks weird

BoweFlex commented 6 months ago
ssh -q l-clickhouse101.wdn.clarkinc.io bash -c "clickhouse-client -h l-clickhouse102.wdn.clarkinc.io -q 'SELECT version()'"
ssh -q l-clickhouse101.wdn.clarkinc.io bash -c "clickhouse-client -h l-clickhouse103.wdn.clarkinc.io -q 'SELECT version()'"
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED)

Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED)

I ran those nslookup commands as well. I'd rather not share a list of IPs and name servers, but the results match on all three hosts.
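A side note on the output above: both errors read `Received from localhost:9000` even though the commands pass `-h` with a remote host name. A likely explanation (an inference from how OpenSSH builds the remote command, not something confirmed in this thread) is that ssh joins its arguments with spaces before handing them to the remote shell. In that case `bash -c` receives only the first word as its script, and `clickhouse-client` runs without any options against localhost. The sketch below emulates that joining locally with a hypothetical `emulate_ssh` helper, so it needs neither ssh nor ClickHouse.

```shell
#!/bin/sh
# Hypothetical helper: mimics ssh by joining all arguments with spaces and
# passing the resulting single string to a shell, as the remote side would.
emulate_ssh() {
    sh -c "$*"
}

# Broken form: after joining, the shell sees `sh -c echo one two`.
# Only `echo` is the script; `one` and `two` become $0 and $1 and are ignored.
broken=$(emulate_ssh sh -c "echo one two")

# Working form: the whole command stays one string for the remote shell.
working=$(emulate_ssh "echo one two")

printf 'broken=[%s]\n' "$broken"
printf 'working=[%s]\n' "$working"
```

Passing the remote command as one quoted string, without the extra `bash -c`, avoids the problem.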

Slach commented 6 months ago

You need to ensure that when you execute clickhouse-client, the source IP of the TCP connection is inside the allowed CIDR <ip>X.X.X.0/24</ip>.

Look at /var/log/clickhouse-server/clickhouse-server.err.log on l-clickhouse102 and try to figure out why the default user can't connect.
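To make the CIDR check concrete, here is a minimal sketch in POSIX shell that tests whether an IPv4 address falls inside a network range. The function names `ip_to_int` and `ip_in_cidr` are made up for this example, and `192.0.2.0/24` is a documentation range standing in for the redacted `X.X.X.0/24`. It only does the arithmetic. It does not read the ClickHouse configuration.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Exit status 0 if IPv4 address $1 lies inside CIDR block $2, else 1.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")     # part before the slash
    bits=${2#*/}                   # prefix length after the slash
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Placeholder addresses, not the real ones from this cluster.
for candidate in 192.0.2.91 127.0.0.1; do
    if ip_in_cidr "$candidate" 192.0.2.0/24; then
        echo "$candidate is inside 192.0.2.0/24"
    else
        echo "$candidate is outside 192.0.2.0/24"
    fi
done
```

A loopback source address such as 127.0.0.1 falls outside a private /24 like this one, so it is worth checking which source address the server actually logs for the failing connection.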

BoweFlex commented 6 months ago

I'm still not sure why the default account is unable to authenticate. I've been trying different settings and looking through the preprocessed users.xml: it has a default user with the tag <password/>, and I don't have a default-password.xml anywhere, but I keep receiving this error.

ClickHouse client version 24.3.2.23 (official build).
Connecting to l-clickhouse102.wdn.clarkinc.io:9000 as user default.
Code: 516. DB::Exception: Received from l-clickhouse102.wdn.clarkinc.io:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED)

I've confirmed it's not related to networking, as I can use the clickhouse-admin user without a problem from clickhouse101. I don't see much in clickhouse-server.err.log besides that error:

2024.05.01 15:54:12.779775 [ 107997 ] {} <Error> Access(user directories): from: <clickhouse101 ip>, user: default: Authentication failed: Code: 193. DB::Exception: Invalid credentials. (WRONG_PASSWORD), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000cbcedbb
1. DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x0000000007668f23
2. DB::IAccessStorage::throwInvalidCredentials() @ 0x000000000fbb5098
3. DB::IAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fbb4ca1
4. DB::MultipleAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fbebbba
5. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000fb27ac5
6. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ff2083
7. DB::TCPHandler::receiveHello() @ 0x000000001235f22a
8. DB::TCPHandler::runImpl() @ 0x0000000012350555
9. DB::TCPHandler::run() @ 0x000000001236d099
10. Poco::Net::TCPServerConnection::start() @ 0x0000000014c9bef2
11. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c9cd39
12. Poco::PooledThread::run() @ 0x0000000014d954a1
13. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d93a3d
14. ? @ 0x00007f9959e9f802
15. ? @ 0x00007f9959e3f450
 (version 24.3.2.23 (official build))
2024.05.01 15:54:12.782820 [ 107997 ] {} <Error> ServerErrorHandler: Code: 516. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000cbcedbb
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000774d36c
2. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000fb27f8b
3. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ff2083
4. DB::TCPHandler::receiveHello() @ 0x000000001235f22a
5. DB::TCPHandler::runImpl() @ 0x0000000012350555
6. DB::TCPHandler::run() @ 0x000000001236d099
7. Poco::Net::TCPServerConnection::start() @ 0x0000000014c9bef2
8. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c9cd39
9. Poco::PooledThread::run() @ 0x0000000014d954a1
10. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d93a3d
11. ? @ 0x00007f9959e9f802
12. ? @ 0x00007f9959e3f450
 (version 24.3.2.23 (official build))
Slach commented 6 months ago

OK. <password/> means an empty password.

Do you remember whether you changed the password for the default user?

check grep -C 20 -E "password|host_regexp" -r /var/lib/clickhouse/preprocessed_configs/

Maybe there is another directive where you set up a password for the default user?

Could you check the reverse lookup (nslookup <clickhouse101 ip>) from inside the machine?

BoweFlex commented 6 months ago

Sorry for the delayed response. I ended up having these servers completely rebuilt, then reinstalled and reconfigured clickhouse-server and clickhouse-backup to see if that fixed the problem. I'm seeing similar issues after the fresh install and configuration. In addition to the error when connecting to another node (e.g. running clickhouse-client -h l-clickhouse101.wdn.clarkinc.io from clickhouse102 or 103), I now also receive the same error when running sudo clickhouse-client on any of the three nodes. Plain clickhouse-client without sudo connects just fine on all three.

I've confirmed that default still does not have a password set in the preprocessed config, and the connection info looks the same when it fails as when it succeeds, other than throwing an error. Example success:

[jbowe@l-clickhouse103 ~](DEV)$ clickhouse-client
ClickHouse client version 24.4.1.2088 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 24.4.1.

Warnings:
 * Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
 * Delay accounting is not enabled, OSIOWaitMicroseconds will not be gathered. You can enable it using `echo 1 > /proc/sys/kernel/task_delayacct` or by using sysctl.

clickhouse :) exit
Bye.

Example failure:

[jbowe@l-clickhouse103 ~](DEV)$ sudo clickhouse-client
ClickHouse client version 24.4.1.2088 (official build).
Connecting to localhost:9000 as user default.
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED)

Does this help at all, and any idea what I might be doing wrong?

Slach commented 6 months ago

This issue is not related to clickhouse-backup. Your clickhouse-server configuration has been changed on your side and looks wrong.

I don't understand how authentication passes when you run clickhouse-client as the jbowe user but fails when you run it with sudo.

Both should send the same authentication request.

Check:

curl -vvv "http://localhost:8123/?query=SELECT+version()"
BoweFlex commented 6 months ago

Ran on all three servers, and results appear to be the same. Also confirmed results are the same with or without sudo:

[jbowe@l-clickhouse103 ~](DEV)$ sudo curl -vvv "http://localhost:8443/?query=SELECT+version()"
*   Trying ::1:8443...
* connect to ::1 port 8443 failed: Connection refused
*   Trying 127.0.0.1:8443...
* Connected to localhost (127.0.0.1) port 8443 (#0)
> GET /?query=SELECT+version() HTTP/1.1
> Host: localhost:8443
> User-Agent: curl/7.76.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Tue, 14 May 2024 12:56:30 GMT
< Connection: Keep-Alive
< Content-Type: text/tab-separated-values; charset=UTF-8
< X-ClickHouse-Server-Display-Name: clickhouse
< Transfer-Encoding: chunked
< X-ClickHouse-Query-Id: 2bd546b2-d174-4a9e-942f-6e30b258a87e
< X-ClickHouse-Format: TabSeparated
< X-ClickHouse-Timezone: America/New_York
< Keep-Alive: timeout=10
< X-ClickHouse-Summary: {"read_rows":"1","read_bytes":"1","written_rows":"0","written_bytes":"0","total_rows_to_read":"0","result_rows":"0","result_bytes":"0","elapsed_ns":"999783"}
<
24.4.1.2088
* Connection #0 to host localhost left intact
Slach commented 6 months ago

Run sudo tcpdump -w /tmp/clickhouse.pcap port 9000 in one terminal and sudo clickhouse-client -q "SELECT version()" --verbose --send_logs_level=trace in a separate terminal.

If it fails, share the logs and /tmp/clickhouse.pcap.

BoweFlex commented 6 months ago

Still failed, I believe these are the relevant logs from clickhouse:

2024.05.14 09:10:53.668006 [ 125354 ] {} <Debug> TCPHandler: Connected ClickHouse client version 24.4.0, revision: 54467, user: default.
2024.05.14 09:10:53.668062 [ 125354 ] {} <Debug> TCP-Session: 41f38d19-3261-4fa2-90cd-523d622d3764 Authenticating user 'default' from 127.0.0.1:56040
2024.05.14 09:10:53.670423 [ 125354 ] {} <Error> Access(user directories): from: 127.0.0.1, user: default: Authentication failed: Code: 193. DB::Exception: Invalid credentials. (WRONG_PASSWORD), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c9a449b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000780b9ac
2. DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x0000000007819d8b
3. DB::IAccessStorage::throwInvalidCredentials() @ 0x000000000fa11158
4. DB::IAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fa10d61
5. DB::MultipleAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fa47c3a
6. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000f983323
7. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ea8b04
8. DB::TCPHandler::receiveHello() @ 0x00000000122b2ddb
9. DB::TCPHandler::runImpl() @ 0x00000000122a4375
10. DB::TCPHandler::run() @ 0x00000000122c1fb9
11. Poco::Net::TCPServerConnection::start() @ 0x0000000014c105b2
12. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c113f9
13. Poco::PooledThread::run() @ 0x0000000014d09a61
14. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d07ffd
15. ? @ 0x00007f2f70e9f802
16. ? @ 0x00007f2f70e3f450
 (version 24.4.1.2088 (official build))
2024.05.14 09:10:53.670575 [ 125354 ] {} <Debug> TCP-Session: 41f38d19-3261-4fa2-90cd-523d622d3764 Authentication failed with error: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

2024.05.14 09:10:53.670851 [ 125354 ] {} <Error> ServerErrorHandler: Code: 516. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c9a449b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000780b9ac
2. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000f983820
3. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ea8b04
4. DB::TCPHandler::receiveHello() @ 0x00000000122b2ddb
5. DB::TCPHandler::runImpl() @ 0x00000000122a4375
6. DB::TCPHandler::run() @ 0x00000000122c1fb9
7. Poco::Net::TCPServerConnection::start() @ 0x0000000014c105b2
8. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c113f9
9. Poco::PooledThread::run() @ 0x0000000014d09a61
10. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d07ffd
11. ? @ 0x00007f2f70e9f802
12. ? @ 0x00007f2f70e3f450
 (version 24.4.1.2088 (official build))

And I've placed the pcap file here.

Slach commented 6 months ago

Sorry, wrong tcpdump command.

Start sudo tcpdump -i lo -w /tmp/clickhouse.pcap port 9000, then run sudo clickhouse-client -q "SELECT version()" --verbose --send_logs_level=trace

Also share the output of the same command without sudo: clickhouse-client -q "SELECT version()" --verbose --send_logs_level=trace

BoweFlex commented 6 months ago

Now I'm confused, because I swear I was able to sign in without sudo before. They are both failing with the same message now:

2024.05.14 10:09:59.373988 [ 125354 ] {} <Debug> TCPHandler: Connected ClickHouse client version 24.4.0, revision: 54467, user: default.
2024.05.14 10:09:59.374105 [ 125354 ] {} <Debug> TCP-Session: 16ec77bf-2bb9-4bf8-82f7-8cd8a2d50fd0 Authenticating user 'default' from 127.0.0.1:47706
2024.05.14 10:09:59.374294 [ 125354 ] {} <Error> Access(user directories): from: 127.0.0.1, user: default: Authentication failed: Code: 193. DB::Exception: Invalid credentials. (WRONG_PASSWORD), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c9a449b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000780b9ac
2. DB::Exception::Exception<>(int, FormatStringHelperImpl<>) @ 0x0000000007819d8b
3. DB::IAccessStorage::throwInvalidCredentials() @ 0x000000000fa11158
4. DB::IAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fa10d61
5. DB::MultipleAccessStorage::authenticateImpl(DB::Credentials const&, Poco::Net::IPAddress const&, DB::ExternalAuthenticators const&, bool, bool, bool) const @ 0x000000000fa47c3a
6. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000f983323
7. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ea8b04
8. DB::TCPHandler::receiveHello() @ 0x00000000122b2ddb
9. DB::TCPHandler::runImpl() @ 0x00000000122a4375
10. DB::TCPHandler::run() @ 0x00000000122c1fb9
11. Poco::Net::TCPServerConnection::start() @ 0x0000000014c105b2
12. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c113f9
13. Poco::PooledThread::run() @ 0x0000000014d09a61
14. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d07ffd
15. ? @ 0x00007f2f70e9f802
16. ? @ 0x00007f2f70e3f450
 (version 24.4.1.2088 (official build))
2024.05.14 10:09:59.374432 [ 125354 ] {} <Debug> TCP-Session: 16ec77bf-2bb9-4bf8-82f7-8cd8a2d50fd0 Authentication failed with error: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

2024.05.14 10:09:59.374660 [ 125354 ] {} <Error> ServerErrorHandler: Code: 516. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c9a449b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000780b9ac
2. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000f983820
3. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ea8b04
4. DB::TCPHandler::receiveHello() @ 0x00000000122b2ddb
5. DB::TCPHandler::runImpl() @ 0x00000000122a4375
6. DB::TCPHandler::run() @ 0x00000000122c1fb9
7. Poco::Net::TCPServerConnection::start() @ 0x0000000014c105b2
8. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c113f9
9. Poco::PooledThread::run() @ 0x0000000014d09a61
10. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d07ffd
11. ? @ 0x00007f2f70e9f802
12. ? @ 0x00007f2f70e3f450
 (version 24.4.1.2088 (official build))

Here's the pcap again.

Slach commented 6 months ago

Let's check the clickhouse-client configuration:

grep -C 10 -i password -r /etc/clickhouse-client/
grep -C 10 -i password -r ~/.clickhouse-client/

Also check:

clickhouse-client --config-file=/dev/null -q "SELECT version()" --verbose --send_logs_level=trace
BoweFlex commented 6 months ago
[jbowe@l-clickhouse103 ~](DEV)$ clickhouse-client --config-file=/dev/null -q "SELECT version()" --verbose --send_logs_level=trace
Poco::Exception. Code: 1000, e.code() = 0, SAXParseException: No element found in '/dev/null', line 1 column 0, Stack trace (when copying this message, always include the lines below):

0. Poco::XML::SAXParseException::SAXParseException(String const&, Poco::XML::Locator const&) @ 0x0000000014c3cffe
1. Poco::XML::ParserEngine::handleError(int) @ 0x0000000014c40b24
2. Poco::XML::ParserEngine::parse(Poco::XML::InputSource*) @ 0x0000000014c3f71e
3. Poco::XML::SAXParser::parse(String const&) @ 0x0000000014c3ed44
4. Poco::XML::DOMBuilder::parse(String const&) @ 0x0000000014c303b8
5. Poco::XML::DOMParser::parse(String const&) @ 0x0000000014c2f882
6. DB::ConfigProcessor::processConfig(bool*, zkutil::ZooKeeperNodeCache*, std::shared_ptr<Poco::Event> const&) @ 0x00000000129bf832
7. DB::ConfigProcessor::loadConfig(bool) @ 0x00000000129c3614
8. DB::Client::initialize(Poco::Util::Application&) @ 0x000000000cbb7aae
9. Poco::Util::Application::run() @ 0x0000000014c1d15a
10. mainEntryClickHouseClient(int, char**) @ 0x000000000cbcecc1
11. main @ 0x0000000007807fb8
12. ? @ 0x00007f2fece3feb0
13. ? @ 0x00007f2fece3ff60
14. _start @ 0x0000000005ea702e
 (version 24.4.1.2088 (official build))
[jbowe@l-clickhouse103 ~](DEV)$ grep -C 10 -i password -r /etc/clickhouse-client/
/etc/clickhouse-client/config.xml-    </prompt_by_server_display_name>
/etc/clickhouse-client/config.xml-
/etc/clickhouse-client/config.xml-    <!--
/etc/clickhouse-client/config.xml-        Settings adjustable via command-line parameters
/etc/clickhouse-client/config.xml-        can take their defaults from that config file, see examples:
/etc/clickhouse-client/config.xml-
/etc/clickhouse-client/config.xml-    <host>127.0.0.1</host>
/etc/clickhouse-client/config.xml-    <port>9440</port>
/etc/clickhouse-client/config.xml-    <secure>1</secure>
/etc/clickhouse-client/config.xml-    <user>dbuser</user>
/etc/clickhouse-client/config.xml:    <password>dbpwd123</password>
/etc/clickhouse-client/config.xml-    <format>PrettyCompactMonoBlock</format>
/etc/clickhouse-client/config.xml-    <multiline>1</multiline>
/etc/clickhouse-client/config.xml-    <multiquery>1</multiquery>
/etc/clickhouse-client/config.xml-    <stacktrace>1</stacktrace>
/etc/clickhouse-client/config.xml-    <database>default2</database>
/etc/clickhouse-client/config.xml-    <pager>less -SR</pager>
/etc/clickhouse-client/config.xml-    <history_file>/home/user/clickhouse_custom_history.log</history_file>
/etc/clickhouse-client/config.xml-    <max_parser_depth>2500</max_parser_depth>
/etc/clickhouse-client/config.xml-
/etc/clickhouse-client/config.xml-        The same can be done on user-level configuration, just create & adjust: ~/.clickhouse-client/config.xml
--
/etc/clickhouse-client/config.xml-                 "host" is not the same as "hostname" since you may want to have different settings for one host,
/etc/clickhouse-client/config.xml-                 and in this case you can add "prod" and "prod_readonly".
/etc/clickhouse-client/config.xml-
/etc/clickhouse-client/config.xml-                 Default: "hostname" will be used. -->
/etc/clickhouse-client/config.xml-            <name>default</name>
/etc/clickhouse-client/config.xml-            <!-- Host that will be used for connection. -->
/etc/clickhouse-client/config.xml-            <hostname>127.0.0.1</hostname>
/etc/clickhouse-client/config.xml-            <port>9000</port>
/etc/clickhouse-client/config.xml-            <secure>1</secure>
/etc/clickhouse-client/config.xml-            <user>default</user>
/etc/clickhouse-client/config.xml:            <password></password>
/etc/clickhouse-client/config.xml-            <database></database>
/etc/clickhouse-client/config.xml-            <!-- '~' is expanded to HOME, like in any shell -->
/etc/clickhouse-client/config.xml-            <history_file></history_file>
/etc/clickhouse-client/config.xml-        </connection>
/etc/clickhouse-client/config.xml-    </connections_credentials>
/etc/clickhouse-client/config.xml-    ]]>
/etc/clickhouse-client/config.xml-</config>
[jbowe@l-clickhouse103 ~](DEV)$ grep -C 10 -i password -r ~/.clickhouse-client/
grep: /home/jbowe/.clickhouse-client/: No such file or directory
Slach commented 6 months ago

try

clickhouse-client --password="" --user=default -q "SELECT version()"

Your connection to port 9000 is sent with a non-empty password: [image]

A standard connection with an empty password looks like this: [image]

It looks like a password is being picked up from somewhere.

BoweFlex commented 6 months ago

When I run clickhouse-client --user=default -q "SELECT version()" --password it succeeds with no password. After a bit more inspection I think I've found at least part of the problem. I am using an environment file to set CLICKHOUSE_USERNAME and CLICKHOUSE_PASSWORD for the clickhouse-backup API server's configuration. That username and password belong to the backup user I created, not to the default user, but clickhouse-client is picking up the CLICKHOUSE_PASSWORD value and trying to connect as default with it. I'm going to try removing that environment file and updating /etc/clickhouse-backup/config.yml instead, and see if that resolves the problem.
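For anyone hitting the same symptom, a quick way to see whether the current shell carries such variables is to list everything prefixed with `CLICKHOUSE_`. This is a minimal sketch. The helper name `list_clickhouse_env` is made up here, and the exported value is a throwaway used only for the demo. Values are masked so the output can be pasted into an issue.

```shell
#!/bin/sh
# Print the names of all CLICKHOUSE_* variables in the environment,
# replacing each value with a fixed marker so no secret is shown.
list_clickhouse_env() {
    env | sed -n 's/^\(CLICKHOUSE_[A-Za-z0-9_]*\)=.*/\1=<set>/p'
}

# Demo only: export a throwaway value, then list what the helper finds.
CLICKHOUSE_PASSWORD=example-secret
export CLICKHOUSE_PASSWORD
list_clickhouse_env
```

To rule the variable out for a single invocation, something like `env -u CLICKHOUSE_PASSWORD clickhouse-client -q "SELECT version()"` runs the client without it.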

BoweFlex commented 6 months ago

After reconfiguring with a config.yml file and removing those environment variables, I can use clickhouse-client locally both as my own user and with sudo, but I am still unable to connect between nodes. I'll try another tcpdump.

BoweFlex commented 6 months ago

Not sure if there's anything useful here, I'm not very familiar with reading .pcap files.

Error on 103 when trying to connect by running clickhouse-client -q 'SELECT VERSION()' -h l-clickhouse103.wdn.clarkinc.io on clickhouse101:

2024.05.15 10:52:17.536571 [ 1908 ] {} <Error> ServerErrorHandler: Code: 516. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.

If you have installed ClickHouse and forgot password you can reset it in the configuration file.
The password for default user is typically located at /etc/clickhouse-server/users.d/default-password.xml
and deleting this file will reset the password.
See also /etc/clickhouse-server/users.xml on the server where ClickHouse is installed.

. (AUTHENTICATION_FAILED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c9a449b
1. DB::Exception::Exception(PreformattedMessage&&, int) @ 0x000000000780b9ac
2. DB::AccessControl::authenticate(DB::Credentials const&, Poco::Net::IPAddress const&, String const&) const @ 0x000000000f983820
3. DB::Session::authenticate(DB::Credentials const&, Poco::Net::SocketAddress const&) @ 0x0000000010ea8b04
4. DB::TCPHandler::receiveHello() @ 0x00000000122b2ddb
5. DB::TCPHandler::runImpl() @ 0x00000000122a4375
6. DB::TCPHandler::run() @ 0x00000000122c1fb9
7. Poco::Net::TCPServerConnection::start() @ 0x0000000014c105b2
8. Poco::Net::TCPServerDispatcher::run() @ 0x0000000014c113f9
9. Poco::PooledThread::run() @ 0x0000000014d09a61
10. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000014d07ffd
11. ? @ 0x00007f020b69f802
12. ? @ 0x00007f020b63f450
 (version 24.4.1.2088 (official build))

PCAP File

Slach commented 6 months ago

I think the default user is simply restricted by IP to 127.0.0.1 in your cluster.

Try adding

secret: my-secret

to the remote_servers YAML definition that you use to define your distributed clusters, and don't use the default user for clickhouse-client. Did you apply https://github.com/Altinity/clickhouse-backup/issues/902#issuecomment-2082775275 and https://github.com/Altinity/clickhouse-backup/issues/902#issuecomment-2083327435 ?
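For illustration, a remote_servers definition with a shared secret might look roughly like the fragment written by the sketch below. The cluster name my_cluster, the port 9000 and the single-shard layout are assumptions, not taken from the reporter's setup; the host names are the three replicas mentioned at the top of this issue. The script writes to a temporary directory by default, so nothing on a real server is touched until the file is reviewed and copied into the server's configuration directory (typically /etc/clickhouse-server/config.d/).

```shell
# Write an example remote_servers fragment to a scratch directory.
# CONFIG_DIR is a hypothetical variable used only by this sketch.
CONFIG_DIR="${CONFIG_DIR:-$(mktemp -d)}"
mkdir -p "$CONFIG_DIR"

cat > "$CONFIG_DIR/remote_servers.yaml" <<'EOF'
remote_servers:
  my_cluster:              # assumed cluster name
    secret: my-secret      # must be the same value on every node
    shard:
      replica:
        - host: clickhouse101
          port: 9000       # assumed native port
        - host: clickhouse102
          port: 9000
        - host: clickhouse103
          port: 9000
EOF

echo "Wrote $CONFIG_DIR/remote_servers.yaml"
```

The same file would need to be present on all three nodes, since the secret is only useful when every replica in the cluster agrees on it.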

BoweFlex commented 6 months ago

Good point, I apologize. I think I had made that change but did not redo it when I rebuilt the servers. I'll give that a try and report back.