apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

Weather report and erlang TLS distribution problem #4597

Open pavlov2000uk opened 1 year ago

pavlov2000uk commented 1 year ago

Description

Versions

A 2-node cluster is configured and appears to start well. Log output -> couch_replicator_clustering : cluster stable -> couch_replicator_clustering : cluster stable

When I run /weatherreport --etc /opt/couchdb/etc/ without erlang TLS distribution configured, it works ->
xxxx
Configuration Settings:
[admins] rocky="-pbkdf2-092b2a2214c80a33e720c76cf1b778b23cd95d9d,0d8c8fcd652bc1acc14f291eaead743f,10"
[admins] vagrant="-pbkdf2-6160777f3986cf1c320b01c46e631ecbea8f2910,5a95dcffdfcdc0a9cacfdb7d13a9f392,10"
[chttpd] bind_address="0.0.0.0"
[chttpd] port="5984"
[chttpd_auth] hash_algorithms="sha256, sha"
[chttpd_auth] secret="ed35b213650525d1987c810617000a23"
[couch_httpd_auth] authentication_db="_users"
xxxx

When I run /weatherreport --etc /opt/couchdb/etc/ with erlang TLS distribution configured, it does not work ->
xxxx
['couchdb_diag9868@couchdb1.dp.home'] [warning] Could not connect to the local cluster node 'couchdb@couchdb1.dp.home', some checks will not run.
['couchdb_diag9868@couchdb1.dp.home'] [crit] Bad rpc call executing check weatherreport_check_tcp_queues: nodedown
['couchdb_diag9868@couchdb1.dp.home'] [crit] Bad rpc call executing check weatherreport_check_search: nodedown
['couchdb_diag9868@couchdb1.dp.home'] [crit] Bad rpc call executing check weatherreport_check_safe_to_rebuild: nodedown
['couchdb_diag9868@couchdb1.dp.home'] [crit] Bad rpc call executing check weatherreport_check_processmemory: nodedown
xxxxx

Is something wrong with the configuration, or is weatherreport just being used incorrectly?

A test table with new documents gets replicated whether or not erlang TLS distribution is configured. If I connect to port 9100, the SSL handshake starts -> openssl s_client -connect couchdb1.dp.home:9100
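The handshake check above can also be scripted. What follows is only a minimal sketch, not part of CouchDB or weatherreport: `probe_tls` is a hypothetical helper name, and certificate verification is deliberately switched off because the goal is just to see whether the distribution port answers with TLS (a lab CA is assumed).

```python
import socket
import ssl


def probe_tls(host: str, port: int, timeout: float = 5.0) -> dict:
    """Attempt a TLS handshake with host:port and report what happened.

    Hypothetical helper for illustration; it does not validate the peer.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Verification is disabled on purpose: we only want to know whether the
    # listener speaks TLS, not whether its certificate is trusted.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return {
                    "ok": True,
                    "protocol": tls.version(),
                    "cipher": tls.cipher()[0],
                }
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Calling `probe_tls("couchdb1.dp.home", 9100)` against the node from this report would return a dictionary with `"ok": True` if the handshake completes, or the error otherwise. A successful handshake only shows that the port speaks TLS; it says nothing about whether a diagnostic node such as weatherreport is itself started with TLS distribution.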

Steps to Reproduce

  1. Install Erlang (this step is needed because otherwise version 22 will be installed and weatherreport will not run):
     curl -s https://packagecloud.io/install/repositories/rabbitmq/erlang/script.rpm.sh | sudo bash
     sudo yum install erlang -y

  2. Install couchdb as per https://docs.couchdb.org/en/stable/install/unix.html#enabling-the-apache-couchdb-package-repository
     sudo yum install couchdb-3.3.1-1.el8.x86_64 -y

  3. Prepare the nodes for the cluster -> https://docs.couchdb.org/en/stable/setup/cluster.html

  4. Create CA, server1cert.pem / server1key.pem && server2cert.pem / server2key.pem

  5. Enable erlang TLS distribution -> https://docs.couchdb.org/en/stable/cluster/tls_erlang_distribution.html

  6. The final changes will be

/opt/couchdb/etc/vm.args
xxxxx
-name couchdb@couchdb1.dp.home

-name couchdb@couchdb2.dp.home

-setcookie SbFch4ESZSNBAF5ivZn34oKEDAT8H684V4TATER0dCxhGZwjk
-kernel inet_dist_use_interface {0,0,0,0}
-kernel inet_dist_listen_min 9100
-kernel inet_dist_listen_max 9200
-proto_dist inet_tls
-ssl_dist_optfile /etc/couchdb/newcert/couch_ssldist.conf
xxxxx

/etc/couchdb/newcert/couch_ssldist.conf (node1)
xxxxx
[{server, [{certfile, "/etc/couchdb/newcert/couchdb1withprivate.pem"},
           {secure_renegotiate, true}]},
 {client, [{cacertfile, "/etc/couchdb/newcert/CAcert.pem"},
           {secure_renegotiate, true}]}].
xxxxx

/etc/couchdb/newcert/couch_ssldist.conf (node2)
xxxxx
[{server, [{certfile, "/etc/couchdb/newcert/couchdb2withprivate.pem"},
           {secure_renegotiate, true}]},
 {client, [{cacertfile, "/etc/couchdb/newcert/CAcert.pem"},
           {secure_renegotiate, true}]}].
xxxxx
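When comparing the two nodes, it can help to confirm mechanically that a given vm.args really enables TLS distribution. The sketch below is an assumption-laden illustration, not CouchDB tooling: `tls_dist_settings` is a hypothetical name, flags are assumed to be whitespace-separated with no spaces inside paths, and lines starting with `#` are treated as comments.

```python
def tls_dist_settings(vm_args_text: str) -> dict:
    """Extract -proto_dist and -ssl_dist_optfile from vm.args content.

    Hypothetical helper; assumes simple whitespace-separated flags.
    """
    tokens = []
    for line in vm_args_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        tokens.extend(line.split())

    settings = {"proto_dist": None, "ssl_dist_optfile": None}
    # Each flag of interest is followed by exactly one value token.
    for i, token in enumerate(tokens[:-1]):
        key = token.lstrip("-")
        if token.startswith("-") and key in settings:
            settings[key] = tokens[i + 1]

    # TLS distribution is considered enabled only with -proto_dist inet_tls.
    settings["tls_enabled"] = settings["proto_dist"] == "inet_tls"
    return settings
```

Feeding it the vm.args content from this report should yield `tls_enabled` set to true and `ssl_dist_optfile` pointing at `/etc/couchdb/newcert/couch_ssldist.conf`; the same check on a node without the two TLS flags would report it as disabled.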

Expected Behavior

I would like to get the same output as when erlang TLS distribution is not configured.

Your Environment

https://couchdb1.dp.home:6984 {"couchdb":"Welcome","version":"3.3.1","git_sha":"1fd50b82a","uuid":"ed35b213650525d1987c8106170001fd","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"],"vendor":{"name":"The Apache Software Foundation"}}

nickva commented 1 year ago

@pavlov2000uk good find, I don't think weatherreport was fixed to work with the TLS distribution yet

(Also, hopefully the password hashes, secret and cookie values are not production credentials. If they are, please consider editing them out and rotating your credentials.)

pavlov2000uk commented 1 year ago

Thank you. Test values everywhere ... all from my lab.

pavlov2000uk commented 1 year ago

If the team is already aware of the issue, can an ETA be shared by any chance?

nickva commented 1 year ago

I imagine weatherreport would need something similar to what remsh has already for TLS conf:

https://github.com/apache/couchdb/blob/main/rel/overlay/bin/remsh#L130-L134

Not sure on the timeline of the fix. I personally don't use weatherreport much so don't know much about its internals.
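To illustrate the general idea only: a node that wants to join a TLS-distribution cluster has to be started with the same `-proto_dist inet_tls` and `-ssl_dist_optfile` flags that appear in vm.args. The sketch below is not weatherreport's or remsh's actual code; `diag_node_args` is a hypothetical helper, and the `couchdb_diag<pid>` naming is only borrowed from the log output earlier in this thread.

```python
import os
from typing import List, Optional


def diag_node_args(
    node_host: str,
    cookie: str,
    ssl_dist_optfile: Optional[str] = None,
) -> List[str]:
    """Build an erl argument list for a hidden diagnostic node.

    Hypothetical illustration: TLS flags are appended only when an
    ssl_dist_optfile path is supplied.
    """
    name = f"couchdb_diag{os.getpid()}@{node_host}"
    args = ["erl", "-hidden", "-name", name, "-setcookie", cookie]
    if ssl_dist_optfile:
        # Match the cluster's distribution protocol, as in vm.args.
        args += ["-proto_dist", "inet_tls", "-ssl_dist_optfile", ssl_dist_optfile]
    return args
```

Without the optional path the result is a plain distribution command line; with it, the two TLS flags are appended. How weatherreport would actually receive such flags is left open here, since its internals are not described in this thread.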