Open sbernauer opened 2 years ago
@razvan I'm not sure this can be closed, there are still open tasks
It was closed automatically by merging the ADR PR. I missed the comment.
As long as this is not implemented you can workaround with Podovrerrites e.g. like this:
To avoid searching, this is the line added to args: sed -i "s|XX_PASS_XX|$(cat /tmp/postgres-password/password)|g" /stackable/config/hive-site.xml
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
name: hive
namespace: my-namespace
spec:
image:
productVersion: 3.1.3
stackableVersion: "23.7.0"
clusterConfig:
database:
connString: jdbc:postgresql://my-postgress-connector/hive
user: hive
###### This is what will be taken from a secret ########
password: XX_PASS_XX
dbType: postgres
s3:
reference: ionos
metastore:
config:
resources:
memory:
limit: '1Gi'
####### Podoverrides to mount the volume and sed the pw from the secret
podOverrides:
spec:
containers:
- name: hive
args:
- echo copying /stackable/mount/config to /stackable/config && cp -RL /stackable/mount/config/* /stackable/config && echo copying /stackable/mount/log-config/hive-log4j2.properties to /stackable/config/hive-log4j2.properties && cp -RL /stackable/mount/log-config/hive-log4j2.properties /stackable/config/hive-log4j2.properties && keytool -importkeystore -srckeystore /etc/pki/java/cacerts -srcstoretype jks -srcstorepass changeit -destkeystore /stackable/truststore.p12 -deststoretype pkcs12 -deststorepass changeit -noprompt && echo replacing xxx_access_key_xxx and xxx_secret_key_xxx with secret values. && sed -i "s|xxx_access_key_xxx|$(cat /stackable/secrets/accessKey)|g" /stackable/config/hive-site.xml && sed -i "s|xxx_secret_key_xxx|$(cat /stackable/secrets/secretKey)|g" /stackable/config/hive-site.xml && sed -i "s|XX_PASS_XX|$(cat /tmp/postgres-password/password)|g" /stackable/config/hive-site.xml && bin/start-metastore --config /stackable/config --db-type postgres --hive-bin-dir bin
volumeMounts:
- mountPath: /tmp/postgres-password/
name: postgres-password
volumes:
- name: postgres-password
secret:
secretName: hive-postgres-secret
default:
replicas: 1
- echo copying /stackable/mount/config to /stackable/config && cp -RL /stackable/mount/config/* /stackable/config && echo copying /stackable/mount/log-config/hive-log4j2.properties to /stackable/config/hive-log4j2.properties && cp -RL /stackable/mount/log-config/hive-log4j2.properties /stackable/config/hive-log4j2.properties && keytool -importkeystore -srckeystore /etc/pki/java/cacerts -srcstoretype jks -srcstorepass changeit -destkeystore /stackable/truststore.p12 -deststoretype pkcs12 -deststorepass changeit -noprompt && echo replacing xxx_access_key_xxx and xxx_secret_key_xxx with secret values. && sed -i "s|xxx_access_key_xxx|$(cat /stackable/secrets/accessKey)|g" /stackable/config/hive-site.xml && sed -i "s|xxx_secret_key_xxx|$(cat /stackable/secrets/secretKey)|g" /stackable/config/hive-site.xml && sed -i "s|XX_PASS_XX|$(cat /tmp/postgres-password/password)|g" /stackable/config/hive-site.xml && bin/start-metastore --config /stackable/config --db-type postgres --hive-bin-dir bin
I would like to comment that this workaround has worked perfectly for me. Although I would like to understand a little bit how it works underneath., as I can appreciate that there is more to it than just replacing the database password with the placeholder, @Maleware could you comment on why?
Hive expects the password in hive-site.xml
. We have a policy of not reading product related secrets in the operators. So we put a placeholder in hive-site.xml
and use sed
to replace it in the container.
When we have implemented this ADR, the mechanism will hopefully be a bit more elegant and transparent. Until then, this is all there is to it.
Hey @ruslanguns hope this reaches you well.
Yeah indeed, that's a fix we brought up for a customer for the time being.
So for the part as a example sed -i "s|XX_PASS_XX|$(cat /tmp/postgres-password/password)|g" /stackable/config/hive-site.xml
. Others behave similar.
This doing nothing else but propagating the password from postgres-password/password ( path to a mounted secret volume ) and seding it into hive-site.xml. This is because while on gitops, you want password security and manage passwords as part of your IaC infrastructure. That's the part I've done for them and the one you interfered already correctly!
Anyway, some other commands are about to set up the keystore ( build it from scratch ) and set explicitly which type of database you are looking for. @sbernauer is a way more thorough expert then I am on this
Now I understand, so this script partly does the replacements that the same operator usually does in the hive-site.xml
file including not just the XX_PASS_XX
placeholder but also the base config and the use of the S3 credentials placeholders as well, now it all makes a lot of more sense to me. Thanks to both of you for the explanation, really appreaciated. @razvan & @Maleware
We use GitOps with FluxCD and indeed, this workaround has allowed us not to expose passwords in the repository in an insecure way.
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
name: hive
namespace: infrastructure
spec:
image:
productVersion: 3.1.3
clusterConfig:
database:
connString: jdbc:postgresql://postgres-databases.infrastructure.svc.cluster.local:5432/hive
user: hive_owner_user
password: __PASS__
dbType: postgres
s3:
reference: spark-s3-connection
metastore:
roleGroups:
default:
replicas: 1
podOverrides:
spec:
containers:
- name: hive
args:
- |
echo "Copying configuration from /stackable/mount/config to /stackable/config"
cp -RL /stackable/mount/config/* /stackable/config
echo "Copying Hive log4j2 properties file from /stackable/mount/log-config to /stackable/config"
cp -RL /stackable/mount/log-config/hive-log4j2.properties /stackable/config/hive-log4j2.properties
echo "Importing Java keystore to PKCS12 format for truststore"
keytool -importkeystore \
-srckeystore /etc/pki/java/cacerts \
-srcstoretype jks -srcstorepass changeit \
-destkeystore /stackable/truststore.p12 \
-deststoretype pkcs12 \
-deststorepass changeit -noprompt
echo "Replacing placeholders with secret values in hive-site.xml"
sed -i "s|xxx_access_key_xxx|$(cat /stackable/secrets/accessKey)|g" /stackable/config/hive-site.xml
sed -i "s|xxx_secret_key_xxx|$(cat /stackable/secrets/secretKey)|g" /stackable/config/hive-site.xml
sed -i "s|__PASS__|$(cat /tmp/postgres-password/password)|g" /stackable/config/hive-site.xml
echo "Starting Hive Metastore"
bin/start-metastore --config /stackable/config --db-type postgres --hive-bin-dir bin
volumeMounts:
- mountPath: /tmp/postgres-password/
name: postgres-password
volumes:
- name: postgres-password
secret:
secretName: hive-owner-user.postgres-databases.credentials.postgresql.acid.zalan.do
We want a generic way of for database connections like mysql, postgres, mongodb
This is done when
Please note that this list only includes the products that currently define the credentials within the Product CRD (which is very bad). There are other operators, which should be adopted to the new connection design as well for consistency.
Current Status
Airflow
spec.clusterConfig.credentialsSecret
points to a secretconnections.secretKey
that should be created / read by the operatorDruid
spec.clusterConfig.metaStorageDatabase
should use aDatabaseConnection
(ADR: https://github.com/stackabletech/documentation/pull/321)HBase
Hadoop HDFS
Hive
spec.clusterConfig.database
should use aDatabaseConnection
(ADR: https://github.com/stackabletech/documentation/pull/321)Kafka
NiFi
OPA
Spark
Superset
spec.clusterConfig.credentialsSecret
points to a secretconnections.secretKey
that should be created / read by the operatorTrino
ZooKeeper
These are my findings. I hope this is complete. Does not look as bad or as much as expected.