apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

Pulsar Functions Leak Database Credentials to Logs #9462

Open alexanderursu99 opened 3 years ago

alexanderursu99 commented 3 years ago

Is your enhancement request related to a problem? Please describe.

When I look at the function worker logs, I can see database passwords revealed in plain text.

21:51:55.889 [main] INFO org.apache.pulsar.functions.worker.FunctionAssignmentTailer - Received assignment update: instance {
  functionMetaData {
    functionDetails {
      tenant: "market_data"
      namespace: "deribit"
      name: "skew_iv_v2"
      className: "org.apache.pulsar.functions.api.utils.IdentityFunction"
      processingGuarantees: EFFECTIVELY_ONCE
      autoAck: true
      parallelism: 1
      source {
        subscriptionType: FAILOVER
        typeClassName: "org.apache.pulsar.client.api.schema.GenericRecord"
        subscriptionName: "clickhouse-sink"
        inputSpecs {
          key: "market_data/deribit/skew_iv_v2"
          value {
          }
        }
        cleanupSubscription: true
      }
      sink {
        className: "org.apache.pulsar.io.jdbc.ClickHouseJdbcAutoSchemaSink"
        configs: "{\"userName\":\"USERNAME\",\"password\":\"PASSWORD\",\"jdbcUrl\":\"jdbc:clickhouse://ADDRESS:PORT/DATABASE\",\"tableName\":\"TABLE_NAME\",\"timeoutMs\":60000.0,\"batchSize\":100000.0}"
        typeClassName: "org.apache.pulsar.client.api.schema.GenericRecord"
        builtin: "jdbc-clickhouse"
      }
      resources {
        cpu: 1.0
        ram: 1073741824
        disk: 10737418240
      }
      componentType: SINK
    }
    packageLocation {
      packagePath: "market_data/deribit/skew_iv_v2/08b8f87d-6db5-4907-a85a-4f1fac9c5d5d-pulsar-io-jdbc-clickhouse-2.6.1.nar"
      originalFileName: "pulsar-io-jdbc-clickhouse-2.6.1.nar"
    }
    version: 7
    createTime: 1612193701790
    instanceStates {
      key: 0
      value: RUNNING
    }
    functionAuthSpec {
      data: "viv6c"
    }
  }
  instanceId: -1
}
workerId: "pulsar-function-0"

This is an example snippet of the logs I see. I've replaced some values with all-caps placeholders, such as USERNAME and PASSWORD. Most of the sensitive information appears in the sink's configs field.

Note that this example comes from the ClickHouse JDBC sink.

Describe the solution you'd like

Database credentials (at least) should not be exposed in the logs, or there should be an option to disable logging them. I can already restrict who can administer the cluster, but I cannot as easily restrict which logs someone can see in our log monitoring solution.
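
To make the request concrete, here is a minimal sketch of the kind of masking I have in mind, written in Python purely for illustration. It is not Pulsar's implementation, and Pulsar itself is Java. The names redact_configs, SENSITIVE_KEYS and MASK are hypothetical, and the set of keys treated as sensitive is my assumption based on the configs string above.

```python
import json

# Hypothetical set of key names treated as sensitive (compared case-insensitively).
# This list is an assumption for illustration, not something defined by Pulsar.
SENSITIVE_KEYS = {"username", "password"}
MASK = "*****"


def redact(value):
    """Return a copy of a decoded JSON value with sensitive entries masked."""
    if isinstance(value, dict):
        return {
            key: MASK if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def redact_configs(configs_json):
    """Mask sensitive keys in a JSON-encoded configs string before it is logged."""
    try:
        decoded = json.loads(configs_json)
    except (TypeError, ValueError):
        # If the string cannot be parsed, hide it entirely rather than risk a leak.
        return MASK
    return json.dumps(redact(decoded))


if __name__ == "__main__":
    configs = json.dumps({
        "userName": "USERNAME",
        "password": "PASSWORD",
        "tableName": "TABLE_NAME",
        "batchSize": 100000.0,
    })
    # Only the values of userName and password are replaced by the mask.
    print(redact_configs(configs))
```

Masking by key name keeps the rest of the assignment update readable for debugging. A stricter variant could treat jdbcUrl as sensitive too, or drop the whole configs field from the log line.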

zymap commented 3 years ago

Move this to the next release.

nlu90 commented 3 years ago

I can help with this issue.