mrchypark opened 2 months ago
@mrchypark Hi, I think you should configure nginx to load balance using a hash of the query_id header. You can use the following config:
```nginx
#user  nobody;
worker_processes  1;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid  logs/nginx.pid;

events {
    worker_connections 1024;
}

http {
    log_format main '$http_x_databend_query_id "$time_local" $host "$request_method $request_uri $server_protocol" $status $bytes_sent "$http_referer" "$http_user_agent" $remote_port $upstream_addr $scheme $gzip_ratio $request_length $request_time $ssl_protocol "$upstream_response_time"';
    access_log /opt/homebrew/var/log/nginx/access.log main;

    # The X-DATABEND-QUERY-ID header is exposed to nginx as
    # $http_x_databend_query_id; all requests resolve to the hashed upstream.
    map $http_x_databend_query_id $backend {
        default backend1;
    }

    upstream backend1 {
        # Consistent hashing keeps every request of a given query on one node.
        hash $http_x_databend_query_id consistent;
        server localhost:8000;
        server localhost:8009;
    }

    server {
        listen 8085;

        location / {
            proxy_pass http://$backend;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-DATABEND-QUERY-ID $http_x_databend_query_id;
        }
    }
}
```

Author: Databend. Source: https://www.bilibili.com/read/cv28305671/ (bilibili)
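The key line above is `hash $http_x_databend_query_id consistent;`: every request carrying the same query id is routed to the same upstream server, so all HTTP round trips of one Databend query land on one node. As a minimal illustration of that stickiness property (a plain modulo hash rather than nginx's ketama-style consistent hash, and backend names copied from the upstream block above):

```python
# Sketch only -- not nginx or Databend code. Shows why hashing the
# X-DATABEND-QUERY-ID header makes routing sticky per query.
import hashlib

# Mirrors the two servers in the `upstream backend1` block.
BACKENDS = ["localhost:8000", "localhost:8009"]

def pick_backend(query_id: str) -> str:
    # Hash the query id and map it onto a backend. nginx's
    # `hash ... consistent` uses a ketama-style ring; simple modulo
    # hashing is enough to demonstrate the stickiness.
    digest = hashlib.md5(query_id.encode()).hexdigest()
    return BACKENDS[int(digest, 16) % len(BACKENDS)]

# Requests that share a query id always hit the same backend.
assert pick_backend("abc-123") == pick_backend("abc-123")
```

The practical consequence is that a client must generate one query id per query and send it on every related request; without the header, `$http_x_databend_query_id` is empty and all such requests hash to the same bucket.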
Current Setup
Issue
When executing queries on a large dataset, we're encountering errors.
Questions
- What could be causing these errors with large datasets?
- Are there any recommended configurations or best practices for handling large data queries in this setup?
- Should we consider scaling our resources, and if so, how?
We appreciate any insights or suggestions you can provide to help us resolve this issue and optimize our Databend queries for large datasets in our Kubernetes environment. Thank you in advance for your assistance!