Open rwaweber opened 3 years ago
@binarylogic, do you have any updates on this feature?
As a workaround, we are using the following remap transformation:
read_proxy_protocol_v2:
  type: remap
  inputs: ["convert_base64"]
  drop_on_error: true
  drop_on_abort: false
  source: |
    # Code based on https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
    message = string!(decode_base64(.base64_encoded_message) ?? .message)
    # A v2 header starts with a fixed 12-byte signature and is at least 16 bytes long
    if (starts_with(message, decode_base16!("0D0A0D0A000D0A515549540A")) && length(message) >= 16) {
      # Byte 12: high nibble is the version, low nibble is the command
      protocol_and_command = chunks(encode_base16(slice!(message, 12, 13)), 1)
      # Only version 2 of the protocol is supported
      if (protocol_and_command[0] == "2") {
        command = get!(["local", "proxy"], [to_int!(protocol_and_command[1])])
        family = "unspec"
        protocol = "unspec"
        if (command == "proxy") {
          # Byte 13: high nibble is the address family, low nibble is the transport protocol
          family_and_protocol = chunks(encode_base16(slice!(message, 13, 14)), 1)
          family = get!(["unspec", "inet", "inet6", "unix"], [to_int!(family_and_protocol[0])])
          protocol = get!(["unspec", "tcp", "udp"], [to_int!(family_and_protocol[1])])
          if (family == "inet") {
            # IPv4 layout: src addr 16..20, dst addr 20..24, src port 24..26, dst port 26..28
            .host = ip_ntop(slice!(message, 16, 20)) ?? "invalid"
            .port = parse_int!(encode_base16(slice!(message, 24, 26)), 16)
          } else if (family == "inet6") {
            # IPv6 layout: src addr 16..32, dst addr 32..48, src port 48..50, dst port 50..52
            .host = ip_ntop(slice!(message, 16, 32)) ?? "invalid"
            .port = parse_int!(encode_base16(slice!(message, 48, 50)), 16)
          }
        }
        .proxy_protocol_v2 = {
          "command": command,
          "family": family,
          "protocol": protocol,
        }
        # Bytes 14..16 hold the length of everything after the 16-byte fixed part
        beginning_message = 16 + parse_int!(encode_base16(slice!(message, 14, 16)), 16)
        .message = slice!(message, beginning_message)
      }
    }
The transform can be tested with the following unit test:
tests:
  - name: syslog / proxy protocol / proxy information properly read
    inputs:
      - type: log
        insert_at: read_proxy_protocol_v2
        log_fields:
          timestamp: ignore
          host: "10.220.5.157"
          port: 12345
          base64_encoded_message: "DQoNCgANClFVSVQKIREAVNRq/Z4K3AWd2X4ZcgMABKT9EmUEAD4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHRlc3QK"
    outputs:
      - extract_from: read_proxy_protocol_v2
        conditions:
          - type: vrl
            source: |-
              event = compact(. | { "base64_encoded_message": null })
              assert_eq!(event, {
                "timestamp": "ignore",
                "host": "212.106.253.158",
                "port": 55678,
                "proxy_protocol_v2": {
                  "command": "proxy",
                  "family": "inet",
                  "protocol": "tcp"
                },
                "message": "test\n"
              })
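For anyone who wants to sanity-check the byte offsets outside of Vector, here is a minimal Python sketch of the same v2 header layout as described in the HAProxy spec. It is not part of Vector; `parse_proxy_v2` and the constant names are made up for illustration, and it assumes the whole header arrives in a single buffer and that any TLVs after the addresses can simply be skipped.

```python
import ipaddress
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
SIGNATURE = bytes.fromhex("0D0A0D0A000D0A515549540A")
COMMANDS = ["local", "proxy"]
FAMILIES = ["unspec", "inet", "inet6", "unix"]
PROTOCOLS = ["unspec", "tcp", "udp"]


def parse_proxy_v2(message: bytes):
    """Return (info, payload), or (None, message) if no usable v2 header is found."""
    if len(message) < 16 or not message.startswith(SIGNATURE):
        return None, message
    # Byte 12: high nibble is the version, low nibble is the command.
    version, command = message[12] >> 4, message[12] & 0x0F
    if version != 2 or command >= len(COMMANDS):
        return None, message
    # Byte 13: high nibble is the address family, low nibble is the protocol.
    family, protocol = message[13] >> 4, message[13] & 0x0F
    if family >= len(FAMILIES) or protocol >= len(PROTOCOLS):
        return None, message
    # Bytes 14..16: big-endian length of everything after the fixed 16 bytes.
    (length,) = struct.unpack("!H", message[14:16])
    if len(message) < 16 + length:
        return None, message
    info = {
        "command": COMMANDS[command],
        "family": FAMILIES[family],
        "protocol": PROTOCOLS[protocol],
    }
    if info["command"] == "proxy":
        # Source address, then destination address, then the two ports.
        addr_len = {"inet": 4, "inet6": 16}.get(info["family"])
        if addr_len is not None and length >= 2 * addr_len + 4:
            info["host"] = str(ipaddress.ip_address(message[16:16 + addr_len]))
            port_at = 16 + 2 * addr_len  # 24 for IPv4, 48 for IPv6
            (info["port"],) = struct.unpack("!H", message[port_at:port_at + 2])
    return info, message[16 + length:]
```

A header for a quick check can be built by hand, for example `SIGNATURE + bytes([0x21, 0x11]) + struct.pack("!H", 12)` followed by the packed source address, destination address, and both ports.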
Current Vector Version
0.12.1
Use-cases
Similar in motivation to https://github.com/timberio/vector/issues/6763 but likely a bit more complicated in implementation.
The motivation is roughly the same: we want to retain the source address of an event producer that ships to a load balancer. Without this, the load balancer's address ends up recorded as the source address of the event.
A loose overview of the PROXY protocol, as well as a list of software that supports it, is available from the HAProxy folks at [1]. A lot more detail can be found in the spec at [2].
I think that this is also the same protocol that is used by both AWS NLBs [3] and GCP's TCP load balancing service [4], which opens the door for some interesting out-of-the-box solutions.
Attempted Solutions
None yet, unfortunately. Though if I were to guess, this is likely going to be a bit more complicated than implementing XFF header extraction.
Apologies for linking a blog post, but [5] could also provide some interesting insights into how one could go about implementing a PROXY protocol parser, in addition to some pretty comprehensive research.
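To give a feel for how small the version 1 (human-readable) side of the problem is, here is a rough Python sketch of a v1 line parser following the format in the spec at [2]: `PROXY`, the protocol (`TCP4`, `TCP6`, or `UNKNOWN`), source and destination addresses, source and destination ports, then CRLF, all within 107 bytes. `parse_proxy_v1` is a made-up name and this is not how Vector does or would do it.

```python
import ipaddress

MAX_V1_HEADER = 107  # maximum v1 line length per the spec, CRLF included


def parse_proxy_v1(data: bytes):
    """Return (info, payload), or (None, data) when there is no valid v1 header."""
    if not data.startswith(b"PROXY "):
        return None, data
    # The terminating CRLF must fall within the first 107 bytes.
    end = data.find(b"\r\n", 0, MAX_V1_HEADER)
    if end == -1:
        return None, data
    fields = data[:end].decode("ascii", errors="replace").split(" ")
    payload = data[end + 2:]
    if len(fields) >= 2 and fields[1] == "UNKNOWN":
        # For UNKNOWN, the rest of the line is to be ignored by the receiver.
        return {"protocol": "UNKNOWN"}, payload
    if len(fields) != 6 or fields[1] not in ("TCP4", "TCP6"):
        return None, data
    try:
        src = ipaddress.ip_address(fields[2])
        dst = ipaddress.ip_address(fields[3])
        src_port, dst_port = int(fields[4]), int(fields[5])
    except ValueError:
        return None, data
    # The address family has to match the announced protocol.
    expected = 4 if fields[1] == "TCP4" else 6
    if src.version != expected or dst.version != expected:
        return None, data
    if not (0 <= src_port <= 65535 and 0 <= dst_port <= 65535):
        return None, data
    info = {
        "protocol": fields[1],
        "host": str(src),
        "port": src_port,
        "destination_host": str(dst),
        "destination_port": dst_port,
    }
    return info, payload
```

For example, `parse_proxy_v1(b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\nhello")` yields the source address `192.168.0.1`, source port `56324`, and the remaining payload `b"hello"`.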
Proposal
Add PROXY protocol support to socket and syslog connections (TCP variant only).
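One detail worth keeping in mind for a TCP source is that the PROXY header is sent once, at the very start of the connection, before any payload. A rough Python sketch of what "consume the header, then hand the rest of the stream to the normal decoder" could look like is below. `read_proxy_header` is a hypothetical helper, not a Vector API, and it assumes a buffered, blocking stream whose `read(n)` returns `n` bytes unless the peer closes the connection.

```python
import io
import struct

V2_SIGNATURE = bytes.fromhex("0D0A0D0A000D0A515549540A")
MAX_V1_HEADER = 107  # maximum v1 line length, CRLF included


def read_proxy_header(stream):
    """Consume a PROXY header from the start of a stream.

    Returns (header, leftover). header is the raw header bytes, or None if
    the stream does not start with one. leftover holds bytes that were
    already read but belong to the payload.
    """
    prefix = stream.read(12)
    if prefix == V2_SIGNATURE:
        fixed = stream.read(4)  # version/command, family/protocol, length
        if len(fixed) < 4:
            return None, prefix + fixed
        (length,) = struct.unpack("!H", fixed[2:4])
        body = stream.read(length)  # addresses, ports, and any TLVs
        return prefix + fixed + body, b""
    if prefix.startswith(b"PROXY "):
        # v1 is a single text line, so read byte by byte up to the CRLF.
        line = prefix
        while not line.endswith(b"\r\n") and len(line) < MAX_V1_HEADER:
            byte = stream.read(1)
            if not byte:
                break
            line += byte
        if line.endswith(b"\r\n"):
            return line, b""
        return None, line
    return None, prefix
```

With `io.BytesIO(b"PROXY TCP4 192.168.0.1 192.168.0.11 56324 443\r\n<13>hello")` as the stream, the helper returns the full v1 line as the header and leaves `<13>hello` unread for the regular syslog decoder.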
References
[1] HAProxy PROXY protocol thousand foot view: https://www.haproxy.com/blog/haproxy/proxy-protocol/
[2] HAProxy PROXY protocol spec: https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
[3] https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#proxy-protocol
[4] https://cloud.google.com/load-balancing/docs/tcp/setting-up-tcp#proxy-protocol
[5] https://seriousben.com/posts/2020-02-exploring-the-proxy-protocol/#parsing-version-1