grafana / alloy

OpenTelemetry Collector distribution with programmable pipelines
https://grafana.com/oss/alloy
Apache License 2.0

Alloy otelcol.receiver.otlp client failing Collection error: failed to upload metrics: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5523869 vs. 4194304) #622

Closed Ronmck closed 6 months ago

Ronmck commented 6 months ago

I am using a gRPC client to send metrics to Alloy. The client is failing with: Collection error: failed to upload metrics: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5523869 vs. 4194304)
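For reference, the two numbers in the error are byte counts: 4194304 is exactly 4 MiB, which matches the usual gRPC default receive limit, and 5523869 is the size of the rejected message. A quick check of the arithmetic, in Python purely for illustration (the variable names are made up for this sketch):

```python
# Byte counts copied from the error message above.
MESSAGE_BYTES = 5523869   # size of the metrics payload the client sent
LIMIT_BYTES = 4194304     # maximum size the receiver accepted

MIB = 1024 * 1024  # one mebibyte in bytes

limit_mib = LIMIT_BYTES / MIB
message_mib = MESSAGE_BYTES / MIB
overshoot_bytes = MESSAGE_BYTES - LIMIT_BYTES

print(f"limit:     {limit_mib:.2f} MiB")
print(f"message:   {message_mib:.2f} MiB")
print(f"overshoot: {overshoot_bytes} bytes")
```

So the payload is only a little over 1 MiB above the limit, and any configured limit comfortably larger than the payload should be enough to accept it.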

I am using Alloy with this config:

logging {
  level = "debug"
}

// max_recv_msg_size: Maximum size of messages the server will accept. 0 disables a limit.
otelcol.receiver.otlp "default" {
  grpc {
    endpoint          = "localhost:4317"
    max_recv_msg_size = 0
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
  }
}

otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.local.receiver]
}

prometheus.remote_write "local" {
  endpoint {
    url = "http://localhost:51001/api/v1/write"
  }
}

I tried to use max_recv_msg_size to set no limit by specifying max_recv_msg_size = 0 and max_recv_msg_size = "0". Neither value had any effect. Specifying anything greater than zero causes this error:

Apr 22 14:56:42 vnl00006019 systemd[1]: Failed to start Vendor-agnostic OpenTelemetry Collector distribution with programmable pipelines.
Apr 22 14:56:42 vnl00006019 systemd[1]: Unit alloy.service entered failed state.
Apr 22 14:56:42 vnl00006019 systemd[1]: alloy.service failed

When I use the otelcol-contrib agent, it works with this config:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317
        max_recv_msg_size_mib: 10000

tpaschalis commented 6 months ago

Hey @Ronmck thanks for the report! Hmm, this sounds weird, the field looks wired in properly, taking a look.

In the meantime, can you try adding a bigger value as your max_recv_msg_size?

> specifying any thing greater than zero cause this error

The value is a unit string, so for example you should define it as max_recv_msg_size = "10000MiB".

Ronmck commented 6 months ago

I have added max_recv_msg_size = "10000MiB" and this worked.

See the web debug console, which shows the argument. (Image: Alloy Config)

hainenber commented 6 months ago

@Ronmck can you check whether "0KiB" also resolves the issue? I tried it and the argument isn't shown in the Web UI. Still, I'd love to see whether the limit is disabled and whether that helps with the large message size in your case. Many thanks!

Ronmck commented 6 months ago

I tried "0KiB". This had no effect: it does not show in the Web UI, and my OTEL client errors with: error: code = ResourceExhausted desc = grpc: received message larger than max (5492725 vs. 4194304)

hainenber commented 6 months ago

Thanks @Ronmck, in that case, I believe we should remove the clause stating that "0" in any form would disable the max_recv_msg_size attribute. LMK what you think of this.

Ronmck commented 6 months ago

Yes, you should remove the clause stating that "0" will disable max_recv_msg_size, and give a sample value that shows the value is a size with a unit, like "10000MiB". If "MiB" is the only unit available, why specify it? Why not just specify a number?

hainenber commented 6 months ago

Thanks for your response!

> If "MiB" the only unit size available why specify it ? and just specify a number.

Any unit supported by alecthomas/units should be usable, though. Changing the format from a unit string to a number might introduce breaking changes, so let's keep it as is for now.
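To illustrate what a unit string such as "10000MiB" denotes, here is a minimal sketch that converts one into a byte count. It is a hypothetical helper written in Python for illustration only, not the alecthomas/units implementation, and the suffixes it accepts (KiB, MiB, GiB) are an assumption rather than the library's full list.

```python
import re

# Hypothetical suffix table (base-2 multiples only); the real library
# may accept more or different suffixes.
_SUFFIXES = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

# A whole number immediately followed by one of the suffixes above.
_PATTERN = re.compile(r"(\d+)(KiB|MiB|GiB)")


def parse_size(text: str) -> int:
    """Convert a string such as "10000MiB" into a number of bytes."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised size string: {text!r}")
    number, suffix = match.groups()
    return int(number) * _SUFFIXES[suffix]


print(parse_size("10000MiB"))  # the value reported to work in this thread
print(parse_size("4MiB"))      # same number of bytes as the limit in the error
```

In this sketch a bare "10000" with no suffix raises ValueError, and "0KiB" parses to zero bytes. How Alloy itself treats those inputs is only what the thread reports: "0KiB" did not lift the limit.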