elastic / otel-profiling-agent

The production-scale datacenter profiler (C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl, ...)
Apache License 2.0

Report invalid profiles #62

Open junotx opened 3 days ago

junotx commented 3 days ago

I tried to configure the agent to report profiles to a server that implements the ProfilesServiceServer interface, but it reports the following errors:

INFO[0000] Starting OTEL profiling agent v0.0.0 (revision main-41f251a7, build timestamp 1719993165) 
INFO[0000] Interpreter tracers: perl,php,python,hotspot,ruby,v8,dotnet 
INFO[0000] Automatically determining environment and machine ID ... 
INFO[0000] Environment: hardware, machine ID: 0xd391511e7677f6b0 
INFO[0000] Assigned ProjectID: 1 HostID: 1409997349622118064 
INFO[0000] Found offsets: task stack 0x20, pt_regs 0x3f58, tpbase 0x1528 
INFO[0000] Supports generic eBPF map batch operations   
INFO[0000] Supports LPM trie eBPF map batch operations  
INFO[0000] eBPF tracer loaded                           
INFO[0004] Attached tracer program                      
INFO[0004] Attached sched monitor                       
INFO[0004] Environment variable KUBERNETES_SERVICE_HOST not set 
ERRO[0010] Request failed: rpc error: code = Unknown desc = invalid request: invalid profile: sample value length 1 does not match sample type length 143 
ERRO[0016] Request failed: rpc error: code = Unknown desc = invalid request: invalid profile: sample value length 1 does not match sample type length 70 
ERRO[0020] Request failed: rpc error: code = Unknown desc = invalid request: invalid profile: sample value length 1 does not match sample type length 84 
ERRO[0026] Request failed: rpc error: code = Unknown desc = invalid request: invalid profile: sample value length 1 does not match sample type length 50 
ERRO[0030] Request failed: rpc error: code = Unknown desc = invalid request: invalid profile: sample value length 1 does not match sample type length 99

It looks like the reported profiles do not follow the spec: https://github.com/open-telemetry/oteps/blob/main/text/profiles/0239-profiles-data-model.md#message-sample requires that all samples have the same number of values, equal to the length of Profile.sample_type.
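To make the rule concrete, here is a minimal sketch of the kind of check a receiving server might apply. The `ValueType`, `Sample`, and `Profile` types and the `validateSampleValues` function are simplified, hypothetical stand-ins written for illustration; they are not the generated OTel protobuf types and not the actual server code. The error text is only modeled on the log lines above.

```go
package main

import "fmt"

// ValueType, Sample and Profile are simplified, hypothetical stand-ins
// for the corresponding messages in the profiles data model.
type ValueType struct {
	Type string
	Unit string
}

type Sample struct {
	Value []int64
}

type Profile struct {
	SampleType []ValueType
	Sample     []Sample
}

// validateSampleValues returns an error for the first sample whose number
// of values differs from the number of entries in SampleType.
func validateSampleValues(p Profile) error {
	want := len(p.SampleType)
	for i, s := range p.Sample {
		if got := len(s.Value); got != want {
			return fmt.Errorf("sample %d: sample value length %d does not match sample type length %d", i, got, want)
		}
	}
	return nil
}

func main() {
	// One sample type, one value per sample: passes the check.
	ok := Profile{
		SampleType: []ValueType{{Type: "samples", Unit: "count"}},
		Sample:     []Sample{{Value: []int64{1}}, {Value: []int64{3}}},
	}
	fmt.Println(validateSampleValues(ok))

	// Two sample types but only one value in the sample: fails the check.
	bad := Profile{
		SampleType: []ValueType{
			{Type: "samples", Unit: "count"},
			{Type: "cpu", Unit: "nanoseconds"},
		},
		Sample: []Sample{{Value: []int64{1}}},
	}
	fmt.Println(validateSampleValues(bad))
}
```

Under this reading, a profile with 143 entries in sample_type and a single value per sample would be rejected, which matches the shape of the errors in the log.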

florianl commented 3 days ago

Hi @junotx

Thanks for reaching out! Is the implementation you are using available somewhere? There might be some confusion caused by the log message shown. Are these logs referring to Sample.value and Profile.sample_type?

junotx commented 3 days ago

@florianl yes, the reference and returned error in the server code is here

brancz commented 2 days ago

@junotx the otel agent and Parca aren't compatible with each other yet. There are a few things to work out about how the agent and Parca are supposed to interpret the data. I'll start a separate thread about this specific case.

brancz commented 2 days ago

Opened a thread here: https://github.com/elastic/otel-profiling-agent/issues/63