apache / apisix

The Cloud-Native API Gateway
https://apisix.apache.org/blog/
Apache License 2.0

bug: opentelemetry `trace_id_ratio` doesn't work #11752

Open zhendongcmss opened 1 week ago

zhendongcmss commented 1 week ago

Current Behavior

Set the upstream:

curl http://127.0.0.1:9080/apisix/admin/routes/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
    "uri": "/*",
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "192.168.8.109:8200": 10
        }
    }
}'

Configure the request-id and opentelemetry plugins as a global rule:

curl http://127.0.0.1:9080/apisix/admin/global_rules/1 -H 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' -X PUT -i -d '
{
    "plugins": {
        "request-id": {
            "algorithm": "range_id",
            "header_name": "X-Request-Id",
            "include_in_response": true
        },
        "opentelemetry": {
            "sampler": {
                "name": "trace_id_ratio",
                "options": {
                    "fraction": 0.5
                }
            }
        }
    }
}'

Send the following request with curl, 100 times:

curl http://127.0.0.1:9080/vv
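The repeated request can be scripted. This is a minimal sketch rather than the exact command used in the report; `send_requests` is a hypothetical helper name, and it prints one HTTP status code per request so the run can be tallied.

```shell
# send_requests URL [COUNT]: issue COUNT GET requests (default 100)
# and print one HTTP status code per line.
send_requests() {
    url="$1"
    count="${2:-100}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # -s hides progress output, -o discards the body,
        # -w prints only the status code followed by a newline.
        curl -s -o /dev/null -w '%{http_code}\n' "$url"
        i=$((i + 1))
    done
}

# Usage, against the route configured above:
#   send_requests http://127.0.0.1:9080/vv 100 | sort | uniq -c
```

The number of traces that actually arrive still has to be counted on the collector side; the status codes only confirm that the requests went through the gateway.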

Expected Behavior

With `fraction` set to 0.5, the plugin should send a trace for roughly 50% of the requests. In reality, no trace was sent at all. I also noticed that the test cases do not cover this sampler.

Error Logs

No response

Steps to Reproduce

Same as described under Current Behavior.

Environment

wklken commented 1 week ago

You can remove the request-id plugin and then try again.

The default trace_id_source of the opentelemetry plugin is x-request-id, so if the x-request-id value is not a valid trace ID, the trace will not be reported.

the doc: https://apisix.apache.org/docs/apisix/plugins/opentelemetry/#configuring-the-collector
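To check a header value before sending it, a small validator helps. This sketch assumes the plugin applies the W3C Trace Context rule for trace IDs (exactly 32 lowercase hex characters, not all zeros); `is_valid_trace_id` is a hypothetical helper, not part of APISIX.

```shell
# is_valid_trace_id ID: exit 0 if ID looks like a W3C trace ID, 1 otherwise.
is_valid_trace_id() {
    id="$1"
    # Must be exactly 32 characters long.
    [ "${#id}" -eq 32 ] || return 1
    # Must contain only lowercase hex digits.
    case "$id" in
        *[!0-9a-f]*) return 1 ;;
    esac
    # The all-zero ID is reserved as invalid.
    [ "$id" != "00000000000000000000000000000000" ] || return 1
    return 0
}

# Usage:
#   is_valid_trace_id "$some_request_id" && echo "usable as trace ID"
```

A value that fails this check (for example one containing dashes or uppercase letters) would be a candidate explanation for traces not being reported when trace_id_source is x-request-id.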

zhendongcmss commented 1 week ago

I removed the request-id plugin and then tried curl http://127.0.0.1:9080/123 -H "X-Request-Id: 1272b56c15a7866668b943071c176805". trace_id_ratio still doesn't work: APISIX always sends the trace.

If I curl without the X-Request-Id header, trace_id_ratio works.
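One way to tell whether the always-sent result is tied to the fixed ID is to send a different valid ID on every request. A sketch, assuming `od`, `tr`, and `/dev/urandom` are available; `random_trace_id` is a hypothetical helper.

```shell
# random_trace_id: print 16 random bytes as 32 lowercase hex characters.
random_trace_id() {
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n'
}

# Usage, sending a fresh ID with each request:
#   i=0
#   while [ "$i" -lt 100 ]; do
#       curl -s -o /dev/null http://127.0.0.1:9080/123 \
#            -H "X-Request-Id: $(random_trace_id)"
#       i=$((i + 1))
#   done
```

If roughly half of these requests produce a trace, the sampler is working and the earlier result comes from reusing a single ID.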

wklken commented 1 week ago

There is a sampled flag in the trace ID, and the opentelemetry plugin respects that flag.

You may want to check the source code here: https://github.com/yangxikun/opentelemetry-lua/blob/main/lib/opentelemetry/trace/sampling/trace_id_ratio_sampler.lua#L34
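For context, ratio samplers of this kind usually derive the decision from the trace ID itself, so the same ID always gets the same answer. The sketch below illustrates that general idea only; it is not the opentelemetry-lua implementation. The choice of the first 32 bits and the helper name `should_sample` are assumptions.

```shell
# should_sample TRACE_ID PERCENT: exit 0 if the ID is sampled, 1 otherwise.
# Assumption: the first 8 hex characters (32 bits) are compared with a
# threshold of PERCENT% of 2^32. Real samplers may use other bytes.
should_sample() {
    prefix=$(printf '%.8s' "$1")          # first 8 hex characters
    value=$((0x$prefix))                  # interpret them as an integer
    threshold=$((4294967296 * $2 / 100))  # PERCENT% of 2^32
    [ "$value" -lt "$threshold" ]
}

# Usage:
#   should_sample 1272b56c15a7866668b943071c176805 50 && echo sampled
```

In this sketch the ID from the report is sampled on every call at 50%, because 0x1272b56c is below half of 2^32. A deterministic decision like this would be consistent with a fixed X-Request-Id always producing a trace, though the linked source is the authority on what the library actually does.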