Open zffocussss opened 2 years ago
Please see https://github.com/Kong/kong/issues/9301. Does your code panic?
I do not know what caused it to terminate.
Hi @zffocussss, does it consistently terminate after some time, or is it due to specific traffic? Logging isn't practical through a Kong plugin, maybe I should send logs to opentelemetry for the plugin :)
I think the only fatal is the one here: https://github.com/Kong/go-pdk/blob/5775452da4c69d9fc868afb04d3487049675592a/server/pbserver.go#L201
Unfortunately I don't see an obvious way of overriding this log call with something like apmlogrus, which would help us retrieve this panic inside the APM.
Maybe we could log the output of the plugin to a file in /tmp and retrieve it to investigate the trace?
Worst case we make a fork and add this instrumentation to help your debugging...
Yes, it was terminated after some time, but I do not know why.
One hour ago I tried Kong 3.0 with this Go plugin, and the worst thing happened: the traceparent header was missing from the outgoing request every time. If you have time, I suggest you try it as well.
It is running inside Kong 2.8.1.
Kong 3.0 has a plugin that supports OpenTelemetry (https://docs.konghq.com/hub/kong-inc/opentelemetry/). Can we use it directly? Is it a better way for Kong to connect to the Elastic APM server? I am a newbie in opentracing/opentelemetry/APM, so please correct me if I have any of the concepts wrong.
https://github.com/matthyx/kong-elastic-apm/blob/c2f0552a9709b01eb3952be9010c1946ecf5be0b/main.go#L392-L395 Why do you flush and close the tracer after starting the server?
Yes we could use it directly... I don't know if they publish the source code of the plugin somewhere?
I will try it, thanks for the heads up!
why do you flush and close the tracer after start server?
server.StartServer is blocking... until the server shuts down. If it's a "normal" exit I try to flush events to APM, but I should probably move that to a defer.
The source code is at https://github.com/Kong/kong/blob/master/kong/plugins/opentelemetry/handler.lua
Can you describe your use case? I have tried forcing kong image to version 3.0 and all seems to work... I will try to configure the lua plugin to allow comparing outputs between both plugins.
My use case is sending trace info to the Elastic APM server in order to implement distributed tracing in a k8s cluster:
internet -> L4 loadBalancer -> kong app gateway -> application service.
Maybe it is caused by my k8s cluster.
@zffocussss btw I still plan to come back to you... I need a bit of time to bump versions in docker-compose.yml
Hi, all ✋
I tried the otel plugin (https://docs.konghq.com/hub/kong-inc/opentelemetry/) with ELK stack 8.5, and it works.
Here is a sample commit that may help: https://github.com/Tai-ch0802/docker-elk-for-kong/commit/318af9695e7cb77b242ababebe8129ac04d55e55
docker-compose up --build
solution: kong gateway -> otel plugin -> otel collector -> apm server -> elk stack(8.5)
** BTW, apm-server does not support otel in 7.6
ref:
Hi guys, I'm facing the same problem.
Kong: 2.8.1 (declarative config)
APM: 7.15.2
Situation 1: I specify the URL of my service in kong.yml as http://host.docker.internal:8001 and start my FastAPI service in PyCharm on the host system. Result: the plugin works perfectly.
Situation 2: I specify the URL of my service in kong.yml as http://my_server:80 and add a dockerized FastAPI service to the compose file. Result: the plugin handles the first request, then crashes with the same error the topic starter showed.
Also, I noticed that even though the traceparent header appears in the proxied request, I don't see elastic-apm in Kibana...
UPD. I tried to test it with kong 3.0.1 and ELK 7.17.7. Output is a little bit different, probably it could help you with debugging:
2022/12/04 14:36:34 [warn] 1119#0: *599 [kong] pb_rpc.lua:394 [elastic-apm] closed, context: ngx.timer, client: 172.19.0.1, server: 0.0.0.0:8000
2022/12/04 14:36:34 [notice] 1116#0: signal 17 (SIGCHLD) received from 1120
2022/12/04 14:36:34 [error] 1119#0: *599 connect() to unix:/usr/local/kong/elastic-apm.socket failed (111: Connection refused), context: ngx.timer, client: 172.19.0.1, server: 0.0.0.0:8000
2022/12/04 14:36:34 [notice] 1116#0: *317 [kong] process.lua:232 external pluginserver 'elastic-apm' terminated: exit 2, context: ngx.timer
2022/12/04 14:36:34 [error] 1119#0: *599 lua entry thread aborted: runtime error: ...cal/share/lua/5.1/kong/runloop/plugin_servers/pb_rpc.lua:301: connection refused
stack traceback:
coroutine 0:
    [C]: in function 'assert'
    ...cal/share/lua/5.1/kong/runloop/plugin_servers/pb_rpc.lua:301: in function 'call'
    ...cal/share/lua/5.1/kong/runloop/plugin_servers/pb_rpc.lua:358: in function 'call_start_instance'
    ...local/share/lua/5.1/kong/runloop/plugin_servers/init.lua:185: in function 'get_instance_id'
    ...cal/share/lua/5.1/kong/runloop/plugin_servers/pb_rpc.lua:385: in function 'handle_event'
    ...local/share/lua/5.1/kong/runloop/plugin_servers/init.lua:252: in function <...local/share/lua/5.1/kong/runloop/plugin_servers/init.lua:245>, context: ngx.timer, client: 172.19.0.1, server: 0.0.0.0:8000
2022/12/04 14:36:34 [notice] 1116#0: *317 [kong] process.lua:216 Starting elastic-apm, context: ngx.timer
@Dogrtt and @zffocussss sorry I've been neglecting this issue... @Dogrtt can you share the reproducer in a docker-compose file? Let me see if I can reproduce locally and propose a patch, thanks for your patience!
Hi @matthyx, first of all, thank you for your work. Here is the link to my repo in which I tried to reproduce the issue. If you run into any trouble starting it, please write here. https://github.com/Dogrtt/kong_elastic_apm_test
Really? Is that setup (otel plugin -> otel collector -> apm server) stable?
@zffocussss APM 8+ requires the security feature to be enabled, so you can't just put your ELK behind Nginx's basic auth; that's why I tried it with ELK+APM 7.17.7. In the APM docs related to OTel, you can find that APM has supported OTLP requests out of the box since 7.13, without any collector service. I tried Kong 3.0.1's opentelemetry plugin with direct pushes to http://apm_server:8200. My proxied services started receiving "traceparent" headers, but there was no kong-gateway entry point in APM. Traces weren't available, only agent reports from the Python and C# services. Then I added a collector service and it started working. My only concern is that traces don't recognize URL templates: instead of a single trace for http://my_service:80/api/files/{file_id}, it shows each of them with the real id, such as http://my_service:80/api/files/123, http://my_service:80/api/files/5532, etc.
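To illustrate the URL-template concern, here is a small sketch of how concrete paths could be collapsed into one template before being used as a trace name. It is purely an illustration of the idea, not a feature of the Kong plugin or of APM that is known to exist; templatePath and the "{id}" placeholder are hypothetical, and only purely numeric path segments are handled.

```go
package main

import (
	"fmt"
	"strings"
)

// isNumeric reports whether s is non-empty and consists only of ASCII digits.
func isNumeric(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// templatePath replaces purely numeric path segments with "{id}", so that
// /api/files/123 and /api/files/5532 map to the same name.
// The function name and placeholder are hypothetical.
func templatePath(path string) string {
	segs := strings.Split(path, "/")
	for i, s := range segs {
		if isNumeric(s) {
			segs[i] = "{id}"
		}
	}
	return strings.Join(segs, "/")
}

func main() {
	for _, p := range []string{"/api/files/123", "/api/files/5532", "/api/files"} {
		fmt.Println(p, "->", templatePath(p))
	}
}
```

A rule this simple would miss non-numeric identifiers such as UUIDs, so a real deployment would probably need route-aware naming instead.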
That's too bad. Does it have an impact on the tracing data?
@Dogrtt @zffocussss thanks for your patience, I have found the issue: https://github.com/matthyx/kong-elastic-apm/commit/8e28e66d84b15d064e754023458f174cdd117836
Can you try again with the latest code?
OK, let me give it a try.
Why do you use the otel collector? You can send metrics to the APM server directly.
Hello. In fact, you do not need to load elastic-apm any more, as you already have the opentelemetry plugin working.
Cool! How to do that?
I just followed the doc and set otel-collector-config.yml like this:
receivers:
  otlp:
    protocols:
      # grpc:
      http:
processors:
  batch:
exporters:
  logging:
    loglevel: debug
  otlp/elastic:
    endpoint: {your_elastic_apm_endpoint}
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [logging, otlp/elastic]
    metrics:
      receivers: [otlp]
      exporters: [logging, otlp/elastic]
    logs:
      receivers: [otlp]
      exporters: [logging, otlp/elastic]
And I believe elastic-apm is required if I need trace logs.
You do not need a collector to forward your trace data, as APM can work with the otel plugin directly. Just set the endpoint to the APM server address (https://{apm}/v1/traces) and add an Authorization header if authentication is required.
Would you share your otel plugin config? I can't get it to work with the endpoint https://{apm}/v1/traces.
Here's what's in the Kong logs: [error] 2303#0: *1673 [lua] handler.lua:102: process(): [otel] response error: 404, body: {"error":"404 page not found"}
I think I have the plugin set up correctly, but I don't see any traces being sent to APM. My plugin config is below.
apiVersion: configuration.konghq.com/v1
config:
  endpoint: http://apm-server-apm-http.elasticsearch.svc.cluster.local:8200/v1/traces
  headers:
    Authorization: Bearer Secret Token
kind: KongClusterPlugin
metadata:
  annotations:
    kubernetes.io/ingress.class: kong
  labels:
    global: 'true'
  name: kong-global-opentelemetry
plugin: opentelemetry
I can see the plugin registered in the Kong console and that looks good. I see some logs like the ones below, indicating that spans are being traced within Kong. Unfortunately, I see nothing being sent to APM. Any ideas?
2023/05/18 19:15:27 [debug] 2311#0: *265149 [lua] handler.lua:162: [otel] total spans in current request: 1
2023/05/18 19:15:27 [debug] 2311#0: *265149 [lua] instrumentation.lua:332: runloop_log_after(): [tracing] collected 1 spans:
Span #1 name=root attributes={"http.status_code":200}
I actually got this to work. The URL should not have /v1/traces at the end for APM. That being said, I see the traces being accepted by APM, but I do not see them in Kibana. We use Java agents as well and I do see those services in APM, but nothing for Kong.
Hi mat, recently I found that elastic-apm is not stable: it gets terminated after some hours. Any ideas?
Logs are below.