3scale / APIcast

3scale API Gateway
Apache License 2.0

OpenTracing: make including the @out_of_band_authrep_action metric optional #885

Open gnunn1 opened 6 years ago

gnunn1 commented 6 years ago


When using OpenTracing with APIcast, the trace includes a metric for @out_of_band_authrep_action. I'm not sure exactly what this is; maybe the asynchronous communication of metrics and response codes? However, if it is asynchronous, as the name implies, is it useful to include it, given that it skews the times shown in Jaeger? While you can certainly drill into the request to see how the time breaks down, I suspect it would be more useful to see only the synchronous time.

I'm wondering whether it would be useful to make this optional via a flag in the Jaeger ConfigMap that is mounted into the APIcast deployment.

Version

nginx version: openresty/1.13.6.2
built by gcc 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
built with OpenSSL 1.1.0h 27 Mar 2018
TLS SNI support enabled
configure arguments: --prefix=/usr/local/openresty/nginx --with-cc-opt='-O2 -DNGX_LUA_ABORT_AT_PANIC -I/usr/local/openresty/zlib/include -I/usr/local/openresty/pcre/include -I/usr/local/openresty/openssl/include' --add-module=../ngx_devel_kit-0.3.0 --add-module=../echo-nginx-module-0.61 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2rc3 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.31 --add-module=../ngx_lua-0.10.13 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../ngx_stream_lua-0.0.5 --with-ld-opt='-Wl,-rpath,/usr/local/openresty/luajit/lib -L/usr/local/openresty/zlib/lib -L/usr/local/openresty/pcre/lib -L/usr/local/openresty/openssl/lib -Wl,-rpath,/usr/local/openresty/zlib/lib:/usr/local/openresty/pcre/lib:/usr/local/openresty/openssl/lib' --with-pcre-jit --with-stream --with-stream_ssl_module --with-stream_ssl_preread_module --with-http_v2_module --without-mail_pop3_module --without-mail_imap_module --without-mail_smtp_module --with-http_stub_status_module --with-http_realip_module --with-http_addition_module --with-http_auth_request_module --with-http_secure_link_module --with-http_random_index_module --with-http_gzip_static_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-threads --with-dtrace-probes --with-stream --with-stream_ssl_module --with-http_ssl_module

Steps To Reproduce

Follow the steps in this blog article: https://itnext.io/adding-opentracing-support-to-apicast-api-gateway-a8e0a38347d2

Current Result

Shows @out_of_band_authrep_action metric

Expected Result

Optionally exclude @out_of_band_authrep_action metric

mikz commented 6 years ago

@out_of_band_authrep_action reports metrics and usage to the 3scale backend. It is almost asynchronous: the next request on the same TCP connection is not accepted until it has completed.

We believe it is useful to have, and it would not be easy for us to make it configurable right now.

Maybe @jmprusi has a different take on this.

jmprusi commented 6 years ago

I agree with @gnunn1 that this can skew the perception of "REAL" latency, so perhaps we can make it optional.

We should test whether setting opentracing off in that location works (is the @out_of_band location templated by Liquid?) and control that via an environment variable, OPENTRACING_DISABLE_OUTOFBAND.

mikz commented 6 years ago

I'd rather devise a real fix: close the request span before @out_of_band_authrep_action, so it is shown as asynchronous in the UI.

I think OpenTracing is something you use in debugging situations, not all the time, at least not to the degree of tracing every little interaction. In that situation, I think you want to know how long it took to report to 3scale, even if that makes it harder to see the upstream response time or how long the whole request took. We can name it better, so it is more obvious that it is an asynchronous action that reports data to 3scale. If you are interested in how long the requests took, there are logs, and we will possibly expose that as Prometheus metrics. OpenTracing is a debugging tool and should provide debugging information.
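
To illustrate the span-ordering idea above (closing the request span before @out_of_band_authrep_action runs), here is a toy model in plain Python. It does not use APIcast or any OpenTracing library; the `Span` class, the `request_span` helper, and the timings are all hypothetical and only show how the reported request duration would change.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """A minimal stand-in for a tracing span (hypothetical, not a real API)."""
    name: str
    start_ms: float
    end_ms: float

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def request_span(upstream: Span, report: Span, close_before_report: bool) -> Span:
    """Build the top-level request span.

    If close_before_report is True, the request span ends when the upstream
    response is done, so the report appears as a separate follow-up span.
    Otherwise the request span stretches to cover the report as well.
    """
    if close_before_report:
        end = upstream.end_ms
    else:
        end = max(upstream.end_ms, report.end_ms)
    return Span("request", upstream.start_ms, end)


# Made-up timings, chosen only for illustration.
upstream = Span("upstream", 0.0, 40.0)
report = Span("@out_of_band_authrep_action", 40.0, 65.0)

current = request_span(upstream, report, close_before_report=False)
proposed = request_span(upstream, report, close_before_report=True)

print(current.duration_ms)   # 65.0 (report time is counted in the request)
print(proposed.duration_ms)  # 40.0 (report time is still visible in its own span)
```

In this model the reporting span is kept in the trace either way; only the end of the request span moves, which is what would let a tracing UI show the report as happening after the request.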