Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.
I have observed that on the Hystrix dashboard, the p90 and p99 latencies shown for a command are always lower than the timeout configured for that Hystrix command, even though the external service itself has a much higher p90 and p99.
My understanding is that Hystrix uses two threads: one for the timeout/fallback and another for executing the service call. My observation above suggests that whenever there is a timeout, the Hystrix timeout/fallback thread emits a timeout event, which is also used for the latency metrics, but the thread that is still executing the call never sends an event to the stream. If so, the recorded latency would be capped at the timeout and the reported percentiles would be wrong. Is this true?
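To make the suspected mechanism concrete, here is a minimal plain-JDK sketch. It does not use Hystrix and makes no claim about how Hystrix is implemented internally; the class `TimeoutLatencySketch` and the method `measure` are hypothetical names I made up for illustration. It only shows that if latency is recorded by the side that enforces the timeout, the recorded value stays close to the timeout while the worker thread takes much longer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only: plain JDK, not Hystrix internals.
class TimeoutLatencySketch {

    // Milliseconds elapsed since the given System.nanoTime() reading.
    static long elapsedMillis(long startNanos) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }

    // Returns { latency seen by the caller that enforces the timeout,
    //           latency of the work as measured on the worker thread }.
    static long[] measure(long workMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            final long start = System.nanoTime();

            // The worker thread stands in for the slow external call.
            Future<Long> work = pool.submit(() -> {
                Thread.sleep(workMillis);
                return elapsedMillis(start);
            });

            try {
                work.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The caller gives up here; the worker keeps running.
            }
            // This is the value a timeout-side metric would record.
            long callerObserved = elapsedMillis(start);

            // Wait for the worker only so the two numbers can be compared.
            long actual = work.get();
            return new long[] {callerObserved, actual};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] result = measure(300, 100);
        System.out.println("caller-observed ms: " + result[0]);
        System.out.println("worker-measured ms: " + result[1]);
    }
}
```

With a 300 ms call and a 100 ms timeout, the caller-observed figure should land near the timeout and the worker-measured figure near the real duration. If the dashboard percentiles are built only from the first kind of measurement, that would match what I am seeing.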