implement nginx's $request_time logging parameter

0x777 commented 4 years ago

Currently, we only log the execution time for a given request. It would be nice to measure the time from 'the arrival of first byte of the request' to the 'last byte sent as response'. This can help people debug network issues easily.

From nginx's docs:

$request_time request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

jberryman commented 4 years ago

This makes sense to me at a high level. Could you summarize the network issue and how you ended up debugging it?

0x777 commented 4 years ago

The network issue was that the clients connected to graphql-engine (deployed on k8s with istio) were noticing that the requests take a long time to finish(1-2 seconds). However in the logs of graphql-engine, the execution times were in the order of few milliseconds. The network connection between graphql-engine and the client was the issue. If $request_time was also present in the log, it would've been easier to pin point the issue to the network layer.

0x777 commented 4 years ago

how you ended up debugging it?

By adding logging on the clients and tracing those requests in graphql-engine (through a session variable x-hasura-trace-id).

lexi-lambda commented 4 years ago

@0x777 Why is this reopened?

0x777 commented 4 years ago

@lexi-lambda Because @jberryman added the time taken to read the request body, but request_time as described in the issue is not implemented. Looks like this may not be possible with spock currently.

lexi-lambda commented 4 years ago

Got it, thank you! Maybe we need to add it at the WAI level, then?

Naveenaidu commented 3 years ago

@0x777 IIUC request_time is defined as follows:

request_time = time taken to read the request body +
                         time taken to execute that request and send the response back to client

While I was working on Server Metrics issue - I noticed that in our mkSpockAction function, we have ioWaitTime as well as serviceTime which can be used for calculating the request_time mentioned in this issue.

Referencing from the code:

ioWaitTime is retrieved at this line

(ioWaitTime, reqBody) <- withElapsedTime $ liftIO $ Wai.strictRequestBody req

serviceTime is retrieved at this line


(serviceTime, (result, q)) <- withElapsedTime $ case apiHandler of
    AHGet handler -> do
      res <- lift $ runHandler handler
      return (res, Nothing)
    AHPost handler -> do
      parsedReqE <- runExceptT $ parseBody reqBody
      parsedReq  <- onLeft parsedReqE (logErrorAndResp (Just userInfo) requestId req (reqBody, Nothing) includeInternal headers . qErrModifier)
      res <- lift $ runHandler $ handler parsedReq
      return (res, Just parsedReq)

* `withElapsedTime` is defined [here](https://github.com/hasura/graphql-engine-mono/blob/58b0033be2ac9c0c757319a1a5f735e53f08307b/server/src-lib/Hasura/Prelude.hs#L174-#L180)
```haskell
-- | Time an IO action, returning the time with microsecond precision. The
-- result of the input action will be evaluated to WHNF.
--
-- The result 'DiffTime' is guarenteed to be >= 0.
withElapsedTime :: MonadIO m => m a -> m (DiffTime, a)

If I understand all of the above correctly, will the following be a proper definition for $request parameter:

request = time taken to read the request body (ioWaitTime) + 
          time taken to execute that request and send the response back (serviceTime)

where

serviceTime = Service time, not including request IO wait time.
iotWaitTime = Request IO wait time, i.e. time spent reading the full request from the socket.

If the definition of request is correct as mentioned above, does this mean that this issue can now be progressed and we can finally log the $request_time logging parameter?

andrewloux commented 2 years ago

@0x777 Just wanted to check in on this - I'm just an invested user, wondering if there is anything I could do to contribute to push this along?

It seems like the data is there, just not exposed in the logging output at this time

hasura / graphql-engine

implement nginx's $request_time logging parameter #3547