NattapongSiri / covid_cb

Slow response time #2

Closed NattapongSiri closed 4 years ago

NattapongSiri commented 4 years ago

The bot may take about 40 seconds to respond. After investigation, this happens because our service relies on several components running on different platforms in different locations.

Analysis

After a fresh deploy of the Express client on Heroku

The first message had approximately the following response times:

  1. create-wa-session
    1. 976ms queue time
    2. 0.22ms stalled time
    3. 0.13ms request sent
    4. 2770ms waiting for response
    5. 1.13ms download content from response
  2. message_wlt_wa
    1. 303ms queue time
    2. 0.21ms stalled time
    3. 0.12ms request sent
    4. 8120ms waiting for response. Timing report on cloud functions side can be broken down into:
      1. 7690ms on whole process which can be divided into:
        1. 1870ms for input-translate function
        2. 1600ms for send-message function
        3. 1770ms for output-translate function
    5. 0.84ms download content from response

Immediate second message

For comparison, the immediate second message does not need to create another WA session:

  1. message_wlt_wa
    1. 303ms queue time
    2. 0.29ms stalled time
    3. 0.26ms request sent
    4. 2920ms waiting for response. Timing report on cloud functions side can be broken down into:
      1. 2420ms on whole process which can be divided into:
        1. 824ms for input-translate function
        2. 815ms for send-message function
        3. 581ms for output-translate function
    5. 1.15ms download content from response
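
To make the comparison easier to read, the figures above can be split into time spent inside the three functions, time on the Cloud Functions side not covered by any of them, and time outside the cloud process. This is only a sketch: the `Timing` shape and the `breakdown` helper are made up for illustration, and reading the leftover as spin-up and communication overhead is an assumption rather than something measured directly.

```typescript
// Timings in milliseconds, copied from the measurements above.
interface Timing {
  label: string;
  waiting: number;       // client-side "waiting for response"
  wholeProcess: number;  // whole process on the Cloud Functions side
  functions: number[];   // input-translate, send-message, output-translate
}

const timings: Timing[] = [
  { label: "first message", waiting: 8120, wholeProcess: 7690, functions: [1870, 1600, 1770] },
  { label: "second message", waiting: 2920, wholeProcess: 2420, functions: [824, 815, 581] },
];

// Hypothetical helper: splits one measurement into three buckets.
function breakdown(t: Timing) {
  const insideFunctions = t.functions.reduce((sum, ms) => sum + ms, 0);
  return {
    label: t.label,
    insideFunctions,                                      // sum of the three functions
    betweenFunctions: t.wholeProcess - insideFunctions,   // not covered by any function
    outsideCloud: t.waiting - t.wholeProcess,             // network and client side
  };
}

const report = timings.map(breakdown);
for (const row of report) {
  console.log(row);
}
```

With these numbers, the first message has 2450ms on the cloud side that no single function accounts for, while the second has only 200ms. That gap is what the fix below tries to remove.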

Potential fix

Merge the 3 translate-message sub-processes into one

Merge input-translate, send-message, and output-translate into a single function.

Pros

This should reduce the communication overhead between functions and the spin-up time of each function.

Cons

Possibly harder to maintain?
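
A rough sketch of what the merged function could look like is shown below. The three step functions here are hypothetical in-memory stand-ins (the real ones call external services and their signatures are not shown in this issue), and `translateMessage` is an assumed name. The point is only that the steps run in one process, so a single function spins up and no HTTP hop is needed between steps.

```typescript
// Hypothetical stand-ins for the three existing functions.
type Step = (text: string) => Promise<string>;

const inputTranslate: Step = async (text) => `[in] ${text}`;
const sendMessage: Step = async (text) => `reply to ${text}`;
const outputTranslate: Step = async (text) => `[out] ${text}`;

// Single entry point (assumed name): runs the three steps in order,
// in-process, instead of invoking three separate cloud functions.
async function translateMessage(text: string): Promise<string> {
  const translatedInput = await inputTranslate(text);
  const reply = await sendMessage(translatedInput);
  return outputTranslate(reply);
}

translateMessage("hello").then((result) => console.log(result));
```

Keeping each step as its own function inside the merged module may soften the maintenance concern, since the steps can still be tested separately.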

NattapongSiri commented 4 years ago

Fixed by implementing a native single gateway, deployed using the built-in Docker support so it has a low startup time. The first spin-up time of wlt_translate is now down from 8120ms to 4310ms, and on the Cloud Functions side it is down from 7690ms to 3560ms.

NattapongSiri commented 4 years ago

We can shave the response time down further by removing the create WA session task, since the unified gateway will create a session if one is missing anyway. To achieve this, we need to modify the gateway to return the session_id along with the message itself. After that, we modify the client to read the session_id and update it on each message send/receive.
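
The idea can be sketched roughly as follows. Everything here is an assumption for illustration: the request and response shapes, the `gateway` function (an in-memory stand-in for the real gateway, which is reached over the network), and the `Client` class. Only the `session_id` field name comes from the description above.

```typescript
// Hypothetical request/response shapes; the real format is not shown here.
interface GatewayRequest { text: string; session_id: string | undefined; }
interface GatewayResponse { text: string; session_id: string; }

// In-memory stand-in for the unified gateway: it creates a session when the
// request carries none (or an unknown one) and always returns the id.
const knownSessions = new Set<string>();
let nextSessionNumber = 1;

function gateway(req: GatewayRequest): GatewayResponse {
  let id = req.session_id;
  if (id === undefined || !knownSessions.has(id)) {
    id = `session-${nextSessionNumber++}`;
    knownSessions.add(id);
  }
  return { text: `echo: ${req.text}`, session_id: id };
}

// Client side: no separate create-session call; it remembers the latest
// session_id and sends it along with each message.
class Client {
  private sessionId: string | undefined = undefined;

  send(text: string): string {
    const res = gateway({ text, session_id: this.sessionId });
    this.sessionId = res.session_id; // update on every response
    return res.text;
  }

  get currentSession(): string | undefined {
    return this.sessionId;
  }
}

const client = new Client();
console.log(client.send("first"), client.currentSession);
console.log(client.send("second"), client.currentSession);
```

Because the client overwrites its stored id on every response, it also picks up a new session transparently if the gateway had to replace an expired one.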

NattapongSiri commented 4 years ago

Implemented in 1608f68

This reduces the first response time to a range of <1s to 10s, depending on the IBM Cloud workload at the moment.