Speeding Up Requests

Investigation shows that the vast majority of Time To First Byte (TTFB) comes from (1) the time to send the HTTP request and (2) model inference time. Both grow with request size. Here are two examples:

30MB file request; TTFB = 108s
- Request send time: 76s
- Upload to GCS: 1s
- Loading model: 7s
- Model inference: 24s
- Upload results to GCS: <1s

6MB file request; TTFB = 28s
- Request send time: 15.5s
- Upload to GCS: <1s
- Loading model: <1s
- Model inference: 11s
- Upload results to GCS: <1s
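To make the "vast majority" claim concrete, here is a small sketch that computes what fraction of TTFB is spent on request send plus inference in each example. The timings are copied from the measurements above; the dictionary layout and the function name `dominant_share` are only illustrative.

```python
# Timings in seconds, copied from the two measured examples.
examples = {
    "30MB": {"ttfb": 108.0, "send": 76.0, "inference": 24.0},
    "6MB": {"ttfb": 28.0, "send": 15.5, "inference": 11.0},
}


def dominant_share(timing):
    """Fraction of TTFB spent sending the request plus running inference."""
    return (timing["send"] + timing["inference"]) / timing["ttfb"]


shares = {name: dominant_share(t) for name, t in examples.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%} of TTFB is request send + inference")
# 30MB: 93% of TTFB is request send + inference
# 6MB: 95% of TTFB is request send + inference
```

In both cases the two steps account for more than 90% of TTFB, so they are the ones worth optimizing first.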

To speed up individual requests, we can potentially resize the files on the client side before upload. First we need to confirm how far image size can be reduced before model performance starts to degrade.
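As a rough way to gauge the potential gain, the sketch below estimates request send time for a smaller file from the two measured requests. It assumes send time is proportional to file size, which the two data points suggest but do not prove. The 3 MB target size is a hypothetical example, not a tested value, and the function names are illustrative.

```python
# (file size in MB, request send time in seconds) from the two measured examples.
measurements = [(30.0, 76.0), (6.0, 15.5)]


def seconds_per_mb(samples):
    """Average send cost per MB across the measured requests."""
    return sum(seconds / size_mb for size_mb, seconds in samples) / len(samples)


def estimated_send_time(size_mb, samples=measurements):
    """Linear estimate; assumes send time is proportional to file size."""
    return size_mb * seconds_per_mb(samples)


rate = seconds_per_mb(measurements)
print(f"Measured send cost: {rate:.2f} s/MB")

# Hypothetical target: a 6MB image resized to 3MB before upload.
print(f"Estimated send time at 3MB: {estimated_send_time(3.0):.1f} s")
```

This covers only the send step. Inference time and model accuracy at the smaller size would still have to be measured separately.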

To speed up multiple-image requests, we should process the images as separate requests instead of as one large request. If those requests are handled in parallel, that could speed up a 10-image request by almost 10x.
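A minimal sketch of the one-request-per-image idea, written in Python for illustration (the real client may be written in another language). `send_image` is a hypothetical stand-in that sleeps instead of uploading, and the file names and worker count are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def send_image(name, delay=0.05):
    """Hypothetical stand-in for one single-image request; sleeps instead of uploading."""
    time.sleep(delay)
    return f"result for {name}"


def send_all(names, max_workers=10):
    """Issue one request per image concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send_image, names))


images = [f"image_{i}.png" for i in range(10)]
start = time.perf_counter()
results = send_all(images)
elapsed = time.perf_counter() - start
print(f"{len(results)} results in {elapsed:.2f}s")
```

In this simulation the ten requests overlap, so the total is close to the time of a single request. With real uploads the gain depends on whether the client's upload bandwidth and the server's capacity allow the requests to proceed in parallel.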

Side note: It's not clear why the time to load the model changes with file size (needs investigation).

thebaumlaboratory / PlasmoCount

Speeding Up Requests #5