pentium3 / sys_reading

system paper reading notes

Shenango: Achieving High CPU Efficiency for Latency-sensitive Datacenter Workloads #195

Closed pentium3 closed 2 years ago

pentium3 commented 2 years ago

https://www.usenix.org/conference/nsdi19/presentation/ousterhout

pentium3 commented 2 years ago

motivating use case:

Latency-sensitive applications (e.g., memcached) see their load vary over time. Peak load requires significantly more cores than average load, so if cores are provisioned for the peak, many of them are wasted when the load is low.

To improve CPU utilization, datacenters multiplex: they run multiple applications on the same server.



motivating experiment:

Run memcached (1 us to respond to a client) and a batch processing job (which uses any CPU cycles not used by memcached) on one server. The peak processing capacity of memcached is 6M req/s.


Goal: the best case in theory. When memcached's input rate is below 6M req/s, memcached achieves low latency. The lower memcached's input rate, the higher the throughput the batch processing job can get.

Linux: high latency for memcached even at a low input rate, and low throughput for the batch job.

ZygOS: cannot multiplex. It dedicates all CPU cores to the memcached job -> no throughput for the batch processing job.

No existing approach provides high network performance and high CPU efficiency simultaneously.
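
A rough way to picture the "Goal" case above is a small Python sketch. It assumes, purely for illustration, that memcached's CPU use grows linearly with its input rate up to the 6M req/s peak, so that whatever is left over goes to the batch job. The function name `ideal_batch_share` and the linear model are hypothetical, not taken from the paper.

```python
# Peak memcached capacity on this server, from the notes above.
PEAK_MEMCACHED_RPS = 6_000_000


def ideal_batch_share(memcached_rps: float) -> float:
    """Fraction of CPU capacity left for the batch job in the ideal case.

    Assumption (hypothetical): memcached's CPU use scales linearly with
    its input rate and saturates at the peak rate.
    """
    if memcached_rps < 0:
        raise ValueError("load must be non-negative")
    used = min(memcached_rps / PEAK_MEMCACHED_RPS, 1.0)
    return 1.0 - used


if __name__ == "__main__":
    for rps in (0, 1_500_000, 3_000_000, 6_000_000):
        share = ideal_batch_share(rps)
        print(f"{rps / 1e6:.1f}M req/s -> batch share {share:.2f}")
```

Under this model, Linux falls short because the batch share stays low even at small memcached loads, and ZygOS falls short because the batch share is zero at every load.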


Goal

Reallocate cores across applications at microsecond granularity