OpenPrinting / cups

OpenPrinting CUPS Sources
https://openprinting.github.io/cups
Apache License 2.0

cupsd only uses one CPU core #943

Closed rickgongtds closed 5 months ago

rickgongtds commented 5 months ago

cupsd is running in Docker with 4 CPUs. Hundreds of thousands of documents are printed every day, so the workload is heavy.

But when I monitor the cupsd process in the container with the "top" command, the maximum CPU usage is 100%. Since 4 CPUs are granted to the Docker container, the maximum CPU usage should be 400%, right? It seems the other 3 CPUs are never used by the cupsd process.

I suppose that even if I run cupsd in VMware instead of Docker, cupsd can still use only one CPU core? Even if I have 10 CPU cores, cupsd can only use a single core?

CUPS version: 2.3.3. OS: Rocky9

michaelrsweet commented 5 months ago

"top" CPU usage isn't a great measure since it doesn't really tell you whether one CPU is at 100% or four CPUs are at 25%. You need to look at the kernel and per-process information to know how much time is being spent actually sending/receiving network data as well.

TL;DR summary: Yes, cupsd is single-threaded but moving to a multi-threaded design will not significantly speed up request processing or printing in general.
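As a rough illustration of the per-process information mentioned above, the sketch below reads user and kernel CPU time for a process from `/proc/<pid>/stat` on Linux. The function names `parse_proc_stat` and `cpu_seconds` are hypothetical helpers, not part of CUPS or any tool. Kernel ("system") time includes time spent in network system calls, but also other kernel work, so treat it only as an upper bound on network overhead.

```python
import os


def parse_proc_stat(line, ticks_per_second):
    """Return (user_seconds, system_seconds) from one /proc/<pid>/stat line.

    The command name (field 2) may contain spaces or parentheses, so split
    on the last ')' and count fields from there. Field 14 is utime and
    field 15 is stime, both in clock ticks.
    """
    rest = line.rsplit(")", 1)[1].split()
    utime_ticks = int(rest[11])  # field 14: time spent in user mode
    stime_ticks = int(rest[12])  # field 15: time spent in kernel mode
    return utime_ticks / ticks_per_second, stime_ticks / ticks_per_second


def cpu_seconds(pid):
    """Read user/system CPU seconds for a running process (Linux only)."""
    ticks = os.sysconf("SC_CLK_TCK")
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read(), ticks)
```

Sampling `cpu_seconds(pid)` twice, a few seconds apart, and dividing the difference by the elapsed wall-clock time gives the fraction of one core the process used in that interval. The same file format appears under `/proc/<pid>/task/<tid>/stat` for individual threads.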

Longer reply:

cupsd uses a 25+ year-old design and is mostly single-threaded. There are some minor things that we farm out to separate threads today, but the bulk of request processing happens on the main thread. Requests and responses are streamed to/from clients using non-blocking asynchronous I/O managed by the http_t finite state machine. When CUPS was first released this design supported 50-100 requests/second on the computers of the time; today's systems easily support 800-1000 requests/second. The bottleneck has always been the network, how quickly cupsd can assemble/disassemble network packets, and how efficient the kernel interface is (i.e. how much data copying it requires).
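The general shape of a single-threaded, non-blocking server can be sketched as follows. This is not cupsd's code and does not model the http_t state machine; it is a minimal Python illustration in which one thread multiplexes several connections with the standard `selectors` module. The `Connection` class, `serve_once`, and `demo_event_loop` are hypothetical names, and the "request processing" is just upper-casing the received bytes.

```python
import selectors
import socket


class Connection:
    """Per-client state: a tiny stand-in for a request/response state machine."""

    def __init__(self, sock):
        self.sock = sock
        self.outbuf = b""  # response bytes not yet written to the client


def serve_once(sel, timeout=0.1):
    """Run one iteration of the event loop; return the number of events handled."""
    events = sel.select(timeout)
    for key, mask in events:
        conn = key.data
        if mask & selectors.EVENT_READ:
            try:
                data = conn.sock.recv(4096)
            except BlockingIOError:
                data = None  # nothing to read after all
            if data == b"":  # peer closed the connection
                sel.unregister(conn.sock)
                conn.sock.close()
                continue
            if data:
                conn.outbuf += data.upper()  # "process" the request
        if mask & selectors.EVENT_WRITE and conn.outbuf:
            try:
                sent = conn.sock.send(conn.outbuf)
            except BlockingIOError:
                sent = 0
            conn.outbuf = conn.outbuf[sent:]
        # Only ask for write readiness while there is something to send.
        interest = selectors.EVENT_READ
        if conn.outbuf:
            interest |= selectors.EVENT_WRITE
        sel.modify(conn.sock, interest, conn)
    return len(events)


def demo_event_loop(n_clients=3):
    """Serve several in-process clients from a single thread."""
    sel = selectors.DefaultSelector()
    clients, servers = [], []
    for _ in range(n_clients):
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ, Connection(server))
        clients.append(client)
        servers.append(server)
    for i, client in enumerate(clients):
        client.sendall(f"job {i}".encode())
    for _ in range(4):  # a few iterations: read the requests, then write replies
        serve_once(sel)
    replies = [client.recv(4096) for client in clients]
    for server in servers:
        sel.unregister(server)
        server.close()
    for client in clients:
        client.close()
    sel.close()
    return replies
```

Calling `demo_event_loop()` returns one upper-cased reply per client. All of the work happens on the calling thread, which is why a process built this way never shows more than one core's worth of CPU time, however many clients are connected.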

Longer-running tasks (like printing) are farmed out to sub-processes rather than threads for a couple of reasons: first, because threading in 1997 was not well supported (POSIX was new and very few OSes had multi-thread-capable libraries), and second, because using separate processes provides some protection from common attacks (malicious PDF files, etc.). These sub-processes get farmed out to other CPUs/cores automatically, so a server that is busy printing will show a lot of activity on all the CPUs/cores, not just one. Even so, the print filters and backends are sending printer-ready data (ranging from megabytes to gigabytes) to relatively slow printers, so CPU usage there is often moderated because your server is waiting for the printer to "catch up".
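To illustrate the sub-process approach in general terms, the sketch below starts several child processes that each do some CPU-bound work and report their own PID. It does not reproduce how cupsd launches filters and backends; `CHILD_CODE` and `run_jobs` are hypothetical names, and the summation is only a placeholder for real work. Because each child is a separate process, the operating system is free to schedule them on different cores, and a crash in one child does not take down the parent.

```python
import subprocess
import sys

# Placeholder workload: each child prints its PID and the result of a small
# CPU-bound computation. A real print filter would transform document data.
CHILD_CODE = "import os; print(os.getpid(), sum(i * i for i in range(200000)))"


def run_jobs(n_jobs):
    """Start n_jobs child processes at once and collect (pid, result) pairs."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE],
            stdout=subprocess.PIPE,
            text=True,
        )
        for _ in range(n_jobs)
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # wait for the child and read its output
        pid, total = out.split()
        results.append((int(pid), int(total)))
    return results
```

Each returned PID differs from the parent's, which is the point of the design: the parent only coordinates, while the heavy work runs elsewhere.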

For CUPS 3.0 we are moving to a different design based on my PAPPL project where a new thread is created for each client, with appropriate MT-safe data handling, which simplifies the code a bit (we sometimes have to go through some coding gymnastics to support certain operations in CUPS 2.x). However, this does not significantly speed up request processing - the same limitations of network bandwidth, kernel overhead, and request/response processing/coding are still present.
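The thread-per-client structure described above can be sketched like this. It is not PAPPL or CUPS 3.0 code; `SharedState`, `handle_client`, and `demo_thread_per_client` are hypothetical names, and the lock-protected counter stands in for whatever shared data needs MT-safe handling. Note that in CPython the global interpreter lock prevents threads from running Python bytecode in parallel, so this shows the structure only, not any speed-up.

```python
import socket
import threading


class SharedState:
    """Data shared by all client threads, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.requests = 0

    def record_request(self):
        with self._lock:  # MT-safe update of the shared counter
            self.requests += 1
            return self.requests


def handle_client(sock, state):
    """Serve one client with plain blocking I/O until it disconnects."""
    with sock:
        while True:
            data = sock.recv(4096)
            if not data:  # client closed the connection
                break
            state.record_request()
            sock.sendall(data.upper())  # "process" the request


def demo_thread_per_client(n_clients=3):
    """Start one thread per in-process client and exchange one message each."""
    state = SharedState()
    clients, threads = [], []
    for _ in range(n_clients):
        client, server = socket.socketpair()
        thread = threading.Thread(target=handle_client, args=(server, state))
        thread.start()
        clients.append(client)
        threads.append(thread)
    replies = []
    for i, client in enumerate(clients):
        client.sendall(f"job {i}".encode())
        replies.append(client.recv(4096))
        client.close()
    for thread in threads:
        thread.join()
    return replies, state.requests
```

Compared with an event loop, each handler reads as straight-line blocking code, which is the simplification the reply refers to. The cost is that every piece of shared data needs explicit locking.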