OpenPrinting / cups

OpenPrinting CUPS Sources
https://openprinting.github.io/cups
Apache License 2.0

load_ppd for 6000 printers takes 25mins 100% cpu #940

Open ryjogo opened 2 months ago

ryjogo commented 2 months ago

Is your feature request related to a problem? Please describe.

We have some systems with over 6k printers attached. When the cupsd service starts up, it takes around 25-30 minutes to parse all the PPDs.

D [11/Apr/2024:16:34:52 +0000] cupsdRegisterPrinter(p=0x555d2d5fccb0(FR7_Printer))
D [11/Apr/2024:16:34:52 +0000] Loading printer FR7_Color_Printer...
D [11/Apr/2024:16:34:52 +0000] load_ppd: Loading /var/cache/cups/FR7_Color_Printer.data...

I can currently see that this occupies a single CPU core on a 6-core machine.

Describe the solution you'd like

It would be great if the files could be loaded multi-threaded, or even if there were a way to load the PPDs and start the service up in the background.
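To illustrate the multi-threaded loading idea, here is a minimal sketch of the fan-out pattern in Python. It is not how cupsd works (cupsd is written in C and is currently single-threaded); `load_cache_file` is a hypothetical stand-in for `load_ppd` that only reads the file, and the worker count of 6 simply mirrors the 6-core machine mentioned above. In CPython, threads mainly help with the file I/O part; CPU-bound parsing would need processes or native threads.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_cache_file(path: Path) -> tuple[str, int]:
    """Hypothetical stand-in for load_ppd: read one cache file.

    Returns the queue name (file stem) and the number of bytes read.
    """
    data = path.read_bytes()
    return path.stem, len(data)


def load_all(cache_dir: Path, workers: int = 6) -> dict[str, int]:
    """Load every *.data file in cache_dir using a pool of worker threads."""
    paths = sorted(cache_dir.glob("*.data"))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands one file to each free worker and yields results
        # in the same order as the input paths.
        return dict(pool.map(load_cache_file, paths))
```

Calling `load_all(Path("/var/cache/cups"))` would return a mapping from queue name to cache file size; a real implementation would also have to make the shared printer state safe to update from several threads.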

Describe alternatives you've considered

I'm tempted to split these printers across separate CUPS instances and put a load balancer in front, but that doesn't work because the load balancer does not know where to send the job polling request that follows a print request. That is, a print request is sent, and the subsequent job polling doesn't contain any data I can pin the connection to, so it can hop to another node and fail.

michaelrsweet commented 2 months ago

This is a known issue and there really isn't much that we can do: there is a certain amount of state that we need to load to support printing, and quite a bit of optimization has already been done. That said, I can look into loading printers in parallel to speed things up, as well as revisiting the current caching framework (I am overhauling the localization support now as well...)

ryjogo commented 2 months ago

Fantastic! Yes please! This would help considerably. I have a task to create aliases for 6k printers under another naming scheme; however, I'm really nervous, as this will basically double the PPDs when using classes. Loading 12k would be unbearable.
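As a rough back-of-the-envelope check on that worry, the sketch below turns the reported 25-30 minutes for 6k queues into a per-queue cost and projects it to 12k. It assumes startup time scales linearly with the number of queues and that an alias costs as much to load as a regular queue; neither assumption has been measured here.

```python
def per_queue_seconds(total_minutes: float, queues: int) -> float:
    """Average load time per queue, in seconds."""
    return total_minutes * 60 / queues


def projected_minutes(seconds_per_queue: float, queues: int) -> float:
    """Projected startup time in minutes, assuming linear scaling."""
    return seconds_per_queue * queues / 60


# Reported range: 25-30 minutes for 6000 queues.
low = per_queue_seconds(25, 6000)
high = per_queue_seconds(30, 6000)

print(f"per queue: {low:.2f}-{high:.2f} s")
print(f"12000 queues: {projected_minutes(low, 12000):.0f}-"
      f"{projected_minutes(high, 12000):.0f} min")
```

Under these assumptions each queue costs roughly a quarter to a third of a second, and 12k queues would land at about 50 to 60 minutes.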

Do you know of any workarounds along the lines I suggested? I'm referring to load-balancing techniques for CUPS.

Perhaps I should open a new issue for this.

michaelrsweet commented 2 months ago

WRT load balancing options, sorry I'm not aware of anything specific. The old "CUPS browsing" implicit class method (no longer supported) and the newer "cups-browsed" stuff can be used but probably won't scale up to 6k queues.

Just an FYI - CUPS 2.5 is the "end of the line" for PPD-based print queues and the current single-threaded server architecture. We aren't going to completely re-architect cupsd for 2.5, but I will try to improve startup performance.

ryjogo commented 2 months ago

Good point actually. Is it IPP Everywhere that could help here? Would it query the printer and cache the result on the first print job, or at startup?

michaelrsweet commented 2 months ago

IPP Everywhere basically encapsulates everything that is parsed in and out of PPDs right now. Aside from the efficiencies of not using PPDs, other IPP standards (IPP System Service and IPP Shared Infrastructure Extensions) allow for greater scaling.

frazhome commented 2 months ago

Hello ryjogo,

Which version are you running? Maybe removing the debug code helps you: https://github.com/OpenPrinting/cups/commit/b546688325bb002a2493e4df3a0a7e5e3b844153

My company is running 2.2.7, and removing the debug code sped up the whole startup.

The central CUPS server is configured with 3.6k print queues, and startup went from 10 min down to 3 min.

ryjogo commented 2 months ago

@frazhome thanks for the tip, although these tests were carried out on 2.3.1