SKS-Keyserver / sks-keyserver

OpenPGP keyserver
GNU General Public License v2.0

Feature Request: MultiCore Support. #84

Open TheeMahn opened 3 years ago

TheeMahn commented 3 years ago

I wrote an application called Keyserver (http://os-builder.com/Apps/Apps/all/binary-amd64/ultimate-edition-keyserver-1.0.0_all.deb). That is not the newest version; I am still writing the software.

If you type `keyserver --setup threads=32`, it will scan the internet looking for the fastest server by ping, download the dump files, verify them, and rebuild the database. Example:

```
Setting default keyserver: keysnatcher.space | 23
Web Spidering Server: http://keysnatcher.space/sks/
Download 13GB of files: ...
Downloaded: 178 files a total of 13,741,490,176 bytes (12GB) in 0h22m30s.
MD5SUMS the Files. MD5SUM Time: Timer: 0h0m21s
Removing PTree Folder, we will build a new one.
Removing Database, we will build a new one.
Number of CPU Core(s) / threads(s) detected: 32
Launching 32 threads to accelerate the building process, please wait.
Threads Specified as: 32
Beginning build: sks-dump-0001.pgp, Multi-threaded. Please wait...
```

I have multiple NVMe drives. Once set up, the cores open up:

```
Launching thread #31: Loading key(s): sks-dump-0156.pgp (RUNNING INSTANCES: 31 Internal processes: 1)
Merging: sks-dump-0156.pgp, please wait.
```

It merges 32 files at a time (actually 31; one core stays with the internal process).

(It will launch 32 threads to get things done close to 32 times as fast.) I have an AMD 3950X 16-core / 32-thread CPU.
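For illustration only (this is my own hedged sketch, not TheeMahn's actual code, and `merge_dump` is a placeholder): the scheme described above amounts to handing the dump files to a pool of N−1 workers while one core stays with the coordinating process.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def merge_dump(path):
    """Placeholder for whatever actually loads one sks-dump-NNNN.pgp file."""
    return f"merged {os.path.basename(path)}"

def parallel_build(dump_files, threads=None):
    # Reserve one core for the coordinating ("internal") process,
    # matching the "31 of 32" behavior described in the log above.
    workers = (threads or os.cpu_count()) - 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order of the dump files
        return list(pool.map(merge_dump, dump_files))
```

As the rest of the thread makes clear, pointing many such writers at a single SKS Berkeley DB environment is exactly what SKS is not built for.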

At the end of it all I get a PTree database corruption:

```
Merging: sks-dump-0174.pgp, please wait
Cleaning Database, please wait.
Fatal error: exception Bdb.DBError("BDB0060 PANIC: fatal region error detected; run recovery")
```

Single-core works but is painfully slow! Downloading is not the bottleneck; over the LAN (dual 10-gigabit fiber) I get: Downloaded: 178 files a total of 13,741,490,176 bytes (12GB) in 0h0m44s.

MarcelWaldvogel commented 3 years ago

SKS was not meant for parallelism, neither in processing requests nor in accessing the database (and it does not lock the database to prevent you from accidentally running a second process against it).
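Since SKS takes no such lock itself, an operator can add one externally. A minimal sketch (hypothetical lockfile name; not part of SKS) that uses `fcntl.flock` to refuse a second process on the same database directory:

```python
import fcntl
import os
import sys

def acquire_db_lock(db_dir):
    """Take an exclusive, non-blocking lock on a lockfile inside the
    database directory; exit if another process already holds it."""
    lock_path = os.path.join(db_dir, ".sks.lock")
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        sys.exit(f"another process is already using {db_dir}")
    return fd  # keep this fd open; the lock is released when it is closed
```

A wrapper script would call this before starting `sks db` (or any `db_*` tool) on the same directory, which would have prevented the kind of concurrent-writer corruption reported above.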

Options with SKS (also hinted at in the wiki) include running multiple instances on individual databases. For example, I run multiple Docker instances on one machine: one doing gossip (port 11370), the others serving HKP (port 11371). I use the gossip peer as a backup for HKP; you might also consider routing POSTs to /pks/hashquery with priority to the gossip peer. The instances gossip with each other as well to remain up to date. https-portal makes it really easy to provide an HA load balancer using Nginx in front of the cluster.
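A rough sketch of that routing in plain Nginx terms (the hostnames and upstream names are placeholders; https-portal generates something along these lines for you): /pks/hashquery POSTs go to the gossip peer first, everything else is balanced across the HKP-only instances, with the gossip peer as fallback.

```nginx
upstream hkp_pool {
    server sks-hkp-1:11371;
    server sks-hkp-2:11371;
    server sks-gossip:11371 backup;   # gossip peer only if the others fail
}

upstream gossip_first {
    server sks-gossip:11371;
    server sks-hkp-1:11371 backup;
}

server {
    listen 80;

    # hashquery requests go to the gossip peer with priority
    location /pks/hashquery {
        proxy_pass http://gossip_first;
    }

    location / {
        proxy_pass http://hkp_pool;
    }
}
```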

To reduce disk usage and improve buffer-cache behavior, I created the database with one node only, on its own ZFS file system, then shut that node down and created a snapshot and as many clones of it as desired. One instance keeps working on the main file system, while the additional nodes work on the clones. (I will probably re-clone every few weeks to coalesce disk usage.) The snapshots also provide a great safety net in case you accidentally run a DB command (even just "dump") while the server is running and ruin your database (and "db_recover" does not always help if your KDB and PTree databases disagree).
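The dataset names below are placeholders; this is just the standard ZFS snapshot/clone sequence the paragraph above describes (it requires root and an existing pool, so treat it as an admin-command sketch, not a script):

```shell
# build the database once on its own dataset (hypothetical names)
zfs create tank/sks-seed
# ... run SKS once to build KDB/PTree under the dataset, then stop it ...

# freeze the freshly built database
zfs snapshot tank/sks-seed@built

# cheap copy-on-write copies for the additional instances
zfs clone tank/sks-seed@built tank/sks-node2
zfs clone tank/sks-seed@built tank/sks-node3
```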

You also might look at Hockeypuck instead, which provides higher levels of parallelism, while also supporting the SKS protocol. It is IMHO much easier to set up, but slower at importing keys from scratch (Hockeypuck took 2 days instead of under an hour for SKS). More on Hockeypuck.