elementor / wp2static

WordPress static site generator for security, performance and cost benefits
https://wp2static.com
The Unlicense
1.39k stars, 258 forks

Crawling 150 pages/minute: is this expected speed? #882

Closed: torcd closed this issue 1 year ago

torcd commented 1 year ago

OS: win x64
HDD: spinning
WAMP: XAMPP
Apache: 2.4
MariaDB: 10.4
PHP version: 7.4.29 (thread-safe)
PHP max_execution_time: Unlimited
PHP memory_limit: 1024M
cURL extension loaded: Yes
Uploads directory: writable
WordPress Permalinks Compatible: Yes

The test WordPress site is very lightweight and scores 100/100 on Lighthouse even when server-rendered.

This system crawls only 150 pages/minute, with the mysqld daemon using ~1-2 MB/s of disk I/O (HDD). For comparison, on the exact same system, the TinaCMS WordPress plugin exports ~3,000 pages/minute into Markdown, i.e. 20x the speed with the same mysqld disk I/O.

I appreciate that composing a full page over the WordPress API has overhead compared to a text-only export into Markdown, but is a 20x slowdown the expected performance? If so, a 10,000-page site would take more than 1 hour to export every time global templates or pagination are changed. :((
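To make the arithmetic behind that estimate explicit, here is a small Python sketch. The function name `export_minutes` is made up for illustration, and it assumes a steady crawl rate with no per-run startup cost, which is a simplification.

```python
def export_minutes(pages: int, pages_per_minute: float) -> float:
    """Minutes needed to crawl `pages` at a steady `pages_per_minute`."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return pages / pages_per_minute


# Figures taken from the measurements above; the steady rate is an assumption.
wp2static_minutes = export_minutes(10_000, 150)    # roughly 66.7 minutes
markdown_minutes = export_minutes(10_000, 3_000)   # roughly 3.3 minutes
speed_ratio = 3_000 / 150                          # 20.0
```

At 150 pages/minute the 10,000-page export lands just over the one-hour mark, versus a few minutes at the Markdown export rate.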

As described, the system passes all WP2Static health checks and the PHP limits have been removed. The 1-2 MB/s mysqld disk speed feels slow, even for an HDD and especially compared to Node.js read/write, but I am not sure whether this is relevant or whether it can be improved.

Is there a known bottleneck during the crawling stage that I am missing, e.g. in the MySQL config or the API/plugin?

john-shaffer commented 1 year ago

The develop branch has a Crawl Concurrency option that speeds up crawling significantly.
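As a rough illustration of why concurrency helps a crawler, here is a minimal Python sketch. It is not wp2static's implementation (the plugin is written in PHP), and the names `fetch_page`, `crawl` and the `concurrency` parameter are hypothetical. The idea is that each page request spends most of its time waiting on the server, so keeping several requests in flight raises throughput.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_page(url: str) -> bytes:
    """Download one rendered page (hypothetical helper, plain HTTP GET)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def crawl(urls: Iterable[str], concurrency: int = 4,
          fetch: Callable[[str], bytes] = fetch_page) -> Dict[str, bytes]:
    """Fetch all URLs with up to `concurrency` requests in flight at once."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map yields results in the same order as url_list.
        bodies = pool.map(fetch, url_list)
        return dict(zip(url_list, bodies))
```

With `concurrency=1` this degrades to a sequential crawl. How much a higher value helps depends on how many requests the local PHP and database stack can serve in parallel.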

wp2static does most of its writes directly to files on disk, not to MySQL. There is a lot of room for optimization, though.