Open SukkaW opened 4 years ago
cc @hexojs/core @curbengh @stevenjoezhang @jiangtj @segayuu @YoshinoriN @JLHwung
Should we update the minimum required Node.js version to 12? Hexo 5.0.0 itself might not require such a high Node.js version, but the bump would let us bring in more features during Hexo 5.x development.
Should we update the minimum required Node.js version to 12? As you can see, the example I gave is suitable for some filters
I'm ok with bumping to Node 12, as long as only filters are affected, to minimize the delay to 5.0.0. Perhaps only change 1-2 filters for now; the other filters can be updated during 5.x.
@curbengh We could even release 5.0.0 first, then add multi core support from 5.1.0.
We could even release 5.0.0 first
It would be better to have at least one filter that utilizes this API in 5.0.0, to justify bumping to Node 12 (and demonstrate the benefit of that bump).
@curbengh We could start with the backtick_code filter.
Take a look at the flamegraph: https://29e28e2d8f6f8fdb247ad2c47788857d003fd894-12-hexo.surge.sh/flamegraph.html
It seems to be a long task.
This is nice. I have had a very good experience with piscina. It's a nice wrapper (and more) around worker_threads.
@tuananh LGTM! It seems definitely better than my WorkerPool: https://github.com/hexojs/hexo-util/pull/212/
I tried to optimize backtick_code but got a DataCloneError. I haven't gotten around to fixing it yet. Not sure if it has anything to do with the way hexo calls all the filters:
return Promise.each(filters, filter => Reflect.apply(Promise.method(filter), ctx, args).then(result => {
args[0] = result == null ? args[0] : result;
return args[0];
})).then(() => args[0]);
@tuananh The entire hexo context simply cannot be passed to a worker. Only simple values (like strings, numbers and plain objects) can be passed to a worker.
Here's what we can learn from #4368.
According to the worker_threads documentation:
value will be transferred in a way which is compatible with the HTML structured clone algorithm.
Which means:
Function objects cannot be duplicated by the structured clone algorithm; attempting to throws a DATA_CLONE_ERR exception.
The structured clone algorithm also means communicating with threads is expensive, just like creating and destroying one.
We should keep the input and output pure and simple (containing only the required information) to make structured cloning faster.
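To illustrate the structured clone limitation described above, here is a minimal sketch using a MessageChannel from worker_threads. The helper name canBeSentToWorker and the sample objects are made up for illustration; they are not part of Hexo.

```javascript
const { MessageChannel } = require('worker_threads');

// Hypothetical helper: returns true if `value` survives the structured clone
// step that postMessage() performs, false if it triggers a DataCloneError.
function canBeSentToWorker(value) {
  const { port1, port2 } = new MessageChannel();
  try {
    port1.postMessage(value); // cloning happens synchronously here
    return true;
  } catch (err) {
    if (err.name === 'DataCloneError') return false;
    throw err; // anything else is unexpected
  } finally {
    port1.close();
    port2.close();
  }
}

// A plain object with only strings and arrays can be cloned.
const plainInput = { content: '`code`', tags: ['hexo'] };

// An object carrying a function (as a Hexo-like context would) cannot.
const contextLike = { config: {}, render: () => {} };

console.log(canBeSentToWorker(plainInput));  // true
console.log(canBeSentToWorker(contextLike)); // false
```

This is why a filter that only needs a string in and a string out is a much easier candidate for a worker than anything that needs the whole context.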
@SukkaW that's probably it. In order to change that, we need to change the way we pass the hexo instance around?
Instead of worker_threads, I am considering using the cluster API.
The cluster API is much simpler, and has been stable since Node.js 4.0. It has no "structured clone algorithm" things as well.
The only problem is that cluster is designed to handle multiple HTTP requests. We have to find a way to adapt it to Hexo.
@curbengh @tuananh
From the perspective of 2024, the support for multithreading in Node.js has not improved. The rendering process of posts heavily relies on Hexo's ctx, but without the ability to use shared memory, worker threads cannot directly access the global variables in Hexo.
So this basically leaves us with 2 options:
* Creating multiple Hexo instances in different worker threads. In every thread, we will read the config and posts.
* Offloading limited heavy tasks to the worker threads (markdown rendering? nunjucks rendering?) while retaining one main Hexo instance.
option 2 sounds better to me
Since #550, the original creator of Hexo, @tommy351, has wanted to speed up Hexo with multi-core rendering. However, #550 was never continued due to the difficulties of managing multiple Hexo instances.
Recently I adopted Node.js worker_threads for a project (https://github.com/OI-wiki/OI-wiki/pull/2288) and learned something about worker_threads. Now that Node.js supports worker_threads, it is possible to bring up multi-core rendering for Hexo again.
Limit
Worker threads are designed to run CPU-intensive tasks with simple algorithms:
Thus we cannot run many complicated functions inside workers.
Design
As creating and destroying workers is still expensive (worker_threads have to communicate with the main thread), we should only create a limited number of worker_threads (in https://github.com/OI-wiki/OI-wiki/pull/2288 I use the number of CPU threads). Thus, a WorkerPool util should be made.
The WorkerPool is designed to queue tasks, manage them and make sure the next task runs in an idle worker, so it should have these methods:
* init(): Init a worker pool with the queue (the queue could be an array). This will be called in the constructor.
* run(input): Add a task to the queue, with input passed to the workers. A Promise will be returned (the result could be retrieved by const output = await workerPool.run(input)).
* destroy(): After all tasks are finished, destroy all the worker_threads created.
And here is an example about how to use WorkerPool:
As you can see, the example I gave is suitable for some filters (like meta_generator, backtick_code_filter) where we pass input to the filter and get output from it. But for more complicated jobs (like post rendering & template rendering) worker_threads still can't help.
cc @hexojs/core @tommy351
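For readers who want something concrete, the init() / run(input) / destroy() design described above could be sketched roughly as follows. This is not the implementation from hexo-util or the OI-wiki PR; the worker body (workerSource, which just upper-cases a string as a stand-in for a pure filter) and the internal next() helper are assumptions made for illustration.

```javascript
const { Worker } = require('worker_threads');
const os = require('os');

// Hypothetical worker body: stands in for a pure "string in, string out" filter.
const workerSource = `
  const { parentPort } = require('worker_threads');
  parentPort.on('message', input => {
    parentPort.postMessage(String(input).toUpperCase());
  });
`;

class WorkerPool {
  constructor(source, size = Math.max(1, os.cpus().length)) {
    this.source = source;
    this.size = size;
    this.init();
  }

  // Create the task queue and a fixed number of workers up front.
  init() {
    this.queue = [];
    this.workers = [];
    this.idle = [];
    for (let i = 0; i < this.size; i++) {
      const worker = new Worker(this.source, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; the returned Promise settles with the worker's reply.
  run(input) {
    return new Promise((resolve, reject) => {
      this.queue.push({ input, resolve, reject });
      this.next();
    });
  }

  // Hand the oldest queued task to an idle worker, if there is one.
  next() {
    if (this.queue.length === 0 || this.idle.length === 0) return;
    const worker = this.idle.pop();
    const { input, resolve, reject } = this.queue.shift();
    const onMessage = output => {
      worker.off('error', onError);
      this.idle.push(worker); // worker is free again
      resolve(output);
      this.next();
    };
    const onError = err => {
      worker.off('message', onMessage);
      reject(err); // a crashed worker is not returned to the idle list
      this.next();
    };
    worker.once('message', onMessage);
    worker.once('error', onError);
    worker.postMessage(input); // input must be structured-cloneable
  }

  // Terminate every worker once all tasks are finished.
  destroy() {
    return Promise.all(this.workers.map(worker => worker.terminate()));
  }
}
```

With this sketch, a caller would create one pool (for example new WorkerPool(workerSource)), use const output = await workerPool.run(input) per task, and call destroy() at the end. The sketch does not replace a crashed worker, which a real pool would need to handle.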