UlyssesZh / UlyssesZh.github.io

Ulysses Zhan's blog!
https://UlyssesZh.github.io

Multithread plugin #95

Open UlyssesZh opened 8 months ago

UlyssesZh commented 8 months ago

I tried to use your plugin in my site but it gets stuck in Generating....

--

I'm using Jekyll 4.3.2. Any pointers on how to make it work? Did you have any similar issue while developing it?

Originally posted by @rouralberto in https://github.com/jekyll/jekyll/issues/9485#issuecomment-1820726461

Hi @rouralberto! Since I do not want to derail the discussion in the original issue, I created this new issue to reply to you.

The plugin is for my own use on my own blog, so it may not work for other Jekyll websites, but I would love to make it usable for other websites on Jekyll 4 too.

The link to my plugin in the original issue is a permalink to the plugin at a specific commit. There have been some changes to the plugin since then, so you may first try the version at the HEAD of the master branch.

If you still have problems, could you please link the repo of the website you are trying to build so that I can see where the problem is?

Feel free to comment on this issue, but please do not continue discussing this plugin in the original issue. Keep the discussion about this specific plugin here.

rouralberto commented 8 months ago

Hi! Thanks for this! I've been working on my new Jekyll theme, which is built from scratch and is very simple. I'm using:

Plugins:

Everything else is a pretty simple setup.

Using the latest plugin from your repo, the site takes around 80 seconds to build. Without the plugin, it takes only 5 seconds.

My repository is private, but I can give you access if you still need to have a look. I'm happy to open it to you if needed. I believe you are doing a great job =D

NOTES:

UlyssesZh commented 8 months ago

Using the latest plugin from your repo, the site takes around 80 seconds to build. Without the plugin, it takes only 5 seconds.

I may have some clue about why your website takes more time to build when using multithread.

I initially wrote this plugin to solve the problem of slow single-page rendering. Because rendering a single page is slow on my site, the plugin converts only one page per thread. If your website converts each page very quickly but simply has many pages to convert, then most of the time will be spent here: https://github.com/UlyssesZh/UlyssesZh.github.io/blob/5802625fdbe4c878bed7007b01c6a9f55dd75293/_plugins/multithread_rendering.rb#L60 The sleep call in the main thread there is due to my limited multithreaded-programming skill; there ought to be better ways to detect that a job has finished. For now, I can make the sleep duration configurable instead of hard-coded so that you can experiment to find the best value.
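One common alternative to polling with sleep is to let the main thread join each worker thread. The following is only a minimal sketch of that idea, not the plugin's actual code: `render_page` and `render_all_joined` are hypothetical names, and the "rendering" is a trivial string operation.

```ruby
# Hypothetical stand-in for converting one page; not the plugin's real code.
def render_page(name)
  "<p>#{name}</p>"
end

# One thread per page, as described above, but without a sleep loop.
def render_all_joined(pages)
  threads = pages.map do |page|
    Thread.new(page) { |p| render_page(p) }
  end
  # Thread#value joins the thread and returns the result of its block,
  # so the main thread waits only as long as each job actually takes.
  threads.map(&:value)
end

puts render_all_joined(%w[index about contact]).inspect
```

Because the results are collected in the order the threads were created, the output order matches the input order regardless of which thread finishes first.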

Also, if every page is converted quickly, each thread should not contain only one conversion job, because most of the time will be wasted on thread management even if you set a small sleep time above. A better approach is to set a batch size and put that many jobs into one thread.

I may have time to do this refactoring during Thanksgiving. It should not be hard.
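The batching idea can be sketched roughly as follows. This is an illustration under assumptions, not the refactored plugin: `render_page`, `render_in_batches`, and the `batch_size` parameter are hypothetical names.

```ruby
# Hypothetical stand-in for a fast page conversion.
def render_page(name)
  "<p>#{name}</p>"
end

# Each thread converts a whole batch of pages instead of a single page,
# so the thread-creation overhead is paid once per batch.
def render_in_batches(pages, batch_size)
  threads = pages.each_slice(batch_size).map do |batch|
    Thread.new(batch) { |b| b.map { |page| render_page(page) } }
  end
  # Join every thread and concatenate the per-batch results in order.
  threads.flat_map(&:value)
end

puts render_in_batches(%w[a b c d e], 2).inspect
```

With five pages and a batch size of 2, this starts three threads rather than five.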

My repository is private, but I can give you access if you still need to have a look.

I would love to. Thank you.

Also, thank you for reporting the issues you found, so that I can get more clues about potential problems multithread rendering may have.

rouralberto commented 8 months ago

Thank you! I invited you to the repo https://github.com/rouralberto/albertoroura-com, which is a mirror of what you can see on https://albertoroura.com.

UlyssesZh commented 8 months ago

I removed the sleep call and used a better strategy to manage the threads, and I also allowed multiple documents to be rendered in the same thread. There are now two configurable constants (CONCURRENT_JOB_COUNT and DOCUMENTS_PER_JOB) to tweak, and you will need to find the best values for your own device.

On my device, your site builds in 13s multi-threaded, which is still slower than single-threaded (11s on my machine). Re-enabling the Liquid template cache makes it a further 1s faster, but still slower than single-threaded. Unfortunately, I think multi-threaded rendering does not pay off for your website (or for typical websites in general). However, since CPU allocation depends on the OS, maybe you can check whether it does better on your machine?

Multi-threading works great on my website because rendering it involves many blocking operations (waiting for stdio communication with subprocesses). It may not be a good fit for typical websites whose builds are purely CPU-bound. A reasonable thing to try next is multiple processes, which normally speeds up CPU-intensive jobs by using multiple CPU cores.
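For readers who want a feel for how two such constants could interact, here is a minimal worker-pool sketch. Only the constant names come from the comment above; their values, the helper names `render_document` and `render_with_pool`, and the queue-based design are assumptions for illustration and may differ from what the plugin actually does.

```ruby
CONCURRENT_JOB_COUNT = 4 # assumed value: number of worker threads
DOCUMENTS_PER_JOB = 2    # assumed value: documents handled per job

# Hypothetical stand-in for rendering one document.
def render_document(doc)
  doc.upcase
end

def render_with_pool(documents, workers: CONCURRENT_JOB_COUNT, per_job: DOCUMENTS_PER_JOB)
  jobs = Queue.new
  # Split the documents into jobs and remember each job's position.
  documents.each_slice(per_job).with_index { |batch, i| jobs << [i, batch] }
  jobs.close # once closed and empty, Queue#pop returns nil

  pool = Array.new(workers) do
    Thread.new do
      done = []
      while (job = jobs.pop)
        index, batch = job
        done << [index, batch.map { |doc| render_document(doc) }]
      end
      done # each worker returns its own results; no shared mutable state
    end
  end

  # Join all workers, then restore the original document order.
  pool.flat_map(&:value).sort_by(&:first).flat_map(&:last)
end

puts render_with_pool(%w[a b c d e]).inspect
```

On CRuby, threads like these mostly help when `render_document` spends its time waiting (for example on subprocess I/O), which matches the observation above that purely CPU-bound builds see little benefit.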

rouralberto commented 8 months ago

That is good news! Let me have a play with the parameters and I will let you know. I hope this helps make progress toward integration into general Jekyll use cases. But I agree, this plugin looks good for sites like yours. Mine is purely looping and rendering straight away, and it doesn't take much time to process.