timvink / mkdocs-git-revision-date-localized-plugin

MkDocs plugin to add a last updated date to your site pages
https://timvink.github.io/mkdocs-git-revision-date-localized-plugin/index.html
MIT License
193 stars, 39 forks

Precompute all last commit timestamps in on_files #116

Open kunickiaj opened 1 year ago

kunickiaj commented 1 year ago

Precompute all last commit timestamps in parallel in on_files so that the build runs faster. This has to happen in on_files, where the full list of files is available and the work can be spread across workers, rather than in on_page_markdown, which is called for one page at a time.

This does not yet precompute the first commit timestamp. It can significantly improve wall time; ref: #115.

Looking for some feedback on this approach. If it looks reasonable, we can figure out support for the first commit timestamp as well as a way to configure parallelism. The parallelism is currently the minimum of 10 and the number of CPUs reported.
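For readers following along, the idea can be sketched roughly as below. This is an illustration under stated assumptions, not the code from the PR: the function names `git_last_commit_timestamp` and `precompute_last_commit_timestamps` are hypothetical, the timestamp comes from a plain `git log` subprocess run from the current working directory (assumed to be inside the repository), and threads are used because each worker mostly waits on git.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def git_last_commit_timestamp(path):
    """Return the Unix timestamp of the last commit touching `path`, or None.

    Hypothetical helper: assumes git is installed and that the current
    working directory is inside the repository.
    """
    result = subprocess.run(
        ["git", "log", "-1", "--format=%at", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    out = result.stdout.strip()
    return int(out) if out else None


def precompute_last_commit_timestamps(
    paths, get_timestamp=git_last_commit_timestamp, max_workers=None
):
    """Map each path to its last commit timestamp, computed in parallel."""
    paths = list(paths)
    if max_workers is None:
        # Cap at 10 workers, as described above; os.cpu_count() may return None.
        max_workers = min(10, os.cpu_count() or 1)
    # pool.map preserves input order, so zip pairs each path with its result.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(get_timestamp, paths)))
```

In an on_files hook, `paths` would presumably be the `abs_src_path` of each documentation page in the files collection, and the resulting dict would be stored on the plugin instance for later lookup.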

On an M1 Max MacBook Pro (8 performance, 2 efficiency cores), this gave a speedup of ~5.5x on a large monorepo: the build went from 378 seconds down to 69 seconds. Tested with 78 rendered markdown files in a repo of approximately 700k commits and 500k files.

timvink commented 10 months ago

Sorry for the very late reply, this project has not been a priority.

Very cool PR, 5.5x improvement is considerable!

One problem I see, however, is using the files collection in on_files() instead of the page in on_page_markdown(). The reason is that some other plugins move files around. One example is mkdocs-monorepo.

They basically create a new docs_dir from several source folders:

https://github.com/backstage/mkdocs-monorepo-plugin/blob/c778b3010eb986a2f3b719bc7a3d29d86236c238/mkdocs_monorepo_plugin/plugin.py#L54-L61

And then they update page.file.abs_src_path:

https://github.com/backstage/mkdocs-monorepo-plugin/blob/c778b3010eb986a2f3b719bc7a3d29d86236c238/mkdocs_monorepo_plugin/plugin.py#L65-L72

So this bit from the PR will need some more edge case handling:

https://github.com/timvink/mkdocs-git-revision-date-localized-plugin/pull/116/files#diff-38d392fd1ac6a39ad46a5d047e294c69fe0f1b6aa8fc7fea3a35c1846925d21cR166-R172
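One possible way to handle this edge case is to treat the precomputed results as a cache and fall back to per-page computation whenever the page's source path is not found in it. The sketch below is an assumption about how that could look, not code from the PR; `lookup_timestamp`, `cache`, and `compute` are hypothetical names.

```python
def lookup_timestamp(cache, abs_src_path, compute):
    """Return the timestamp for a page, computing it on a cache miss.

    `cache` is the dict filled during on_files; `compute` is whatever
    per-file function the plugin would otherwise call for each page.
    """
    if abs_src_path not in cache:
        # Another plugin may have changed the page's source path after
        # on_files ran, so no precomputed entry exists for it. Compute it
        # now and remember the result for any later lookup.
        cache[abs_src_path] = compute(abs_src_path)
    return cache[abs_src_path]
```

With this shape, pages whose paths are unchanged keep the parallel speedup, while relocated pages cost the same as they do without precomputation.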

timvink commented 10 months ago

Another promising avenue might be to tweak git a bit. There are a couple of settings for large repos that might make git blame operations much faster:

https://www.git-tower.com/blog/git-performance/

Have you tried something like that? It might be worth documenting in this plugin.

kunickiaj commented 9 months ago

Yeah, we're well aware of all those git features that make monorepos less of a pain, but it is still incredibly slow. To be fair, when updating docs for a single project or two, the time hit is probably still acceptable, since the application CI is going to take longer in most cases. But a bulk update across many docs in the repo is going to time out CI (not to mention the $ cost of longer-running CI in general).