byrnereese / mkdocs-git-committers-plugin

A mkdocs plugin for displaying the last commit and a list of a file's contributors.
MIT License
42 stars 7 forks source link

Cache data fetched from GitHub between builds #12

Open squidfunk opened 2 years ago

squidfunk commented 2 years ago

First of all, great plugin! I'm evaluating adding native support for this plugin to Material for MkDocs. The main challenge is the increase in build times – they are pretty significant. For instance, the build time for Material for MkDocs went up from 10s to 77s. Would it be possible to cache results?

The plugin would only need to cache the list of contributors per file + the information obtained for each contributor. If this data is normalized, fetching contributor information should only be necessary when there's a new contributor that hasn't been seen before. Other than that, the data is pretty much static – login, name, avatar URL. The name could change, but I'd suspect this is rarely the case. Also, for a page, the data would only need to be re-fetched when the commit SHA changed.

Happy to help.

ojacques commented 2 years ago

@squidfunk, I hate to hijack this issue, but performance is one big reason I had to fork this project to create my own (I really dislike forking, but had to). The fork adds cache, tries to get as much info as possible from local git repo and tries hard to reduce hits on GitHub's API. It also uses GitHub's GraphQL API for efficiency.

https://github.com/ojacques/mkdocs-git-committers-plugin-2

I have happily used it with mkdocs-material, although an old version. Happy to help. Also happy to get my changes in this upstream repo.

squidfunk commented 2 years ago

@ojacques thanks! I've test-driven your project, but unfortuntely it produced inconsistent results for me. For example, my account (@squidfunk) was not correctly determined (login was empty). Maybe you can try it on the Material for MkDocs repository and track it down. I'm currently working on a tighter GitHub integration for this plugin and the git-revision-date(-localized) plugin.

ojacques commented 2 years ago

Maybe you can try it on the Material for MkDocs repository and track it down. I'm currently working on a tighter GitHub integration for this plugin and the git-revision-date(-localized) plugin.

Yes, I'll have a look. If this could be part of material proper, it would help me a ton. The short video is awesome (although I loved having contributors on top and not at the bottom).

ojacques commented 2 years ago

@squidfunk - after testing against mkdocs-material, I pushed a new version (0.4.0), which:

It does not cache between builds though, but makes sure to only query info for a user once. To be precise, it will try first to get the info with the email then the name if not found, then the GitHub username if still not found.

Caching between builds is not there yet, but I can work on it by saving the dictionary file at the end of site generation and attempting to load it if it exists at the beginning.

ojacques commented 2 years ago

Caching between builds is not there yet, but I can work on it by saving the dictionary file at the end of site generation and attempting to load it if it exists at the beginning.

Done in 0.4.1. Although I'm not sure this is the best way/place to save this cache file. There is a risk for it to be committed if not added as part of .gitignore.

But hey, mkdocs-material build with authors is now 7 seconds for me, after the first run!

squidfunk commented 2 years ago

Awesome, thanks for investing the time! I'll check it out the next time I work on this enhancement. I just put files to cache into a folder called .cache when I write plugins, e.g. for the built-in privacy plugin.

Yes, I'll have a look. If this could be part of material proper, it would help me a ton. The short video is awesome (although I loved having contributors on top and not at the bottom).

You will be able to easily change the location by overriding the content partial which controls the content area of the document. You'd just need to move the section to the top.

squidfunk commented 2 years ago

I can confirm that the build time went waaaaay down, awesome job 🚀 I'll now consider it safe to recommend this plugin! The improved git plugin integration will be released in the coming weeks, I'll put a link to your plugin in our documentation.

One more thing: could we maybe make the cache folder configurable? I'd like to keep my .gitignore as short as possible, which is why I'd want to save authors.json to .cache. Could we maybe add a config option for that?

@byrnereese sorry for hi-jacking this issue, we should really have discussed this over at the fork.

Edit: Material for MkDocs' plugins will now write their caches into subfolders. Maybe this is also something to consider for the mkdocs-git-committers-plugin. We're currently using:

We recommend caching on our docs, which makes use of the .cache folder. To limit confusion, it might be a good idea to settle on that name, but it's certainly not necessary. If a setting would be available, we can recommend setting it to .cache/plugin/git-committers on our documentation 😊

byrnereese commented 2 years ago

Why don't you issue a PR so we can stay in sync. No need to fork if we are all aligned.

ojacques commented 2 years ago

@byrnereese Absolutely. I will issue a PR against this repo. It may be big / hard to review. I apologize in advance. Also, I got contributions lately against my fork improving this even further. @squidfunk - I will expose the cache location as config, defaulting to the one you suggested above.

squidfunk commented 2 years ago

@byrnereese your plugin looks a little unmaintained, as there are three open issues without any answers and there's not much commit activity. I guess this is also why the fork exists. If we could gravitate towards one solution, that would be awesome, but we'd need to be able to push out bug fixes fast.

I'd expect some increase on this or @ojacques' repository, depending on what we recommend using on our docs, so the moment this natively is supported by Material for MkDocs, there's also likely to be more exposure for your plugins, more users, and with that potentially more bugs and edge cases. The question is whether you're up for maintaining it.

However, consolidation of efforts would be something to strive for.

byrnereese commented 2 years ago

I can certainly help when I am able. Happy to make you maintainer on this repository if that would give you more access, control and autonomy.

This is embarrassing, but for some reason I don't always get notified via email of new issues or PRs. Never figured out why. I will look into that so that I can be more aware of issues that come in.

squidfunk commented 2 years ago

@byrnereese you can customize notification settings per repository here:

Bildschirmfoto 2022-06-23 um 07 56 48
squidfunk commented 2 years ago

@ojacques @byrnereese insiders-4.19.0 adds native support for the git-authors and git-committers plugins. We currently recommend using @ojacques' fork because it implements caching. Otherwise, build times would skyrocket. If in the future, you both decide to merge efforts, please consider PR'ing a change to our documentation 😊 Here's how it looks:

Bildschirmfoto 2022-06-24 um 15 33 13
ojacques commented 2 years ago

This is so cool! Getting contributors listed in pages has been a big incentive for readers to be turned into contributors.

I will look at the first issue reported and continue to submit changes back to upstream.

@byrnereese - once / if you agree, happy to be made a maintainer of this repo and continue working with you.

timvink commented 2 years ago

Awesome work @squidfunk and @ojacques , really cool stuff!

byrnereese commented 1 year ago

@ojacques I have not been a good partner in maintaining this project and would like to finally resolve the fact that we have outstanding features in a fork. I have made you a collaborator on this project, and can give you rights to the pypi project as well. Let me know how best to proceed - as I would love to fold your many improvements into the plugin.

ojacques commented 1 year ago

Thanks @byrnereese! Although fulfilling the same goal, the plugin has now a very different implementation, directly getting contributors from GitHub (although not through an API). Along the way, I am not sure I did not broke other usage of this plugin (for example, it will not work with anything other than GitHub). We will have to think how to best merge the two, without breaking others.

byrnereese commented 1 year ago

@ojacques I think that is a fine direction to be honest. My plugin similarly was tied to github - although it also leveraged plain git. But the linking to people's profiles was very much github-centric. Would it make sense to do a hard merge? I think the risk of breaking people is minor as the utility of the plugin is specifically for mkdocs. But I think it would be good to canonize one of these plugins as the official one.

timvink commented 1 year ago

While we're discussing the set of git-based mkdocs plugins, it might also be relevant to discuss 'plain git'.

I feel there's potential to build a standard, lower-level mkdocs-git-info-plugin that adds meta data to page.file objects that other plugins can use (if you add an on_startup() method the plugin state is preserved and accessible for other plugins). There is a surprising amount of edge cases in retrieving git info, as I've learned from maintaining mkdocs-git-revision-date-localized-plugin and mkdocs-git-authors-plugin. It could also take care of caching and such.

I've started an implementation at https://github.com/timvink/mkdocs-git-info-plugin that has some basic working code, but it definitely needs a lot more time and careful design before I can integrate into my existing plugins.

byrnereese commented 1 year ago

@timvink I am happy to lend a hand in getting the plugin fleshed out more. What specifically will this plugin do and make available to mkdocs developers?

Taking a step back, perhaps what all of us on this thread can do is come up with a logic structure for these plugins so that they are designed to work in a standalone manner or in concert with one another.