manuzhang / mkdocs-htmlproofer-plugin

A MkDocs plugin that validates URLs in rendered HTML files
MIT License
43 stars, 16 forks

Request: cache validated URLs #9

Open rhagenson opened 3 years ago

rhagenson commented 3 years ago

It was determined during a PR here that, in its current state, mkdocs-htmlproofer re-validates URLs it has already seen.

To keep an upper bound on memory consumption, perhaps a new option such as cache-size: 500 could be added, which would maintain a FIFO queue of the 500 most recently validated URLs. Each candidate URL would then be checked for membership in the queue and, if present, the previous validation result would be reused. My original idea a month ago was to cache only valid URLs, but I now think caching invalid URLs as well may be a slight improvement, although there is a possible rationale for re-validating on certain error codes such as 429, as we saw in the aforementioned PR.
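A minimal sketch of the bounded FIFO cache proposed above, written in Python. The class name `FifoUrlCache`, the `validate` callback, and the `RETRY_CODES` set are hypothetical and are not part of the plugin; the only details taken from this issue are the size of 500 and the idea of not caching 429 responses.

```python
from collections import OrderedDict
from typing import Callable

# Status codes that are never cached, so the URL is validated again next time.
# Assumption: only 429 is listed, because it is the one code named in this issue.
RETRY_CODES = frozenset({429})


class FifoUrlCache:
    """Bounded FIFO cache mapping a URL to its last validation status code."""

    def __init__(self, max_size: int = 500) -> None:
        self.max_size = max_size
        self._results: "OrderedDict[str, int]" = OrderedDict()

    def check(self, url: str, validate: Callable[[str], int]) -> int:
        """Return the cached status for url, validating it only on a miss."""
        if url in self._results:
            # FIFO, not LRU: a hit does not move the entry to the back.
            return self._results[url]
        status = validate(url)
        if status not in RETRY_CODES:
            self._results[url] = status
            if len(self._results) > self.max_size:
                # Evict the oldest inserted URL to respect the size bound.
                self._results.popitem(last=False)
        return status
```

Both valid and invalid results are stored here, so a broken link that appears on many pages is only requested once, while a rate-limited response is retried on its next occurrence.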

@manuzhang Since you requested I open this issue, is there any additional information or question you have for me? (I apologize that I took a month to open this issue following your request.)

manuzhang commented 3 years ago

Thanks for opening this request, and nice suggestion!

manuzhang commented 3 years ago

@rhagenson Please check out https://github.com/manuzhang/mkdocs-htmlproofer-plugin/pull/11

johnthagen commented 3 years ago

@manuzhang Should this issue be closed since #11 has been merged?

manuzhang commented 3 years ago

I am leaving it open because the cache size is currently hardcoded.

ssbarnea commented 1 year ago

We need caching between runs, and maybe even a git-tracked cache file, as the current implementation slows down execution by more than an order of magnitude. An LRU cache does not help here.
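One way the "caching between runs" idea could look is a small JSON file that stores each successful result with a timestamp. This is only a sketch under stated assumptions: the class `PersistentUrlCache`, the file layout, the one-day default expiry, and the choice to cache only status 200 are all hypothetical and do not describe the plugin's behavior.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict


class PersistentUrlCache:
    """Cache of successful URL checks that is saved to a JSON file between runs."""

    def __init__(
        self,
        path: str,
        ttl_seconds: float = 86400.0,  # hypothetical default: one day
        now: Callable[[], float] = time.time,
    ) -> None:
        self.path = Path(path)
        self.ttl = ttl_seconds
        self.now = now
        self.entries: Dict[str, dict] = {}
        if self.path.exists():
            try:
                loaded = json.loads(self.path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                loaded = {}  # unreadable or corrupt file: start empty
            if isinstance(loaded, dict):
                self.entries = loaded

    def check(self, url: str, validate: Callable[[str], int]) -> int:
        """Reuse a fresh cached success, otherwise validate the URL again."""
        entry = self.entries.get(url)
        if entry is not None and self.now() - entry["checked_at"] < self.ttl:
            return entry["status"]
        status = validate(url)
        if status == 200:
            # Only successes are stored, so broken links are rechecked every run.
            self.entries[url] = {"status": status, "checked_at": self.now()}
        return status

    def save(self) -> None:
        """Write the cache with sorted keys so a git-tracked file diffs cleanly."""
        text = json.dumps(self.entries, indent=2, sort_keys=True)
        self.path.write_text(text, encoding="utf-8")
```

The expiry keeps a committed cache file from hiding links that break later; whether such a file should be tracked in git at all is a separate trade-off that this sketch does not settle.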

manuzhang commented 1 year ago

> We need caching between runs and maybe even a git tracked cache file

@ssbarnea Can you share your usage and elaborate on your idea?

mschoettle commented 2 weeks ago

If the plugin is enabled locally, the URLs are revalidated each time mkdocs serve reloads, which slows down that use case. It might not be a main use case, though (I am assuming that most users enable it in CI only).

johnthagen commented 2 weeks ago

@mschoettle Yeah, the way I use it is a special "check urls" task that is separate from mkdocs serve

manuzhang commented 2 weeks ago

> each time mkdocs serve reloads, the URLs are revalidated.

Maybe we can add an option not to revalidate URLs on reload.
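A rough sketch of what such an option could mean, assuming the validator object stays alive across reloads within one mkdocs serve process (this has not been verified against MkDocs). The names `ReloadAwareValidator`, `revalidate_on_reload`, and `start_build` are hypothetical and are not existing plugin options or hooks.

```python
from typing import Callable, Dict


class ReloadAwareValidator:
    """Remembers validation results and optionally keeps them across rebuilds."""

    def __init__(
        self,
        validate: Callable[[str], int],
        revalidate_on_reload: bool = True,  # hypothetical option name
    ) -> None:
        self._validate = validate
        self.revalidate_on_reload = revalidate_on_reload
        self._seen: Dict[str, int] = {}

    def start_build(self) -> None:
        """Call at the start of every build, including each reload."""
        if self.revalidate_on_reload:
            # Forget earlier results so every URL is checked again.
            self._seen.clear()

    def check(self, url: str) -> int:
        """Validate url at most once until the remembered results are cleared."""
        if url not in self._seen:
            self._seen[url] = self._validate(url)
        return self._seen[url]
```

With `revalidate_on_reload=False`, a link that breaks during a long serve session would go unnoticed until the process restarts, which is probably acceptable for local previews when CI still runs the full check.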