dotnet / docfx

Static site generator for .NET API documentation.
https://dotnet.github.io/docfx/
MIT License

[Feature Request] Add option to debounce keystrokes in API search #10043

Open craigfowler opened 1 week ago

craigfowler commented 1 week ago

Is your feature request related to a problem? Please describe.

I use DocFX to generate the docs for a very large solution with a huge API. I like the API search feature in the template but when typing a search string, due to the size of the searched code, this introduces high web browser CPU usage and freezes my browser UI for a few seconds. I appreciate that it's probably not possible to simply make the search quicker but I noticed that the search seems to begin the very moment I type the first character of my search string. That seems to be wasting compute resources performing a search which is going to be discarded. It also means I can't see the rest of the characters I'm typing into the search text box until the search has completed. Sometimes that means I have to use backspace and correct typos, triggering more wasteful search operations.

Describe the solution you'd like

I'd like a template option of type nullable integer, likely alongside _enableSearch. If null, the template behaves as it currently does: the client-side logic begins searching as soon as a character is received in the textbox. If a positive integer is specified, it is the number of milliseconds for which the input is debounced before a search begins. Obviously a negative integer here is nonsense; it should either be treated the same as null or raise an error, as appropriate to DocFx's conventions.

A default value of null seems sensible, to maintain the current behaviour, for small-to-medium sized solutions. In my solution, I would try an initial debounce-timer of around 350ms. That seems long enough that someone who is typing - and knows what they are typing - can likely type it all without triggering wasteful searches. It should be short enough though, that they aren't frustrated waiting for it.
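
To make the request concrete, here is a minimal sketch of the debounce behaviour described above. It is not taken from the docfx template: the names debounceSearch and SearchFn are hypothetical, and treating zero or negative values the same as null is just one of the two options mentioned (the other being to raise an error).

```typescript
// Hypothetical helper: neither this function nor a debounce option exists in docfx today.
type SearchFn = (query: string) => void;

function debounceSearch(search: SearchFn, delayMs: number | null): SearchFn {
  // null keeps the current behaviour: search on every keystroke.
  // Assumption: zero and negative values are treated the same as null.
  if (delayMs === null || delayMs <= 0) {
    return search;
  }

  let pending: ReturnType<typeof setTimeout> | undefined;

  return (query: string) => {
    // A new keystroke cancels the search scheduled by the previous one.
    if (pending !== undefined) {
      clearTimeout(pending);
    }
    // Only run the search once no further keystroke arrives within delayMs.
    pending = setTimeout(() => {
      pending = undefined;
      search(query);
    }, delayMs);
  };
}
```

With a delay of 350, typing "abc" quickly would call the wrapped function three times but run only one search, for "abc", roughly 350ms after the last keystroke. The search text box's input handler would call the wrapped function instead of calling the search directly.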

Describe alternatives you've considered

In truly large projects I imagine that the entire search functionality should be moved server-side with a completely custom impl that is outside the scope of DocFX. We don't have the capacity/motivation to do that.

I suppose I could work around the problem by copy-pasting my search term into the search text box, but that is not as convenient.

Additional context

For reference, this is the API search tool I'm referring to.

image

filzrev commented 1 week ago

Just as a point of reference, what is the size of the _site/index.json file? (Originally written as _site/docfx.json.)

I think the current lunr.js-based search backend in docfx has several issues.

  1. The search operation is slow when the site contains many documents. (It's noticeable when index.json is larger than 5MB.)
  2. The search index is created dynamically from index.json on first page load, which consumes CPU time.
  3. It requires additional configuration to support multi-language sites.
    Some languages are not supported by docfx because they require a Node.js environment.

I've previously tested switching the search engine from lunr.js to Pagefind, and it almost works without problems. (It requires some additional work, though, e.g. handling <mark> tags and supporting UI customization.)

Demo Site https://filzrev.github.io/docfx.samples.pagefind/

Custom template to change search backend to pagefind https://github.com/filzrev/docfx.samples.pagefind/tree/main/docs/templates/pagefind

The advantages of using Pagefind include.

craigfowler commented 1 week ago

Just as a point of reference, what is the size of the _site/docfx.json file?

Are you sure you meant _site/docfx.json and not something else? _site/manifest.json perhaps?

There is no docfx.json file in _site (and I checked, we're using v2.76 to build our docco, so not like we're on an outdated version). Our _site/manifest.json is around 4.5MB in size. We've got a little over 13,700 documents in _site/api where our C# type documentation builds.

I'll see if I can find some time to try out that pagefind-based search template. If it's "just generally superior" to Lunr without any downsides then perhaps DocFx could adopt it officially as a default.

craigfowler commented 1 week ago

That said, for at least one concern listed:

Search index is dynamically created from docfx.json on first page load timing. It consume CPU times.

Apparently this can already be solved in Lunr; it just needs to be configured.

filzrev commented 1 week ago

Sorry for the confusion. I originally intended to refer to the _site/index.json file. (I've edited the comment above.)

craigfowler commented 1 week ago

Hmm, we don't have an index.json in the root of _site either.

The only files which are generated into the root of _site which aren't whole HTML pages or obvious non-logic assets (like favicons) are:

I've had a look at some of the subfolders and there's no index.json (or any other JSON files at all) in any of those either. We're using pretty vanilla DocFx, although I think a while ago we switched to the bundled modern template in order to activate Mermaid diagram syntax. I think that might not have been the default template when we first installed DocFx into the project.

filzrev commented 1 week ago

I've tried the default and modern templates, and in both cases the index.json file is generated when running the docfx build command.

This file is generated by the ExtractSearchIndex post-processor (which is added automatically when _enableSearch: true is set). Without this index.json file, the lunr.js-based search does not work, as far as I know.

Could you try the following steps?

  1. Create a new docfx project with the docfx init --yes command.
  2. Run the docfx build command.
  3. Check whether the _site/index.json file is generated.
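
The last step can also be scripted. The following is a rough sketch for Node.js and is not part of docfx: the function names are made up for illustration, and the 5 MB threshold is only the rough figure mentioned earlier in this thread, not a documented limit.

```typescript
import { existsSync, statSync } from "fs";

// Rough figure from this thread: search slowness becomes noticeable above about 5 MB.
const NOTICEABLE_BYTES = 5 * 1024 * 1024;

// Formats a byte count as megabytes with one decimal place.
function formatMegabytes(bytes: number): string {
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
}

// Returns a one-line report for the search index, or a hint if the file is missing.
function reportSearchIndex(path: string): string {
  if (!existsSync(path)) {
    return `${path} not found: check that _enableSearch is true and that the build ran`;
  }
  const bytes = statSync(path).size;
  const verdict =
    bytes > NOTICEABLE_BYTES
      ? "large enough that search may feel slow"
      : "below the size where slowness was reported";
  return `${path}: ${formatMegabytes(bytes)}, ${verdict}`;
}

// Run from the docfx project folder after the build has finished.
console.log(reportSearchIndex("_site/index.json"));
```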

craigfowler commented 6 days ago

Interesting, on a blank project it is generated but on our main solution's docco project it is not.

I compared the docfx.json files from our solution and a freshly created project and there are some structural differences, including in areas that we have never edited. We would have generated an empty config file way back when we started using docfx and never did anything with those areas. For example our main solution has the following sections explicitly declared with empty arrays. A freshly-generated docfx config doesn't include these sections at all.

I suspect that this is because our config file was generated from an older version of docfx. I guess the default behaviour changed over time, we upgraded versions and left our config file unchanged. If there was a docco/release note saying "Please review/regenerate your config because things changed" then we didn't spot it.

Anyway, I'll review that docfx.json file today and try to make it (structurally) look a little more like a current fresh one. Then I'll put a docco build through CI and see what it comes up with.

craigfowler commented 6 days ago

@filzrev I have updated our config, thanks for leading me to discover that it was outdated/malformed. Our docco site has done a full CI and re-published internally. An index.json has appeared in the root of _site and it is 32.5 MB in size.

filzrev commented 5 days ago

Thanks for your confirmation.

I also tried to reproduce the problem in a local environment, using an index.json file of about 64.8 MB.

And the following results were obtained.