thuliteio / doks

Everything you need to build a stellar documentation website. Fast, accessible, and easy to use.
https://getdoks.org
MIT License

index sections (`h1`, `h2`, `h3`) inside pages instead of whole content #1026

Closed adhadse closed 1 year ago

adhadse commented 1 year ago

Summary

@h-enk Following up on #801: indexing the whole page content might be too long and make the JS heavy. Instead, along with each page's title, we can iterate through the page's individual headings and add them to the FlexSearch index, using the permalink of the parent page followed by `#` and the section id.

Basic example

Index /notes/rust/smart-pointers/deref-trait/ as usual. Then, for the page at this location, iterate through its individual sections and index each one under a URL like /notes/rust/smart-pointers/deref-trait/#deref-coercion-and-mutability.

This can be done by iterating over the h1, h2 and h3 elements in the page, taking each one's id attribute to append to the permalink and its inner text as the title for the index entry:

<h2 id="implicit-deref-coercions">
Implicit Deref Coercions 
<a href="#implicit-deref-coercions" class="anchor" aria-hidden="true">#</a>
</h2>

Grabbing them is possible via regex:
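As a rough sketch of that idea in plain JavaScript (not Hugo template code): the function name `extractSections` and the exact regular expressions are assumptions for illustration, and they only handle headings that carry an `id` attribute and an anchor link like the example above.

```javascript
// Hypothetical helper: collect h1-h3 headings that carry an id attribute
// and turn each one into { href, title } relative to the parent page.
function extractSections(html, pagePermalink) {
  // Group 1: heading level, group 2: id value, group 3: inner HTML of the heading.
  const headingRe = /<h([1-3])\b[^>]*?\sid="([^"]+)"[^>]*>([\s\S]*?)<\/h\1>/g;
  const sections = [];
  for (const match of html.matchAll(headingRe)) {
    const id = match[2];
    const title = match[3]
      .replace(/<a\b[^>]*>[\s\S]*?<\/a>/g, "") // drop the "#" anchor link
      .replace(/<[^>]+>/g, "")                 // drop any remaining tags
      .replace(/\s+/g, " ")                    // collapse whitespace
      .trim();
    sections.push({ href: `${pagePermalink}#${id}`, title });
  }
  return sections;
}

const sample = `<h2 id="implicit-deref-coercions">
Implicit Deref Coercions 
<a href="#implicit-deref-coercions" class="anchor" aria-hidden="true">#</a>
</h2>`;

console.log(extractSections(sample, "/notes/rust/smart-pointers/deref-trait/"));
```

Regex-based extraction like this is fragile against unusual markup, so it is only meant to show the shape of the data that would go into the index.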

Motivation

#801

If this idea is good, we can also make it work with #801 without making the description too long.

Even with a huge website of 137 pages, this doesn't consume more than 1.9kB in my case.

adhadse commented 1 year ago

Update! With some experimentation:

Updating the index.js to something like this:

(function(){

  var index = new FlexSearch.Document({
    tokenize: "forward",
    cache: 100,
    document: {
      id: 'id',
      store: [
        "href", "title", "description"
      ],
      index: ["title", "description"]
    }
  });

  // Not yet supported: https://github.com/nextapps-de/flexsearch#complex-documents

  // https://discourse.gohugo.io/t/range-length-or-last-element/3803/2

  {{ $list := slice }}
  {{- if and (isset .Site.Params.options "searchsectionsindex") (not (eq (len .Site.Params.options.searchSectionsIndex) 0)) }}
  {{- if eq .Site.Params.options.searchSectionsIndex "ALL" }}
  {{- $list = .Site.Pages }}
  {{- else }}
  {{- $list = (where .Site.Pages "Type" "in" .Site.Params.options.searchSectionsIndex) }}
  {{- if (in .Site.Params.options.searchSectionsIndex "HomePage") }}
  {{ $list = $list | append .Site.Home }}
  {{- end }}
  {{- end }}
  {{- else }}
  {{- $list = (where .Site.Pages "Section" "docs") }}
  {{- end }}

  {{ $len := (len $list) -}}
  {{ $last_index := newScratch -}}
  {{ $last_index.Set "value" $len -}}

  {{ range $index, $element := $list -}}
    {{ $RelPermalink := .RelPermalink -}}
    {{ $title := .Title | jsonify -}}
    index.add({
      id: {{ $index }},
      href: "{{ $RelPermalink }}",
      title: {{ $title }},
      {{ with .Description -}}
        description: {{ . | truncate 35 | jsonify }},
      {{ else -}}
        description: {{ .Summary | plainify | truncate 35 | jsonify }},
      {{ end -}}
    });

    {{ $sections := findRE `(?s)<h[1-3].*?>.*?</h[1-3]>` .Content }}

    {{ range $idx, $section := $sections -}}
      // Get section id from raw content; doesn't work if headings are closed with mark,
      // since they'll be converted by Hugo into a different id from what is raw. In that case,
      // use something other than `.Content`.
      {{ $section_id := replaceRE `\"` "" (index (findRE `\"#([\S]+)\"` $section) 0) }}
      {{ $section_title := replaceRE `>|.<.` "" (index (findRE `>[\S\s]+<a` $section) 0) }}
        index.add({
            id: {{ add ($last_index.Get "value") $idx }},
            href: "{{ $RelPermalink }}{{ $section_id }}",
            title: {{ $title }},
            description: {{ $section_title | htmlUnescape | plainify | jsonify }},
          });
    {{ end -}}
    {{ $last_index.Add "value" (len $sections) -}}
  {{ end -}}

  search.addEventListener('input', show_results, true);
  // From the line above onward, everything is the same as before...
}());
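To make the id bookkeeping in the template easier to follow, here is a small sketch of the same scheme in plain JavaScript. The function `buildEntries` and the input shape (`href`, `title`, `sections`) are assumptions for illustration only. Pages get ids `0..n-1`, and section ids start at `n` and keep counting across pages, which is what `$last_index` does above.

```javascript
// Hypothetical input shape: [{ href, title, sections: [{ id, title }] }]
function buildEntries(pages) {
  const entries = [];
  // Section ids start right after the last page id, like $last_index in the template.
  let nextSectionId = pages.length;
  pages.forEach((page, pageIndex) => {
    entries.push({ id: pageIndex, href: page.href, title: page.title });
    page.sections.forEach((section, sectionIndex) => {
      entries.push({
        id: nextSectionId + sectionIndex,
        href: `${page.href}#${section.id}`,
        title: page.title,          // the page title stays the entry title
        description: section.title, // the heading text goes into the description
      });
    });
    // Move the offset past this page's sections so ids never collide.
    nextSectionId += page.sections.length;
  });
  return entries;
}

const entries = buildEntries([
  { href: "/docs/a/", title: "A", sections: [{ id: "one", title: "One" }, { id: "two", title: "Two" }] },
  { href: "/docs/b/", title: "B", sections: [{ id: "three", title: "Three" }] },
]);
console.log(entries.map((entry) => `${entry.id} ${entry.href}`));
```

Each resulting entry has the same fields the template passes to `index.add(...)`, so the page and its sections show up as separate search results with distinct ids.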

The first suggestion's description was very long, so it is truncated to 35 characters.