NOAA-RDHPCS / noaa-rdhpcs.github.io

NOAA RDHPCS documentation
https://docs.rdhpcs.noaa.gov/
Creative Commons Zero v1.0 Universal

Better Search #84

Open underwoo opened 3 weeks ago

underwoo commented 3 weeks ago

The built-in Sphinx search may be too basic for our needs. It works only with very simple search terms (e.g., "batch" rather than "submitting a job"). We may be able to leverage Google or some other tool to improve the search indexing. The Sphinx documentation describes how to add Google search to a site; some of it is repeated here to help with the conversation.

Google Search

To replace Sphinx’s built-in search function with Google Search, proceed as follows:

  1. Go to https://cse.google.com/cse/all to create the Google Search code snippet.

  2. Copy the code snippet and paste it into _templates/searchbox.html in your Sphinx project:

<div>
   <h3>{{ _('Quick search') }}</h3>
   <script>
      (function() {
         var cx = '......';
         var gcse = document.createElement('script');
         gcse.async = true;
         gcse.src = 'https://cse.google.com/cse.js?cx=' + cx;
         var s = document.getElementsByTagName('script')[0];
         s.parentNode.insertBefore(gcse, s);
      })();
   </script>
  <gcse:search></gcse:search>
</div>
  3. Add searchbox.html to the html_sidebars configuration value.
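
For the last step, a minimal conf.py sketch, assuming the template overrides live in _templates and that the theme in use honors html_sidebars. The "**" pattern and the one-item sidebar list are illustrative values, not taken from our current configuration:

```python
# conf.py (fragment): illustrative values, not our current configuration

# Directory where Sphinx looks for template overrides such as searchbox.html.
templates_path = ["_templates"]

# Show the custom search box in the sidebar of every page ("**" matches all
# document names). Other sidebar templates can be listed alongside it.
html_sidebars = {
    "**": ["searchbox.html"],
}
```

Some themes do not consult html_sidebars and include searchbox.html directly from their layout, in which case the template override alone may be enough.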

underwoo commented 3 weeks ago

I played around a bit. We will need some theme overrides, but I think I have something we can use that keeps a cleaner look than the snippet above.

What I include here is a hack, but for now it works:

  1. Reduce what is in the _templates/searchbox.html file to keep the current search box (I think it looks better than the Google search box):
    <div role="search">
    <form id="rtd-search-form" class="wy-form" action="/search.html" method="get">
      <input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
    </form>
    </div>
  2. Modify the built search.html file as follows:
    +     <script async src="https://cse.google.com/cse.js?cx=_cx_number_"></script>
    </head>
    -   <div id="search-results">
    +   <div class="gcse-searchresults-only">

    where _cx_number_ is the Google Programmable Search ID. (NOAA has this disabled for users, but we should ask if we can get one for this site. I used my personal Google account for testing.)
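
Since the change in step 2 is made to the built file, it would have to be reapplied after every build. A rough sketch of a helper that does this, assuming the built page contains the exact strings shown in the diff above; the function names and the `_cx_number_` default are placeholders, and this is not necessarily how a final theme override would do it:

```python
from pathlib import Path

# Placeholder only: the real Programmable Search ID is not given in this thread.
CX_PLACEHOLDER = "_cx_number_"


def patch_search_html(html: str, cx: str = CX_PLACEHOLDER) -> str:
    """Return html with the Google loader added and the results div swapped."""
    script = f'<script async src="https://cse.google.com/cse.js?cx={cx}"></script>'
    if script not in html:
        # Insert the loader just before the first closing </head> tag.
        html = html.replace("</head>", f"  {script}\n</head>", 1)
    # Swap the Sphinx results container for Google's results-only element.
    return html.replace(
        '<div id="search-results">', '<div class="gcse-searchresults-only">', 1
    )


def patch_file(path: Path, cx: str = CX_PLACEHOLDER) -> None:
    """Apply patch_search_html to a built file in place."""
    text = path.read_text(encoding="utf-8")
    path.write_text(patch_search_html(text, cx), encoding="utf-8")
```

It could be run after the build with something like `patch_file(Path("_build/html/search.html"), cx="...")`, where the output path is an assumption about the build layout. Running it twice leaves the file unchanged.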

It appears we can add custom search aliases (e.g., associate the term "batch" with "Slurm" in the search results). If I understand the option correctly, we can use this to associate system names with particular sites (e.g., "orion" and "hercules" can be associated with "MSU"). 🤞
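
To make the alias idea concrete, here is a small sketch of the mapping we might keep in the repo and review before entering anything into Google. It is not Google's configuration format; the table name and `expand_query` are hypothetical, and the terms are just the examples from this comment:

```python
# Hypothetical alias table; the terms are the examples from this thread.
SEARCH_ALIASES = {
    "batch": ["Slurm"],
    "orion": ["MSU"],
    "hercules": ["MSU"],
}


def expand_query(query: str) -> list[str]:
    """Return the query words plus any aliases, in order and without duplicates."""
    terms: list[str] = []
    for word in query.split():
        # Look up aliases case-insensitively, keeping the word as typed.
        for term in [word, *SEARCH_ALIASES.get(word.lower(), [])]:
            if term not in terms:
                terms.append(term)
    return terms
```

With this table, a search for "orion batch" would also carry the terms "MSU" and "Slurm".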

underwoo commented 2 weeks ago

Added a draft PR (#86) as an example of how to use Google as the search engine.