sphinx-contrib / sphinx-pretty-searchresults

Sphinx: pretty search results is an extension for the Sphinx documentation tool. To display search results, Sphinx fetches the source files of search hits and renders excerpts in raw markup. This extension removes the markup from these source files at build time, so the search results look decent.
https://pypi.python.org/pypi/sphinxprettysearchresults
MIT License

Search tries to show .rst.txt files instead of .txt #10

Closed (mherbold closed this issue 7 years ago)

mherbold commented 7 years ago

I installed your stuff (pip install sphinxprettysearchresults) and added the extension to my conf.py file.

Now the search feature is trying to display files with the .rst.txt (instead of just .txt) extension, and ends up displaying 404 error messages for each search result. The search does find the matching pages, it's just not displaying the results because it's trying to use the wrong file extension.

Help.

TimKam commented 7 years ago

Which Sphinx version are you using? You can check this with pip show sphinx. Check whether updating to the latest Sphinx version helps. I suspect this is a Sphinx issue and not an issue with the extension, but I can investigate once I know which version you are using.

mherbold commented 7 years ago

Name: Sphinx
Version: 1.5.5
Summary: Python documentation generator
Home-page: http://sphinx-doc.org/
Author: Georg Brandl
Author-email: georg@python.org
License: BSD
Location: /home/usked/python27/lib/python2.7/site-packages
Requires: docutils, alabaster, snowballstemmer, Pygments, six, imagesize, Jinja2, babel, requests

mherbold commented 7 years ago

If you want to see what it's doing, take a look at:

https://alpha-dev.usked.com/manual/internal

And search for "available" - you'll see it finds a bunch of matches but is unable to display the prettified txt files because they all have the .txt extension in the _sources folder and not an .rst.txt extension.

TimKam commented 7 years ago

Thanks for reporting the details. I'll take a look.

TimKam commented 7 years ago

There is an issue with recent versions of the default Sphinx search scripts. Quick fix: load an additional JavaScript file after you load searchtools.js and add the following content to override the default terms search:

Search.performTermsSearch = function(searchterms, excluded, terms, titleterms) {
    var docnames = this._index.docnames;
    var filenames = this._index.filenames;
    var titles = this._index.titles;

    var i, j, file;
    var fileMap = {};
    var scoreMap = {};
    var results = [];

    // perform the search on the required terms
    for (i = 0; i < searchterms.length; i++) {
      var word = searchterms[i];
      var files = [];
      var _o = [
        {files: terms[word], score: Scorer.term},
        {files: titleterms[word], score: Scorer.title}
      ];

      // no match but word was a required one
      if ($u.every(_o, function(o){return o.files === undefined;})) {
        break;
      }
      // found search word in contents
      $u.each(_o, function(o) {
        var _files = o.files;
        if (_files === undefined)
          return;

        if (_files.length === undefined)
          _files = [_files];
        files = files.concat(_files);

        // set score for the word in each file to the score of the matched source (term or title)
        for (j = 0; j < _files.length; j++) {
          file = _files[j];
          if (!(file in scoreMap))
            scoreMap[file] = {};
          scoreMap[file][word] = o.score;
        }
      });

      // create the mapping
      for (j = 0; j < files.length; j++) {
        file = files[j];
        if (file in fileMap)
          fileMap[file].push(word);
        else
          fileMap[file] = [word];
      }
    }

    // now check if the files don't contain excluded terms
    for (file in fileMap) {
      var valid = true;

      // check if all requirements are matched
      if (fileMap[file].length != searchterms.length)
        continue;

      // ensure that none of the excluded terms is in the search result
      for (i = 0; i < excluded.length; i++) {
        if (terms[excluded[i]] == file ||
            titleterms[excluded[i]] == file ||
            $u.contains(terms[excluded[i]] || [], file) ||
            $u.contains(titleterms[excluded[i]] || [], file)) {
          valid = false;
          break;
        }
      }

      // if we have still a valid result we can add it to the result list
      if (valid) {
        // select one (max) score for the file.
        // for better ranking, we should calculate ranking by using words statistics like basic tf-idf...
        var score = $u.max($u.map(fileMap[file], function(w){return scoreMap[file][w];}));
        // the last element is the docname as well (the `filenames` entry is not used here)
        results.push([docnames[file], titles[file], '', null, score, docnames[file]]);
      }
    }
    return results;
};

I'll look into a more elegant solution, hopefully within a few days.
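
As a shorter variant of the override above, the stock function could be wrapped rather than copied in full. This is only a minimal sketch, not code from the extension or from Sphinx: `patchTermsSearch` is a hypothetical name, and the sketch assumes that every result is an array whose first element is the docname and whose sixth element is the name used to fetch the source file, as in the `results.push` call above.

```javascript
// Hypothetical helper: wraps the existing terms search instead of copying it.
// Assumption: each result is an array with the docname at index 0 and the
// name used for the source lookup at index 5, as in the override above.
function patchTermsSearch(search) {
  var original = search.performTermsSearch;
  search.performTermsSearch = function () {
    var results = original.apply(this, arguments);
    return results.map(function (result) {
      var patched = result.slice(); // copy so the original array is untouched
      patched[5] = patched[0];      // use the docname for the source lookup
      return patched;
    });
  };
}

// In the browser, run this after searchtools.js has defined `Search`.
if (typeof Search !== 'undefined') {
  patchTermsSearch(Search);
}
```

If the result layout differs in another Sphinx version, this wrapper would need adjusting, so it is worth checking against the searchtools.js that your build actually ships.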

mherbold commented 7 years ago

OK - I'll try that... but wouldn't it be easier to just rename all .txt files to .rst.txt? I haven't tried that but I assume that would work.

TimKam commented 7 years ago

Yes, that is going to work. But it will break docs that rely on the current behavior. I'd like to take some time to come up with the best possible solution that is backwards compatible. Won't have time before the weekend.

mherbold commented 7 years ago

OK - my temporary fix is to just do a "find . -name '*.txt' -exec rename .txt .rst.txt {} +" from inside the generated _sources folder for now - it seems to work. Thanks for looking into this!
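
For anyone who prefers to script this workaround, here is a rough Node.js sketch of the same rename. It skips files that already end in .rst.txt, so running it a second time changes nothing. The function names are made up for illustration, and the ".rst" suffix is an assumption based on this thread; other source suffixes would need a different mapping.

```javascript
// Map "page.txt" to "page.rst.txt"; leave every other name unchanged.
// Assumption: the source suffix is ".rst", as in this thread.
function renamedSourceName(name) {
  if (!/\.txt$/.test(name) || /\.rst\.txt$/.test(name)) {
    return name;
  }
  return name.replace(/\.txt$/, '.rst.txt');
}

// Walk `dir` recursively and rename matching files in place.
// `fs` and `path` are Node's built-in modules, passed in by the caller.
function renameSources(fs, path, dir) {
  fs.readdirSync(dir).forEach(function (entry) {
    var full = path.join(dir, entry);
    if (fs.statSync(full).isDirectory()) {
      renameSources(fs, path, full);
      return;
    }
    var target = renamedSourceName(entry);
    if (target !== entry) {
      fs.renameSync(full, path.join(dir, target));
    }
  });
}
```

A call would look like `renameSources(require('fs'), require('path'), '_sources')`, run from the build output folder after each Sphinx build, since a rebuild regenerates the _sources folder.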

TimKam commented 7 years ago

Here's the link to the PR that introduces the breaking change: https://github.com/sphinx-doc/sphinx/pull/2454. It was released with version 1.5. The solution will probably be to adjust the file name/suffix based on the Sphinx version:

  1. If you use Sphinx >= 1.5, the file name will be filename + source_file_ending + ".txt".
  2. If you use Sphinx < 1.5, the file name will be filename + ".txt".

Also, I will add a config option that allows forcing a fallback to 2). This will help folks like me, who have used this extension for a long time and whose templates evolved accordingly ;-)
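
The two rules and the fallback option can be sketched as a small function. This is an illustration of the logic only, written in JavaScript to match the snippet earlier in the thread; the extension itself runs at build time and may implement this differently. `sourceFileName` and `forceLegacy` are hypothetical names, since the thread does not name the config option.

```javascript
// Hypothetical helper mirroring the two rules above.
// `sphinxVersion` is a version string such as "1.5.5".
// `forceLegacy` stands in for the proposed fallback option (name assumed).
function sourceFileName(docname, sourceSuffix, sphinxVersion, forceLegacy) {
  var parts = sphinxVersion.split('.').map(function (p) {
    return parseInt(p, 10) || 0;
  });
  var major = parts[0];
  var minor = parts.length > 1 ? parts[1] : 0;
  var usesSuffixedNames = major > 1 || (major === 1 && minor >= 5);

  if (usesSuffixedNames && !forceLegacy) {
    // Rule 1: Sphinx >= 1.5 keeps the source suffix before ".txt".
    return docname + sourceSuffix + '.txt';
  }
  // Rule 2, or the forced fallback: docname plus ".txt" only.
  return docname + '.txt';
}
```

With the version reported in this issue, `sourceFileName('index', '.rst', '1.5.5', false)` gives `index.rst.txt`, and passing `true` as the last argument gives `index.txt`.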

TimKam commented 7 years ago

I published a new release that fixes this issue.