TeX-Live / texdoc

Find and view documentation in TeX Live
https://tug.org/texdoc/
GNU General Public License v3.0

New config item to control behavior when there are only negative matches #98

Open wtsnjp opened 1 year ago

wtsnjp commented 1 year ago

The following values can be given to this config item:

negative_match = ask | showall | online

The name of the config item can be changed if we come up with a better one.

I would like to set the default value to ask so that it behaves the same as v4.0.

This config item should have an effect in mixed mode as well as view mode.

Originally posted by @kberry in https://github.com/TeX-Live/texdoc/issues/96#issuecomment-1444664153

wtsnjp commented 1 year ago

Another possible value is `view`, which shows the top result anyway, even if there are only negative matches. This is identical to the behavior in v3.x.

wtsnjp commented 1 year ago

When the value is other than ask, it would be better to display a warning as in the case of going to showall in the current list mode.

$ texdoc -l topic
texdoc warning: No good result found, showing all results.
 1 /usr/local/texlive/2023/texmf-dist/doc/latex/bibtopic/bibtopic.pdf
   = Package documentation
...
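
As a rough illustration of the proposal so far, the dispatch on `negative_match` might look like the Python sketch below. This is not Texdoc's implementation (Texdoc itself is written in Lua): the function `handle_negative_matches`, the action names it returns, and the shortened paths are all hypothetical. Only the `showall` warning text is taken from the current list mode; the other two warning wordings are made up for the example.

```python
# Hypothetical sketch: names and return values are illustrative, not Texdoc's API.
VALID_VALUES = ("ask", "showall", "online", "view")

# Only the "showall" wording exists in current list mode; the others are invented.
WARNING_TAIL = {
    "showall": "showing all results.",
    "online": "searching online.",
    "view": "showing the top result anyway.",
}


def handle_negative_matches(results, negative_match="ask"):
    """Decide what to do when every result has a negative score.

    results: list of (path, score) tuples, best score first.
    Returns (action, payload, warning); warning is None for "ask".
    """
    if negative_match not in VALID_VALUES:
        raise ValueError(f"unknown negative_match value: {negative_match!r}")
    if negative_match == "ask":
        # Proposed default: prompt the user, so no extra warning is needed.
        return ("ask", results, None)
    warning = "texdoc warning: No good result found, " + WARNING_TAIL[negative_match]
    if negative_match == "showall":
        return ("list", results, warning)
    if negative_match == "online":
        return ("online", None, warning)
    # "view": open the top-ranked document even though its score is negative.
    return ("view", results[0], warning)


results = [("texbytopic/TeXbyTopic.pdf", -8.5), ("texbytopic/README", -9.8)]
for value in VALID_VALUES:
    action, payload, warning = handle_negative_matches(results, value)
    print(value, "->", action, "|", warning)
```

Returning the warning alongside the action keeps the "warn unless `ask`" rule in one place, regardless of whether the caller is in view mode or mixed mode.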

kberry commented 1 year ago

> When the value is other than `ask`, it would be better to display a
> warning as in the case of going to showall in the current list mode.
> ...
> texdoc warning: No good result found, showing all results.

I agree, FWIW.

> negative_match = ask | showall | online

Another option could be "displayfirst" (or some such name), to restore the original behavior of showing the first match, as in previous releases.

This was the request (by someone else) which prompted me to start writing ... -k

gucci-on-fleek commented 1 year ago

I'm starting to think that this (and #94) is maybe more of an issue with the scoring than with how the results are displayed.

Example:

$ texdoc -lM qwert # unstable results here
qwert   7.0 /usr/local/texlive/2022/texmf-dist/doc/fonts/bera/bera.pdf      Font samples
qwert   1.8 /usr/local/texlive/2022/texmf-dist/doc/fonts/bera/README        Readme
qwert   1.6 /usr/local/texlive/2022/texmf-dist/doc/fonts/bera/bera.txt      

$ texdoc -lM texbytopi
texdoc warning: No good result found, showing all results.
texbytopi   -8.5    /usr/local/texlive/2022/texmf-dist/doc/plain/texbytopic/TeXbyTopic.pdf      The book itself
texbytopi   -9.8    /usr/local/texlive/2022/texmf-dist/doc/plain/texbytopic/README      Readme

$ texdoc -lM zezbytopiz
zezbytopiz  7.0 /usr/local/texlive/2022/texmf-dist/doc/plain/texbytopic/TeXbyTopic.pdf      The book itself
zezbytopiz  1.8 /usr/local/texlive/2022/texmf-dist/doc/plain/texbytopic/README      Readme

With the first case, no one will really care what message we show since there aren't really any sensible results for the search. What would be bad here would be to just open a random document with no notice, but that is what is happening here.

For the second case, texdoc "should" be able to open the correct document automatically. Showing any message here would be fairly annoying, but that is what is happening here.

The third case shows texdoc being helpful and correcting my terrible spelling. I think it's a toss-up whether an "ideal" texdoc should ask first or just open the first matching document. Also interesting is how the scores here compare to the scores in the second case.

(Also, this is not a dig at texdoc's scoring. I find that texdoc works really well in practice; these are just some surprising edge cases)

I'm thinking that if fuzzy/Levenshtein searching is penalized more and the fuzzy search algorithm also considers adding letters (maybe it does already but it's just penalized too much?), we might not need to add any config options.

(#96 seems like a good idea to me regardless)
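
To make the edit-distance idea concrete, here is a minimal Python sketch of a Levenshtein distance that counts insertions, deletions and substitutions, plus a hypothetical penalty on top of a base score. The function `penalized_score` and its `penalty_per_edit` parameter are assumptions for illustration only, and this is not how Texdoc currently computes its scores.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca from a
                            curr[j - 1] + 1,      # insert cb into a
                            prev[j - 1] + cost))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def penalized_score(base_score, query, name, penalty_per_edit=1.0):
    """Hypothetical scoring: subtract a fixed penalty for every edit."""
    return base_score - penalty_per_edit * levenshtein(query, name)


# The two misspellings from the examples above, compared with "texbytopic".
print(levenshtein("texbytopi", "texbytopic"))    # one missing letter
print(levenshtein("zezbytopiz", "texbytopic"))   # three wrong letters
print(penalized_score(7.0, "texbytopi", "texbytopic"))
print(penalized_score(7.0, "zezbytopiz", "texbytopic"))
```

Under this kind of penalty, the query that is only one letter short would rank above the one with three wrong letters, which is the opposite of what the second and third cases show today.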

wtsnjp commented 1 year ago

I think the second case (texdoc -lM texbytopi) is the one that really needs improvement. Could you please create a separate issue?

The fuzzy search I introduced in Texdoc v3.0 is incomplete. It is only executed when Texdoc fails to find anything in the standard search; in the second case, the fuzzy search is not triggered because the standard search did find something, even though all of the matches were negative. This is why the results are worse than in the third case (texdoc -lM zezbytopiz).

The fuzzy search should be triggered even when only bad results are found. Going further, we could always run the fuzzy search and, as you say, apply an appropriate penalty based on edit distance. However, the latter would require more drastic changes than the former.
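
The difference between the current trigger and the first proposed change can be sketched in Python as follows. Everything here is hypothetical: `search`, the `fuzzy_on_negative` flag, the stub search functions, the way results are merged, and the assumption that a score of 0 or below counts as negative. The fuzzy scores in the stub are borrowed from the third case purely as placeholders.

```python
def search(query, standard_search, fuzzy_search, fuzzy_on_negative=False):
    """Both search arguments are callables returning [(path, score), ...]."""
    results = standard_search(query)
    if not results:
        # Behavior described above: fuzzy search runs only when nothing was found.
        return fuzzy_search(query)
    only_negative = all(score <= 0 for _, score in results)
    if fuzzy_on_negative and only_negative:
        # Proposed change: also run fuzzy search and keep the best score per path.
        merged = dict(results)
        for path, score in fuzzy_search(query):
            merged[path] = max(score, merged.get(path, score))
        return sorted(merged.items(), key=lambda item: item[1], reverse=True)
    return results


def standard(query):
    # Stub standing in for the standard search in the "texbytopi" case.
    return [("TeXbyTopic.pdf", -8.5), ("README", -9.8)]


def fuzzy(query):
    # Stub with placeholder scores; real fuzzy scores may differ.
    return [("TeXbyTopic.pdf", 7.0), ("README", 1.8)]


print(search("texbytopi", standard, fuzzy))
print(search("texbytopi", standard, fuzzy, fuzzy_on_negative=True))
```

With the flag off, the negative standard results are returned unchanged; with it on, the fuzzy results can lift the intended document back to a positive score.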

In any case, I don't think this will eliminate the cases where only negative matches are found, no matter how carefully the score is calculated. For such cases, I think adding this config item is worthwhile, as there are those who do not want to see any online search options and those who want to do online searches. Improving the score calculation and improving the way negative matches are displayed should be done independently of each other.