git / git-scm.com

The git-scm.com website. Note that this repository is only for the website; issues with git itself should go to https://git-scm.com/community.
https://git-scm.com/
MIT License

Searching "recurse-submodules" returns incomplete results #1374

Open phil-blain opened 5 years ago

phil-blain commented 5 years ago

I was searching for all git commands that understand the --recurse-submodules flag, so I searched for "recurse-submodules" on the web site.

I don't know why, but the search results include neither the man page for git read-tree nor the one for git switch, both of which support the flag:
https://git-scm.com/docs/git-read-tree#Documentation/git-read-tree.txt---no-recurse-submodules,
https://git-scm.com/docs/git-switch#Documentation/git-switch.txt---recurse-submodules

I think these are the only ones that search doesn't return, since the search page has 10 hits under docs/ and doing

git grep -l "recurse-submodules" -- Documentation/

in git.git returns 12 files.

So I guess something is wrong with the searching/indexing...
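The comparison described above (files found by git grep versus pages returned by the site search) can be sketched as a set difference. This is only an illustration: `page_name` and `missing_from_search` are hypothetical helpers, the sample inputs are made up rather than the actual results, and the mapping from a Documentation/ file to a /docs/ page is an assumption that will not hold for files that are only included by other pages.

```python
def page_name(path_or_url):
    """Reduce 'Documentation/git-switch.txt' or '.../docs/git-switch#frag' to 'git-switch'."""
    without_fragment = path_or_url.split("#", 1)[0]
    last = without_fragment.rstrip("/").rsplit("/", 1)[-1]
    if last.endswith(".txt"):
        last = last[: -len(".txt")]
    return last


def missing_from_search(grep_files, search_urls):
    """Return page names that git grep found but the site search did not."""
    indexed = {page_name(url) for url in search_urls}
    found = {page_name(path) for path in grep_files}
    return sorted(found - indexed)


# Sample data only, not the real grep output or search results.
grep_files = [
    "Documentation/git-clone.txt",
    "Documentation/git-read-tree.txt",
    "Documentation/git-switch.txt",
]
search_urls = ["https://git-scm.com/docs/git-clone"]

print(missing_from_search(grep_files, search_urls))
# prints ['git-read-tree', 'git-switch']
```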

peff commented 5 years ago

I think part of the problem is that we don't re-index the manpages often enough (or really, automatically at all). I'm slightly hesitant to reindex them every night, since 99% of the time they don't change. It would be nice if the index job could tell when latest_version was the same and make the job a noop. Or maybe just accept the extra processing. It's not that expensive (it's on the order of 15 seconds of CPU).
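A minimal sketch of the "make the job a noop" idea, assuming the job can read the version it last indexed. Only the name `latest_version` comes from the comment above; `needs_reindex`, `run_index_job`, and the `reindex` callback are hypothetical and do not reflect the site's actual code.

```python
def needs_reindex(indexed_version, latest_version):
    """True when the manpages on record differ from the latest known version."""
    return indexed_version != latest_version


def run_index_job(indexed_version, latest_version, reindex):
    """Call reindex(latest_version) only when the version changed.

    Returns True if a reindex ran, False if the job was a no-op.
    """
    if not needs_reindex(indexed_version, latest_version):
        return False
    reindex(latest_version)
    return True


# Usage with a stand-in reindex function that just records what it was asked to do.
calls = []
run_index_job("2.30.0", "2.30.0", calls.append)  # unchanged version: nothing happens
run_index_job("2.30.0", "2.31.0", calls.append)  # new version: reindex is called
print(calls)
```

With a check like this, a nightly run would cost a version comparison on most nights and the full 15 seconds or so of CPU only when the manpages actually changed.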

I just reindexed, and now the git switch manpage shows up. Curiously read-tree doesn't seem to. We seem to cap the result at 10 items per source, though, with no option for pagination. So I'd guess that's the issue there (I didn't look in the code, but searching for something stupidly obvious like "Git" returns exactly 10 hits from the book and 10 hits from the manpages).
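To illustrate how a cap of 10 hides results and what pagination would change, here is a rough sketch over an in-memory list of hits. The `paginate` function and its field names are hypothetical; a real search backend would normally apply the offset and limit on the server side instead of slicing a full list.

```python
def paginate(hits, page=1, per_page=10):
    """Return one page of hits plus enough metadata to build next/previous links."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    end = start + per_page
    return {
        "hits": hits[start:end],
        "page": page,
        "total": len(hits),
        "has_next": end < len(hits),
    }


# 12 matching pages, as in the git grep count mentioned earlier in the thread.
all_hits = ["page-%d" % n for n in range(1, 13)]

first = paginate(all_hits, page=1)
second = paginate(all_hits, page=2)
print(len(first["hits"]), first["has_next"])    # prints 10 True
print(second["hits"], second["has_next"])       # prints ['page-11', 'page-12'] False
```

Without a second page, the last two hits are simply never shown, which matches the symptom of 10 results on the site against 12 files from git grep.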

phil-blain commented 5 years ago

Thanks for the quick check, and the reindexing. Pagination would help in cases like this, I guess. The website is usually the first place I go to read the man pages, so having search return all matches would be ideal.

phil-blain commented 4 years ago

So I guess https://github.com/git/git-scm.com/blob/2f81e0ce42cb616ce8dba9b3e76df93bfbbf9465/lib/searchable.rb#L20

is the culprit for capping at 10 results. I wanted to test it but could not get the search to work locally. Is that normal? The README does not mention anything special about that...

pedrorijo91 commented 4 years ago

The search functionality uses Elasticsearch, so you need to have a local Elasticsearch instance running. I did that once (for #1282), but I can't really remember the details 🤔 I probably should have updated the docs for future situations like this one.

dscho commented 1 month ago

> the search results do not include neither the man page for git read-tree, nor the one for git switch which both support the flag

I just verified that the upcoming Hugo/Pagefind site is able to find them (although admittedly very far down the line...).