igraph / python-igraph

Python interface for igraph
GNU General Public License v2.0

Documentation search can be confusing #611

Closed szhorvat closed 12 months ago

szhorvat commented 1 year ago

Here I'm noting down some potential barriers to using the docs and the doc search that we discovered with @iosonofabio a while ago.

First, note that the search box on https://igraph.readthedocs.io/ does not search the API. In order to search the API, one must go to https://igraph.readthedocs.io/en/0.10.2/api/index.html. There, we get live results while typing into the search box.

Suppose we search for betweenness, and I type "betw". This is what I see:

image

Notice that it says: No results matches "betw". At this point, some people will assume that if there are no results for betw, there won't be results for betweenness either, and stop typing. It turns out that one must type at least between to start getting results.

image

This is not at all obvious though.

ntamas commented 1 year ago

First, note that the search box on https://igraph.readthedocs.io/ does not search the API.

That's an unfortunate side effect of trying to use a PyDoctor theme that matches the default theme of the "main" documentation generated by Sphinx independently of PyDoctor. (Basically, the documentation you see is actually two separate sets of documentation stitched into one.) Note how PyDoctor does the same thing; this is the main readthedocs page of PyDoctor: https://pydoctor.readthedocs.io/en/latest/ -- and this is the API doc: https://pydoctor.readthedocs.io/en/latest/api/index.html

The reason why it's less confusing there is that the two sets of documentation use different themes, so you don't get tricked into thinking that the search bar searches both.

As for the minimum character count needed to trigger the search operation, I'll look into this and check whether there's a way to lower the minimum character count for Lunr.

szhorvat commented 1 year ago

As for the minimum character count needed to trigger the search operation, I'll look into this and check whether there's a way to lower the minimum character count for Lunr.

This is really the main issue I wanted to note here. I'm not sure if it's a character count or something else.

ntamas commented 1 year ago

This seems like it's not a character count issue but something else (maybe a similarity threshold)? Searching for close yields results much earlier than between.

ntamas commented 1 year ago

Also note that searching for bet* switches Lunr from term search mode to substring search mode and then you start getting results much earlier.

ntamas commented 1 year ago

Spent a bit more time investigating this and it seems to be by design; Lunr defaults to matching keywords and it won't return matches below a certain similarity score threshold. Lowering the threshold is only a workaround and it introduces problems elsewhere, as lots of unrelated matches will turn up for short keywords. The real solution is to acknowledge that our expectation is to do a wildcard match, i.e. that when the user enters betw, what they really mean is "give me all entries that start with betw", which should be specified as betw* in Lunr syntax.

We can do this by patching PyDoctor such that it submits betw* to the underlying search engine when the user types betw, but what shall we do with multi-word queries? Shall we split them along spaces and append an asterisk to the end of each word?
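The split-and-append idea for multi-word queries can be sketched as follows. This is only an illustration in Python; an actual patch would live in PyDoctor's JavaScript (search.js), and the function name `to_prefix_query` is hypothetical. It assumes that words which already carry a Lunr wildcard (`*`), fuzzy (`~`) or boost (`^`) modifier should be passed through unchanged.

```python
def to_prefix_query(query: str) -> str:
    """Rewrite a free-text query so that each word becomes a prefix match.

    Hypothetical helper, not part of PyDoctor or Lunr: it only shows the
    "split along spaces and append an asterisk" transformation.
    """
    rewritten = []
    for word in query.split():
        # Leave words alone if the user already wrote Lunr query syntax
        # (wildcard, fuzzy match or boost) so we do not produce "bet**".
        if any(marker in word for marker in "*~^"):
            rewritten.append(word)
        else:
            rewritten.append(word + "*")
    return " ".join(rewritten)


print(to_prefix_query("betw"))           # betw*
print(to_prefix_query("shortest path"))  # shortest* path*
```

Whether every word should become a prefix match, or only the last one the user is still typing, is a design choice this sketch does not settle.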

szhorvat commented 1 year ago

A particularly bad case is typing transitivity, which gives zero results despite the existence of transitivity_avglocal_undirected(), transitivity_undirected(), and transitivity_local_undirected().
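A simplified model of why this can happen, assuming each method name is indexed as a single term (an assumption about how the index is built, not a confirmed detail of PyDoctor's Lunr setup): a whole-term lookup for transitivity finds nothing, while a prefix lookup finds all three methods. The helper names below are hypothetical and the code ignores Lunr's stemming and scoring.

```python
# Names taken from this thread; treated here as one indexed term each.
INDEXED_TERMS = [
    "transitivity_avglocal_undirected",
    "transitivity_undirected",
    "transitivity_local_undirected",
    "betweenness",
]


def term_search(query: str, terms: list[str]) -> list[str]:
    """Return only the terms that equal the query exactly."""
    return [term for term in terms if term == query]


def prefix_search(query: str, terms: list[str]) -> list[str]:
    """Return the terms that start with the query (like `query*`)."""
    return [term for term in terms if term.startswith(query)]


print(term_search("transitivity", INDEXED_TERMS))    # []
print(prefix_search("transitivity", INDEXED_TERMS))  # the three transitivity_* names
```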

ntamas commented 1 year ago

I have been looking into this. We would need to patch search.js in the documentation generated by PyDoctor to make the behaviour more intuitive, and any patch that we come up with would be fragile - if PyDoctor changes the source of search.js, the patch would not apply cleanly any more. I have noticed a similar issue in PyDoctor's issue tracker so I added a comment there in the hope of getting the developers' attention so maybe we could implement an opt-in tweak to the behaviour of the Lunr engine in the next release.

tristanlatr commented 1 year ago

This issue will be fixed by using the next version of pydoctor: 23.4.0 (which is still unreleased at this time). Thanks for bringing that up.

note that the search box on https://igraph.readthedocs.io/ does not search the API.

This behaviour will not be changed, though. See this issue https://github.com/twisted/pydoctor/issues/632 for improvements regarding Sphinx search integration.

tristanlatr commented 1 year ago

FYI, pydoctor 23.4.x is released.

ntamas commented 12 months ago

The new documentation on Readthedocs now gives results for both betw and tran as search queries, so closing this issue.