art-w / sherlodoc

Fuzzy type search for OCaml documentation
MIT License

Split name part of the query by `.` in addition to space. #23

Open panglesd opened 8 months ago

panglesd commented 8 months ago

This makes it possible to find ListLabels.map when searching for List.map.

Remarks:

What do you think?
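As a rough illustration of the idea (not the actual patch), splitting the name part of the query on both spaces and dots could look like the sketch below. The names `split_name_query`, `contains` and `matches` are hypothetical, and the naive substring check only stands in for sherlodoc's real index lookup and cost function.

```ocaml
(* Hypothetical helper, not sherlodoc's code: split the name part of a query
   on spaces and on dots, dropping empty pieces. *)
let split_name_query (query : string) : string list =
  query
  |> String.split_on_char ' '
  |> List.concat_map (String.split_on_char '.')
  |> List.filter (fun word -> word <> "")

(* Naive, case-sensitive substring test (assumption: a stand-in for the real
   index lookup). *)
let contains ~(sub : string) (s : string) : bool =
  let n = String.length sub and m = String.length s in
  let rec go i = i + n <= m && (String.sub s i n = sub || go (i + 1)) in
  go 0

(* A name matches when every query word occurs somewhere in it. *)
let matches (words : string list) (name : string) : bool =
  List.for_all (fun word -> contains ~sub:word name) words

let () =
  (* With the split, "List.map" and "List map" yield the same words. *)
  assert (split_name_query "List.map" = [ "List"; "map" ]);
  assert (split_name_query "List map" = [ "List"; "map" ]);
  (* The split query finds ListLabels.map; the unsplit word does not. *)
  assert (matches (split_name_query "List.map") "ListLabels.map");
  assert (not (matches [ "List.map" ] "ListLabels.map"))
```

Ranking exact matches such as List.map above ListLabels.map is a separate concern that this sketch does not model.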

art-w commented 8 months ago

Nice, thanks! I've caught myself wanting to write Foo.bar out of muscle memory, when Foo bar was actually a better search (since the bar value is often hidden in a submodule)... but at the same time I liked that the extra "dot" precision in the query was rewarded by more precise search results!

But now that I think about it, the Query.Name_cost already favors "nice" word boundaries (e.g. with dots on either side of List and map) and the fact that words are found in query order (so Foo_map.List is worse), so I believe that, by accident, the exact matches will still be favored!

edit: Your PR has been deployed to doc.sherlocode.com so that we can try it out :)

If there's no big issue, I think it makes sense to split the names by . during indexing as it should result in a nice database size improvement! (... or not) We had to comment out the test for the opam release as the exact sizes vary depending on the OCaml version, but you could test it here: https://github.com/art-w/sherlodoc/blob/648c75631dcbabff83324abd382b2eb20aad9204/test/cram_static/base_web.t#L13

art-w commented 8 months ago

After testing this a bit, I really like it :)

There's a tiny issue when searching for +. as the dot is lost, but it's not a blocker as searching for operators already has similar issues that we should address in a future PR https://github.com/art-w/sherlodoc/issues/26
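To make the lost dot concrete, here is a minimal sketch assuming the query is split on spaces and dots with empty pieces dropped. `split_name_query` is a hypothetical name, not the function used in the PR.

```ocaml
(* Hypothetical helper: split on spaces and dots, dropping empty pieces. *)
let split_name_query (query : string) : string list =
  query
  |> String.split_on_char ' '
  |> List.concat_map (String.split_on_char '.')
  |> List.filter (fun word -> word <> "")

let () =
  (* The dot of the float operator "+." is consumed as a separator, so the
     query becomes indistinguishable from a search for "+". *)
  assert (split_name_query "+." = [ "+" ]);
  assert (split_name_query "+." = split_name_query "+")
```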

panglesd commented 8 months ago

Thanks! Very cool that it is already deployed!

> If there's no big issue, I think it makes sense to split the names by . during indexing as it should result in a nice database size improvement! (... or not)

I'll do that and test the change in size, and undraft the PR when it's done :)