Open MarkKoz opened 5 years ago
Is this proposing fuzzy matching along with disambiguation by showing all available choices, or just disambiguation of all exact partial matches?
Also, with the docs improvement that was recently merged (#538), there's a footer showing other possible matches for the given argument that are in other namespaces. For example, on `dict` it outputs the stdlib datatype and shows the following:
This might be too easy to miss for some though, especially after the recent embed style update on Discord, which dulls the footer and makes it much less noticeable.
> Is this proposing fuzzy matching along with disambiguation by showing all available choices, or just disambiguation of all exact partial matches?
I think the latter is better.
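To make the two options in the quoted question concrete, here is a minimal sketch using only the standard library. It assumes "exact partial match" means a case-sensitive substring match, and it uses `difflib.get_close_matches` as one possible fuzzy matcher; the symbol list and both function names are hypothetical and not taken from the bot.

```python
from difflib import get_close_matches

# Hypothetical symbol names, for illustration only.
SYMBOLS = [
    "discord.TextChannel",
    "discord.VoiceChannel",
    "dict",
    "collections.OrderedDict",
]


def fuzzy_matches(query: str, symbols: list[str], limit: int = 5) -> list[str]:
    """Return symbols that are similar to the query, tolerating typos."""
    return get_close_matches(query, symbols, n=limit, cutoff=0.6)


def exact_partial_matches(query: str, symbols: list[str]) -> list[str]:
    """Return symbols that contain the query verbatim (case-sensitive)."""
    return [symbol for symbol in symbols if query in symbol]
```

With a typo such as `discord.TextChanel`, the fuzzy variant still finds `discord.TextChannel`, while the exact partial variant returns nothing. A query of `Channel` makes the exact partial variant return both channel classes, which is the case where disambiguation would be needed.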
I'm not sure if this is a good idea, but it could go as far as to look for exact matches for parts of the path delimited by the dot. For example, a query of `text` would not match `discord.TextChannel`; only `TextChannel` would. It'd have to be smart enough to know when to display a single page vs a bunch of results. For example, if the search query was `TextChannel`, then it would return only `discord.TextChannel` rather than that plus every single attribute and function of the object.
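One way the dot-delimited matching described above could look, as a rough sketch rather than a proposal for the actual implementation. The function name `find_matches` and the rule "prefer symbols that end with the query, otherwise fall back to symbols that contain it somewhere in the path" are assumptions made for illustration.

```python
def find_matches(query: str, symbols: list[str]) -> list[str]:
    """Match the query's dot-delimited parts exactly against symbol paths.

    Symbols whose path ends with the query are preferred. Only if there are
    none do we fall back to symbols that contain the query parts elsewhere
    in their path (for example, a package name).
    """
    parts = query.split(".")
    n = len(parts)
    tail_matches = []
    inner_matches = []
    for symbol in symbols:
        segments = symbol.split(".")
        if segments[-n:] == parts:
            # The path ends with the query: this is the object itself.
            tail_matches.append(symbol)
        elif any(segments[i:i + n] == parts for i in range(len(segments) - n + 1)):
            # The query appears earlier in the path: a parent of this symbol.
            inner_matches.append(symbol)
    return tail_matches or inner_matches
```

Given `discord.TextChannel`, `discord.TextChannel.send` and `discord.TextChannel.name`, a query of `TextChannel` returns only `discord.TextChannel`, and a query of `text` returns nothing because no path segment is exactly `text`.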
I'm interested in looking into this but before any searching is done on the inventories we need to clean them up.
I briefly touched upon this in a comment on #546:
The inventories we get from some sources contain links to generated content which seems to be duplicated, or at least unreachable through the bot because of the current parsing
(`pandas-core-groupby-groupby-transform` vs `pandas.core.groupby.GroupBy.transform`, where the first one is unreachable through the bot)
Doing a simple search for `/` and spaces, which make the tag unreachable, yields around a quarter of the URLs.
You're welcome to work on it. Sorry if this will sound dismissive, but I don't understand what kind of issues the parsing has. In any case, you can clean up whatever is needed to make it work better.
The generated entries mostly lack a `#` at the end, which is needed to get the HTML tag to search from, but we don't need to preserve them since they should be duplicates and nobody is going to search for a symbol with a `/` in its name.
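A minimal sketch of that cleanup step, assuming the inventory has already been loaded into a plain mapping of symbol name to URL. The function name `drop_generated_entries`, the data layout, and the sample entries in the usage note are hypothetical, not the bot's actual data structures.

```python
def drop_generated_entries(inventory: dict[str, str]) -> dict[str, str]:
    """Return a copy of the inventory without symbols that contain a slash.

    Assumption: entries with a "/" in their name point to generated pages
    that duplicate a regular symbol, so dropping them loses nothing.
    """
    return {name: url for name, url in inventory.items() if "/" not in name}
```

For example, with made-up entries and placeholder URLs:

```python
inventory = {
    "pandas.core.groupby.GroupBy.transform": "https://example.invalid/a#transform",
    "some/generated/page": "https://example.invalid/b",
    "binary file": "https://example.invalid/glossary#term-binary-file",
}
cleaned = drop_generated_entries(inventory)
```

Only the slash entry is removed here. Names with spaces are kept, since whether those should stay reachable is a separate question.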
Looking more at the symbols with spaces, there are only around a hundred of them and they are mostly terms (`binary file`, `generator expression`). This is a bit out of scope here, but should the arg be changed to a greedy one? Or should we assume, as with the `/`, that nobody is going to attempt to search for those?
What kind of drawbacks would a greedy arg have? If none, then why not?
@Numerlor I'm curious if you still plan on (or already are) working on this.
> @Numerlor I'm curious if you still plan on (or already are) working on this.
I have it in mind, but haven't worked on it beyond some experimenting with the previous PR. It'll be a bit before I can fully pay attention to figuring out how to make this work nicely, so anyone's free to pick it up if they want to.
I could work on this feature.
If this feature were to be implemented, what form should the output of the closest matches take? I was thinking that if there is more than one result, the bot would add reactions, and depending on which one you react to, it would edit the message accordingly.
Actually, I can't find the time to work on this; I'm going to be pretty busy in the upcoming days.
Could I be assigned to this now? I'm willing to work on it.
So, to implement this, I think the best approach would be slash commands. The autocomplete feature would show all docs matching the input, and since it can show several matches, I would say it removes most if not all typing errors when trying to look up documentation. Could I get some feedback on this, please?
From a UX point of view, I find the suggestions on slash commands to be too slow to be considered good, but it’s better than the alternative of providing no suggestions.
That said, we aren’t porting commands to slash commands just yet.
Sometimes one may have an idea of what the name of an object/attribute is but not which namespace it's in. It would be helpful to have a command which can return possible matches for a given term. I think the best format would be a paginated embed with a list of all matched names that are hyperlinked to their docs. Perhaps this would need to be a separate command so that the library name can be passed as a separate argument.
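As a rough illustration of the paginated output described above, the sketch below turns matched names into pages of Markdown links, which Discord renders as hyperlinks in an embed description. The function name `build_pages`, the `per_page` default, and the name-to-URL mapping are assumptions; sending the embed and handling page navigation are left out.

```python
def build_pages(matches: dict[str, str], per_page: int = 10) -> list[str]:
    """Split matched symbols into pages of Markdown-linked names.

    `matches` maps a symbol name to the URL of its documentation.
    Each returned string is meant to be used as one embed description.
    """
    lines = [f"[{name}]({url})" for name, url in sorted(matches.items())]
    return [
        "\n".join(lines[start:start + per_page])
        for start in range(0, len(lines), per_page)
    ]
```

Taking the library name as a separate argument, as suggested, would simply mean filtering `matches` down to that library's inventory before building the pages.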