Closed by christian-byrne 2 months ago
The suggested solution was to keep useExtendedSearch enabled, since tokenization improves the accuracy of searches with mixed word order, and to implement a custom sort function that orders matches with very similar scores by length. It was also suggested that the issue should be solved by using Bayesian search with telemetry data.
Here's my suggestion for the custom sort function that seems to work well:
Math.abs(a.score - b.score) > 0.01 ? a.score - b.score : (a.item[1]['v']['length'] - b.item[1]['v']['length'] || a.idx - b.idx)
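To make the one-liner above easier to follow, here is a minimal sketch of the same comparator written out as a named function and run on mock results. The names compareByScoreThenLength, SCORE_EPSILON, and mock are hypothetical, the scores are made up, and item[1].v is assumed to hold the node title, as in the expression above.

```javascript
// Hypothetical constant name; 0.01 is the threshold used in the thread.
const SCORE_EPSILON = 0.01;

// Comparator in the shape used by the thread's sortFn: each argument
// has `score`, `idx`, and `item`. `item[1].v` is assumed to be the title.
function compareByScoreThenLength(a, b) {
  if (Math.abs(a.score - b.score) > SCORE_EPSILON) {
    return a.score - b.score; // clearly different scores: lower score first
  }
  // Near-tie: prefer the shorter title, then the lower index.
  return a.item[1].v.length - b.item[1].v.length || a.idx - b.idx;
}

// Mock results; the scores are invented for illustration only.
const mock = (idx, title, score) => ({ idx, score, item: { 1: { v: title } } });
const results = [
  mock(0, "VAEDecodeAudio", 0.015),
  mock(1, "VAE Decode", 0.02),
  mock(2, "VAE Encode", 0.2),
];

const sortedTitles = [...results]
  .sort(compareByScoreThenLength)
  .map((r) => r.item[1].v);
console.log(sortedTitles); // [ 'VAE Decode', 'VAEDecodeAudio', 'VAE Encode' ]
```

One thing to keep in mind: a threshold-based comparator like this is not guaranteed to be transitive (A can tie with B and B with C while A and C differ by more than the threshold), so the order can depend on the input order in edge cases.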
Comparison of current method vs. sorting function using all token permutations. It seems better in almost every scenario.
Thanks for the comparison function and benchmark!
I think the sort function is a bit off in some cases:
Are you getting "CLIP Text Encode (Prompt)" as the first option normally?
For me this is with vs. without sorting function:
I only get "CLIP Text Encode (Prompt)" as first result after I disable the extended search in fuse config.
Extended search vs. no extended search comparison
Extended search has a lot of upsides when the ordering of words is incorrect in search phrase e.g. "Encode VAE" or "Image Save"
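To illustrate why token-based matching helps with word order, here is a minimal sketch. It is a simplified stand-in that uses plain substring checks, not Fuse.js's actual fuzzy algorithm; the function names and the sample titles are made up for illustration.

```javascript
// Simplified stand-in for token-based matching (NOT Fuse.js internals):
// every whitespace-separated token must appear somewhere in the title.
function matchesAllTokens(query, title) {
  const haystack = title.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .every((token) => haystack.includes(token));
}

// Whole-phrase matching: the query must appear as one contiguous string.
function matchesPhrase(query, title) {
  return title.toLowerCase().includes(query.toLowerCase());
}

const titles = ["VAE Encode", "VAE Decode", "Save Image"];

const tokenHits = titles.filter((t) => matchesAllTokens("Encode VAE", t));
const phraseHits = titles.filter((t) => matchesPhrase("Encode VAE", t));

console.log(tokenHits); // [ 'VAE Encode' ]
console.log(phraseHits); // []
```

With the words in the wrong order, the phrase check finds nothing, while the token check still finds the intended node.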
I do get the correct one when "CLIP Text Encode" is used as the query, as before. Unfortunately, that is the one used in the Playwright test.
I see, it's "CLIPTextEncode". I misunderstood.
The reason it's correct in the test you mentioned is that when you type the exact node ID, the old method is better in this outlier case. If you're missing even one letter, the old method's results are equally bad.
Fixing exact matches can be done fairly easily by adding an additional condition, Math.min(a.score, b.score) < 0.0001:
sortFn: (a, b) => Math.min(a.score, b.score) < 0.0001 || Math.abs(a.score - b.score) > 0.01 ? a.score - b.score : (a.item[1]['v']['length'] - b.item[1]['v']['length'] || a.idx - b.idx)
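Here is a minimal sketch of what the extra condition changes, comparing the comparator with and without the exact-match guard. The function names, the mock helper, and the scores are hypothetical, and item[1].v is again assumed to hold the node title.

```javascript
const EXACT = 0.0001; // scores below this are treated as exact matches
const EPS = 0.01; // scores closer than this are treated as near-ties

// Shared tie-break: shorter title first, then lower index.
const tieBreak = (a, b) =>
  a.item[1].v.length - b.item[1].v.length || a.idx - b.idx;

// Without the guard: near-ties always fall through to the tie-break.
function compareWithoutGuard(a, b) {
  return Math.abs(a.score - b.score) > EPS ? a.score - b.score : tieBreak(a, b);
}

// With the guard: if either result is an exact match, score decides.
function compareWithGuard(a, b) {
  const hasExact = Math.min(a.score, b.score) < EXACT;
  const clearlyDifferent = Math.abs(a.score - b.score) > EPS;
  return hasExact || clearlyDifferent ? a.score - b.score : tieBreak(a, b);
}

// Mock results; scores are invented for illustration only.
const mock = (idx, title, score) => ({ idx, score, item: { 1: { v: title } } });
const results = [
  mock(0, "CLIP Text Encode (Prompt)", 0), // exact match, long title
  mock(1, "CLIPLoader", 0.005), // near-tie on score, short title
];

const titlesOf = (cmp) => [...results].sort(cmp).map((r) => r.item[1].v);
const withoutGuard = titlesOf(compareWithoutGuard);
const withGuard = titlesOf(compareWithGuard);

console.log(withoutGuard); // [ 'CLIPLoader', 'CLIP Text Encode (Prompt)' ]
console.log(withGuard); // [ 'CLIP Text Encode (Prompt)', 'CLIPLoader' ]
```

Without the guard, the shorter title wins the near-tie even though the other result is an exact match; with the guard, the exact match stays on top.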
I have another version that uses the node index instead of the node title length. This one is "less correct", since it doesn't prioritize match accuracy, but in exchange it is closer to the original sort order of the old UI. I think that order is determined by the order in which the nodes are loaded, but I'm not exactly sure. It also fixes many of the same problems the previous version aimed to fix.
This one basically prioritizes core nodes first, based on their definition/load order. As a downside, it's slightly worse for custom nodes, as their load/definition order is pretty "random". It should yield results that are more familiar, even if less accurate.
sortFn: (a, b) => Math.abs(a.score - b.score) > 0.01 ? a.score - b.score : a.idx - b.idx
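A minimal sketch of how the index-based variant behaves on a near-tie. The function name, titles, indices, and scores are all made up for illustration; the only assumption carried over from the thread is that a lower idx means the node was defined or loaded earlier.

```javascript
// Index-based variant: near-ties are resolved by load/definition order.
function compareByScoreThenIndex(a, b) {
  return Math.abs(a.score - b.score) > 0.01 ? a.score - b.score : a.idx - b.idx;
}

// Mock results; titles, indices, and scores are invented for illustration.
const results = [
  { idx: 120, title: "Custom Save", score: 0.02 }, // loaded late, slightly better score
  { idx: 3, title: "Save Image (Core)", score: 0.025 }, // loaded early
];

const indexOrder = [...results]
  .sort(compareByScoreThenIndex)
  .map((r) => r.title);
console.log(indexOrder); // [ 'Save Image (Core)', 'Custom Save' ]
```

The earlier-loaded node wins the near-tie even though its score is marginally worse, which is the "more familiar, less accurate" trade-off described above.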
Frontend Version
1.2.28
Expected Behavior
Search should behave similarly to other search features.
Actual Behavior
From the fuse.js docs on extended search option, when enabled:
This leads to unexpected behavior when searching for multiple words, such as getting this when searching "Save Image":
Or this when searching "VAE Decode":
This was pointed out by SirVeggie on discord.
Steps to Reproduce
Debug Logs
Browser Logs
.
What browsers do you use to access the UI ?
Google Chrome
Other
When extended search is enabled, "VAE Decode" matches to "VAE Decode" with a score of 0.06403390388180329 and "VAEDecodeAudio" with a score of 0.0155. When extended search is disabled, "VAE Decode" matches to "VAE Decode" with a perfect score (0) and "VAEDecodeAudio" with a score of 0.3.
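As a quick check of what these numbers mean for ranking, here is a minimal sketch that sorts the scores reported above in ascending order (in Fuse.js a lower score is a better match, with 0 being perfect). The variable and function names are hypothetical; the scores are the ones quoted in this issue.

```javascript
// Ascending score: lower is better, 0 is a perfect match.
const byScore = (a, b) => a.score - b.score;

// Scores reported in this issue for the query "VAE Decode".
const extendedResults = [
  { title: "VAE Decode", score: 0.06403390388180329 },
  { title: "VAEDecodeAudio", score: 0.0155 },
];
const plainResults = [
  { title: "VAE Decode", score: 0 },
  { title: "VAEDecodeAudio", score: 0.3 },
];

const rank = (results) => [...results].sort(byScore).map((r) => r.title);
const extendedOrder = rank(extendedResults);
const plainOrder = rank(plainResults);

console.log(extendedOrder); // [ 'VAEDecodeAudio', 'VAE Decode' ]
console.log(plainOrder); // [ 'VAE Decode', 'VAEDecodeAudio' ]
```

With extended search enabled, the exact title ends up ranked below "VAEDecodeAudio", which is the behavior this issue reports.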