signalapp / Signal-Desktop

A private messenger for Windows, macOS, and Linux.
https://signal.org/download
GNU Affero General Public License v3.0
14.7k stars, 2.69k forks

Unfortunate search result in the emoji search #4294

Open yb66 opened 4 years ago

yb66 commented 4 years ago

Bug Description

When I search for "pirate" in the emoji search box I get two results: the expected skull and crossbones flag, and the flag of the United Arab Emirates.

I can see how it might happen if I squint a bit (some kind of soundex function?) but I can also see how it might offend people and/or appear deliberate.

Steps to Reproduce

  1. Click on emoji button
  2. Click on search tool in menu
  3. Type in "pirate"

Actual Result:

2 results, skull and crossbones flag and UAE flag.

Expected Result:

No UAE flag, and (at least) the skull and crossbones flag.

Screenshots

Screenshot 2020-05-21 at 15 11 08

Platform Info

Signal Version:

v1.33.1

Operating System:

Mac 10.14.6

Linked Device Version:

Link to Debug Log

Regards, iain

jsantell commented 4 years ago

Looks like the emoji search uses fuse.js, and the result can be reproduced with their demo using Signal's configuration. Even "te" matches both pirate and flag_ae here due to the fuzzy search (IIUC, a similar cause to #4235).

yb66 commented 4 years ago

Hi @jsantell,

Thanks. Where would I get the emoji data to load in the demo? I can't find it. Perhaps one of the npm packages (https://github.com/iamcal/emoji-data)?

I'll cross-post this to the fuse.js issue tracker once I've got the demo to work, so they can reproduce and test it.

Regards, iain

jsantell commented 4 years ago

The emoji data is in ./sticker-creator/dist/bundle.js, via the emoji-datasource and emoji-datasource-apple modules, but the issue can be reproduced by swapping out the strings in the fuse demo. I imagine the fuzzy search is working correctly (i.e. this is not an issue with fuse); it is just the outcome of lax settings in a fuzzy search.
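To illustrate how a lax fuzzy match could connect the two emoji, here is a minimal sketch of approximate substring matching. It is not fuse.js's actual scoring, and the candidate strings are assumptions: in particular, it assumes the flag's full name "united arab emirates" is among the searched strings, which this thread does not confirm. Under that assumption, "pirate" is a single edit away from "mirate" inside "emirates".

```typescript
// Approximate substring matching (Sellers-style dynamic programming):
// the smallest number of single-character edits needed to turn `pattern`
// into some substring of `text`. Illustration only, not fuse.js's scoring.
function minSubstringEditDistance(pattern: string, text: string): number {
  // prev[j] = best cost of matching the pattern prefix processed so far
  // against a substring of `text` that ends at position j.
  // The first row is all zeros, so a match may start anywhere in `text`.
  let prev: number[] = new Array<number>(text.length + 1).fill(0);

  for (let i = 1; i <= pattern.length; i++) {
    const curr: number[] = new Array<number>(text.length + 1).fill(0);
    curr[0] = i; // i pattern characters against an empty substring
    for (let j = 1; j <= text.length; j++) {
      const substitution = prev[j - 1] + (pattern[i - 1] === text[j - 1] ? 0 : 1);
      const skipPatternChar = prev[j] + 1;
      const skipTextChar = curr[j - 1] + 1;
      curr[j] = Math.min(substitution, skipPatternChar, skipTextChar);
    }
    prev = curr;
  }
  // The match may end anywhere in `text`, so take the cheapest end position.
  return prev.reduce((a, b) => Math.min(a, b));
}

// Hypothetical searchable strings; the real index contents are not shown in this thread.
const candidates: string[] = ["pirate flag", "united arab emirates", "flag_ae"];
const distances = candidates.map(candidate => ({
  candidate,
  distance: minSubstringEditDistance("pirate", candidate),
}));
console.log(distances);
```

With a tolerance of one edit, both "pirate flag" and "united arab emirates" would be accepted for the query "pirate", which is the kind of outcome lax settings can produce.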

yb66 commented 4 years ago

Thanks @jsantell, much appreciated.

Regards, iain

KeronCyst commented 4 years ago

Can confirm that the UAE flag also comes up on Windows 10 (1.36.3).

yb66 commented 4 years ago

I didn't get very far with getting the demo to work. If anyone else can, that would be good.

Regards, iain

hiqua commented 4 years ago

> I didn't get very far with getting the demo to work. If anyone else can, that would be good.
>
> Regards, iain

You can use my example here as a starting point: https://github.com/signalapp/Signal-Desktop/issues/4235#issuecomment-738821978

hiqua commented 4 years ago

> Looks like the emoji search uses fuse.js, and the result can be reproduced with their demo using Signal's configuration. Even "te" matches both pirate and flag_ae here due to the fuzzy search (IIUC, a similar cause to #4235).

From what I gather, fuse.js will match anything; it just gives a very bad score if the match is not a real one (scores run from 0.0, a perfect match, to 1.0, the worst).

In the case of this example, using your links (thanks!), we can see that "te" matches flag_ae with a score of 0.55, which is a very bad match. From the quick look I had, I'm guessing anything at 0.5 or above does not really match.

There is a threshold parameter which is supposed to discard these bad matches. In the case of Signal, it's set to 0.2, so this result should not even appear.
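A minimal sketch of what such a threshold is supposed to do, written as a plain filter rather than with fuse.js itself. The `ScoredResult` shape, the `applyThreshold` helper, and the `pirate_flag` entry with its score are hypothetical; only the 0.55 score for flag_ae and the 0.2 threshold come from this thread.

```typescript
// Shape of a scored result: 0.0 is a perfect match, 1.0 is the worst score.
interface ScoredResult {
  item: string;
  score: number;
}

// Keep only results whose score is at or below the threshold.
// Hypothetical helper for illustration; not part of fuse.js or Signal-Desktop.
function applyThreshold(results: ScoredResult[], threshold: number): ScoredResult[] {
  return results.filter(result => result.score <= threshold);
}

const scored: ScoredResult[] = [
  { item: "pirate_flag", score: 0.0 }, // hypothetical name and score
  { item: "flag_ae", score: 0.55 },    // score reported in this thread for the query "te"
];

// With Signal's reported threshold of 0.2, flag_ae is discarded.
const kept: string[] = applyThreshold(scored, 0.2).map(result => result.item);
console.log(kept);
```

If the search behaved like this filter, a result scored at 0.55 could not survive a 0.2 threshold.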

So my guess is that Signal-Desktop is somehow reusing the previous results instead of recomputing the fuzzy search; a fresh search would not suggest flag_ae.