StampyAI / stampy-ui

AI Safety Q&A web frontend
https://aisafety.info
MIT License
34 stars, 9 forks

Semantic search strings that don't return the expected result #246

Open LeMurphant opened 1 year ago

LeMurphant commented 1 year ago

see https://discord.com/channels/677546901339504640/1106385025177485362

Writers have noticed that simple search terms on aisafety.info sometimes don't return the expected page, even though that page is live on site. This is a place to collect such cases and notify the devs.

Entries should take the following form:

Search term: { term(s) }
Expected result: { aisafety.info or Google Docs page }

As of 2023-05-25 no such string has been officially collected. I will make sure writers know to post these failed searches here.

markovial commented 1 year ago

searched term: pascals mugging
expected result: Aren't AI existential risk concerns just an example of Pascal's mugging?
returned result: none

markovial commented 1 year ago

searched term: academia
expected result: How can I work on AGI safety outreach in academia and among experts?
returned result: none

searched term: outreach
expected result: How can I work on public AI safety outreach?
returned result: none

searched term: mathematical, philosophical
expected result: How can I do conceptual, mathematical, or philosophical work on AI alignment?
returned result: none
comment: as soon as I add just one more word, semantic search kicks in instead of keyword search and I get the expected result

Aprillion commented 1 year ago

the exact-match failures (academia, outreach, ...) look like a problem with the cache => I need to find time to investigate the caching issues from #228 ... I just deleted the cache and it started to work again:

Screenshot 2023-06-02 at 14 35 05

feel free to ping me on Discord when we have a batch of new Live on site questions that cannot be found by single word exact match search 😅

non-exact matches like pascals -> pascal's will need more discussion on how to solve properly ... but this particular case might be good enough when you start typing pa:

Screenshot 2023-06-02 at 14 38 37

Aprillion commented 1 year ago

actually, looks like we solved the apostrophes too, so that one was also not working because of cache issues...
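For anyone adding test cases, here is a minimal sketch of one way apostrophe-insensitive matching can work. It is not taken from the stampy-ui code: `normalizeForSearch` and `titleMatches` are hypothetical names, and the ASCII-only cleanup is a simplifying assumption.

```typescript
// Hypothetical helpers for illustration; not the actual stampy-ui implementation.

// Lowercases the text, drops straight and curly apostrophes, and collapses
// every other run of non-alphanumeric characters into a single space.
// Assumption: ASCII-only cleanup, so accented letters are treated as separators.
function normalizeForSearch(text: string): string {
  return text
    .toLowerCase()
    .replace(/['\u2018\u2019]/g, '')
    .replace(/[^a-z0-9]+/g, ' ')
    .trim()
}

// True when every word of the normalized query occurs somewhere in the
// normalized title, so "pascals" can match "Pascal's".
function titleMatches(query: string, title: string): boolean {
  const normalizedTitle = normalizeForSearch(title)
  const words = normalizeForSearch(query).split(' ').filter(Boolean)
  return words.length > 0 && words.every((word) => normalizedTitle.includes(word))
}
```

With this sketch, the query pascals mugging matches the title from the first report in this thread, because both sides lose the apostrophe before comparison.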

Screenshot 2023-06-02 at 14 39 58

markovial commented 1 year ago

I usually notice this kind of stuff only once a month, when I am putting together the update LessWrong post, because I need to look up all the questions that go into that post. But since we push questions to live on site throughout the month as well, it might be worth setting up a manual reminder to clear the cache once a week or so. I don't really know what the negative consequences are, as far as performance is concerned, if we do it too often.
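One alternative to a manual reminder is to let cached entries expire on their own. The sketch below is only an illustration of that trade-off, not how the site's cache works: `TtlCache`, its method names, and the injectable clock are all hypothetical. A shorter time-to-live means fresher results but more refetching.

```typescript
// Hypothetical in-memory cache with a time-to-live (TTL); a stand-in for
// whatever store the site actually uses.
class TtlCache<V> {
  private entries = new Map<string, {value: V; expiresAt: number}>()
  private ttlMs: number
  private now: () => number

  // `now` is injectable so that expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Returns the cached value, or undefined when it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key) // stale entry: the caller has to refetch
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, {value, expiresAt: this.now() + this.ttlMs})
  }

  // Manual equivalent of "deleting the cache".
  clear(): void {
    this.entries.clear()
  }
}
```

For example, `new TtlCache<string[]>(7 * 24 * 60 * 60 * 1000)` would approximate the weekly clearing suggested above without anyone having to remember it.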

Aprillion commented 1 year ago

Caching issues from #228 are now fixed 🤞 so hopefully no more strange search results, but let's keep this ticket open in case we discover more problems...

LeMurphant commented 11 months ago

Searching for intelligence explosion does not return What is an "intelligence explosion" in the top 5: https://aisafety.info?state=6306_

intelligence_explosion

The 5 results are relevant, but "what is" sounds more relevant.
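Reports like this one can be turned into a repeatable check. The following is a rough sketch, assuming a search function that returns result titles best first; `SearchCase`, `SearchFn`, and `findFailedSearches` are hypothetical names, and the real search API may be asynchronous and return richer objects.

```typescript
// Hypothetical types for illustration; not part of the stampy-ui codebase.
type SearchCase = {term: string; expected: string}
type SearchFn = (term: string) => string[] // result titles, best match first

// Returns the cases whose expected title is missing from the top `topK` results.
function findFailedSearches(
  search: SearchFn,
  cases: SearchCase[],
  topK: number = 5
): SearchCase[] {
  return cases.filter(({term, expected}) => !search(term).slice(0, topK).includes(expected))
}

// The cases collected so far in this thread, in the requested entry format.
const reportedCases: SearchCase[] = [
  {term: 'outreach', expected: 'How can I work on public AI safety outreach?'},
  {term: 'intelligence explosion', expected: 'What is an "intelligence explosion"'},
]
```

Calling `findFailedSearches(search, reportedCases)` with the real search function plugged in would list the entries that still fail, which could replace the monthly manual check.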

Aprillion commented 11 months ago

Searching for intelligence explosion

Dev note: 2 words => "baseline search", not "semantic search" (which uses a small model that wasn't good at exact matches of 1-2 words) ... both are the same "search" from the user's perspective, but fixing this case will involve some if/else code rather than playing with hyper-parameters 😅
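To make the split concrete, the routing described in the dev note could look roughly like the sketch below. The two-word threshold is inferred from this thread, and `chooseSearchMode` and `MAX_BASELINE_WORDS` are hypothetical names rather than the actual code.

```typescript
type SearchMode = 'baseline' | 'semantic'

// Assumption based on this thread: queries of up to 2 words use baseline
// (keyword) search, and longer queries use semantic search.
const MAX_BASELINE_WORDS = 2

// Counts whitespace-separated words and picks the search mode accordingly.
function chooseSearchMode(query: string): SearchMode {
  const wordCount = query.trim().split(/\s+/).filter(Boolean).length
  return wordCount <= MAX_BASELINE_WORDS ? 'baseline' : 'semantic'
}
```

Under this assumption, intelligence explosion and mathematical, philosophical both go to baseline search, while adding a third word switches the query to semantic search, which matches the behaviour reported earlier in the thread.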

(in any case, still a good test case for semantic search API too)

Aprillion commented 11 months ago

Boosting the "What is ..." / "What are ..." questions in baseline search in #288:

image
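As a rough illustration of what such a boost can look like, the sketch below scores titles by the fraction of query words they contain and multiplies the score for titles that start with "What is" or "What are". This is not the code from #288: `scoreTitle`, `rankTitles`, and the boost factor of 2 are assumptions made for the example.

```typescript
// Hypothetical scoring for illustration; the change in #288 may work differently.
const DEFINITION_PREFIX = /^what (is|are)\b/i
const DEFINITION_BOOST = 2 // assumed multiplier, chosen arbitrarily

// Base score: fraction of query words found in the title (case-insensitive).
// Titles starting with "What is" / "What are" get the score multiplied.
function scoreTitle(query: string, title: string): number {
  const titleLower = title.toLowerCase()
  const words = query.toLowerCase().split(/\s+/).filter(Boolean)
  const matched = words.filter((word) => titleLower.includes(word)).length
  if (matched === 0) return 0
  const base = matched / words.length
  return DEFINITION_PREFIX.test(title) ? base * DEFINITION_BOOST : base
}

// Drops non-matching titles and returns the rest ordered by descending score.
function rankTitles(query: string, titles: string[]): string[] {
  return titles
    .map((title) => ({title, score: scoreTitle(query, title)}))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((result) => result.title)
}
```

With equal word overlap, a "What is ..." title now ranks above other titles that mention the same words, which is the behaviour requested for intelligence explosion.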

LeMurphant commented 11 months ago

Search for "Metaculus" or "Metaculus' " does not return anything, but this article contains the sentence "Metaculus’ forecasts for..." Note that the search for "August 2023" also returns no results.

LeMurphant commented 3 months ago

Not sure if semantic search is enabled at the moment, but searching for "how will we know if AI is conscious" should return the "Are AIs conscious?" article