simonw / simonwillisonblog

The source code behind my blog
https://simonwillison.net/
Apache License 2.0

Spellcheck if 0 search results #471

Closed simonw closed 4 months ago

simonw commented 4 months ago

If a search returns 0 results, try spell-correcting the search terms and see if the corrected query returns results. If it does, suggest the fix.

simonw commented 4 months ago

Experiment using https://pypi.org/project/pyspellchecker/

>>> from spellchecker import SpellChecker
>>> 
>>> spell = SpellChecker()
>>> spell.unknown("hello there".split())
set()
>>> spell.unknown("hello thereu".split())
{'thereu'}
>>> spell.correction("thereu")
'there'
>>> spell.word_frequency
<spellchecker.spellchecker.WordFrequency object at 0x10154bca0>
>>> spell.word_frequency.load_words
<bound method WordFrequency.load_words of <spellchecker.spellchecker.WordFrequency object at 0x10154bca0>>
>>> spell.unknown(["willison"])
{'willison'}
>>> spell.word_frequency.load_words(["willison"])
>>> spell.unknown(["willison"])
set()
>>> spell.unknown(["willisn"])
{'willisn'}
>>> spell.correction("willison")
'willison'
>>> spell.correction("willisn")
'willison'

Then I could find all the tags in my database, split them on "-" and add the resulting words to the spell checker's custom word frequency list. If a search gets 0 results I could run the spell checker and suggest the corrected query, along with how many results it gets.
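A minimal sketch of that idea, assuming the tags are available as a list of slug strings. The helper names (words_from_tags, build_spellchecker, suggest_query) are hypothetical and not taken from the blog's code; the only pyspellchecker calls used are the ones from the session above: unknown(), correction() and word_frequency.load_words().

```python
def words_from_tags(tags):
    """Split tag slugs such as 'prompt-injection' on '-' into single words."""
    words = set()
    for tag in tags:
        words.update(part.lower() for part in tag.split("-") if part)
    return sorted(words)


def build_spellchecker(tags):
    """Create a SpellChecker that also knows the words used in tags."""
    # Imported lazily so the other helpers can be used without the package.
    from spellchecker import SpellChecker

    spell = SpellChecker()
    spell.word_frequency.load_words(words_from_tags(tags))
    return spell


def suggest_query(spell, query):
    """Return a corrected query string, or None if nothing would change."""
    terms = query.split()
    unknown = spell.unknown(terms)  # set of lowercased unrecognised terms
    corrected = []
    for term in terms:
        if term.lower() in unknown:
            # correction() may return None when it has no candidate,
            # so fall back to the original term in that case.
            corrected.append(spell.correction(term) or term)
        else:
            corrected.append(term)
    suggestion = " ".join(corrected)
    return suggestion if suggestion != query else None
```

suggest_query() only needs an object with unknown() and correction() methods, so it can be exercised with a stub in tests without loading the full dictionary.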

simonw commented 4 months ago

I could populate an in-memory spell checker with tags from the DB and keep it cached until the server restarts, or maybe until an hour has passed.
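One way to sketch the "keep it for an hour" option is a module-level cache with a timestamp. Everything here is an assumption for illustration: get_cached, the build callable and the one-hour constant are hypothetical names, and a real deployment with multiple worker processes would end up with one copy per process.

```python
import time

CACHE_SECONDS = 60 * 60  # assumed lifetime of one hour

# Module-level state: lives until the server process restarts.
_cached = {"value": None, "loaded_at": None}


def get_cached(build, now=time.monotonic, max_age=CACHE_SECONDS):
    """Return the cached object, rebuilding it with build() when it is stale."""
    loaded_at = _cached["loaded_at"]
    if loaded_at is None or now() - loaded_at >= max_age:
        _cached["value"] = build()
        _cached["loaded_at"] = now()
    return _cached["value"]
```

Here build would be whatever function reads the tags from the database and constructs the spell checker. Passing the clock in as now makes the expiry easy to test without waiting.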

simonw commented 4 months ago

Docs: https://pyspellchecker.readthedocs.io/en/latest/

simonw commented 4 months ago

OK, it works. It's not brilliant but...

https://simonwillison.net/search/?q=promptt+injection

[Screenshot: CleanShot 2024-07-16 at 22 14 55@2x]

You have to get lucky with which correction it picks, though:

https://simonwillison.net/search/?q=promp+injection

[Screenshot: CleanShot 2024-07-16 at 22 15 12@2x]

simonw commented 4 months ago

I'm not going to bother showing the suggestion if it returns zero results.
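That rule could look roughly like this. It is a sketch, not the blog's actual view code: count_results and suggest are hypothetical callables standing in for the real search query and the spell check step.

```python
def spelling_suggestion(query, count_results, suggest):
    """Return (suggested_query, result_count), or None if nothing is worth showing.

    count_results(query) -> int and suggest(query) -> str or None are
    hypothetical stand-ins for the real search and spell check.
    """
    if count_results(query) > 0:
        return None  # the original search found something, no suggestion needed
    suggestion = suggest(query)
    if not suggestion or suggestion == query:
        return None  # the spell checker had nothing different to offer
    count = count_results(suggestion)
    if count == 0:
        return None  # skip suggestions that would also return zero results
    return suggestion, count
```

Returning the count alongside the suggestion lets the page show how many results the corrected query would get.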