Ranchero-Software / NetNewsWire

RSS reader for macOS and iOS.
https://netnewswire.com/
MIT License

Search not returning an expected result #3531

Open jsit opened 2 years ago

jsit commented 2 years ago

When I search my feeds for "bereal", this article isn't returned in the results:

https://lifehacker.com/what-is-the-new-social-media-app-bereal-and-do-you-r-1848780094

Searching for "new social media app" does return the article.

It comes from this feed:

https://lifehacker.com/tag/software/rss

And the full XML content of the feed item is here:

<item>
  <title><![CDATA[What Is the New Social Media App ‘BeReal’ (and Do You Really Want to Be Real)?]]></title>
  <link>https://lifehacker.com/what-is-the-new-social-media-app-bereal-and-do-you-r-1848780094</link>
  <description><![CDATA[<img src="https://i.kinja-img.com/gawker-media/image/upload/s--QFtUR1tQ--/c_fit,fl_progressive,q_80,w_636/e136041367d9b21155180e322b0d91ce.jpg" /><p>There are always new social media platforms launching, but few have the staying power of Facebook, Instagram, Twitter, Snapchat, TikTok, <del>Truth Social,</del> WhatsApp, and Telegram. Each day seemingly promises us some new platform, most of which we ignore or maybe use for a brief time before ultimately ignoring; others have…</p><p><a href="https://lifehacker.com/what-is-the-new-social-media-app-bereal-and-do-you-r-1848780094">Read more...</a></p>]]></description>
  <category>mobile software</category>
  <category>app store</category>
  <category>internet culture</category>
  <category>fads and trends</category>
  <category>google</category>
  <category>mobile applications</category>
  <category>computing</category>
  <category>instagram</category>
  <category>bereals</category>
  <category>technology</category>
  <category>software</category>
  <category>selfie</category>
  <category>pinterest</category>
  <category>tiktok</category>
  <category>snapchat</category>
  <category>snapchat</category>
  <category>video software</category>
  <pubDate>Tue, 12 Apr 2022 16:00:00 GMT</pubDate>
  <guid isPermaLink="false">1848780094</guid>
  <dc:creator><![CDATA[Lindsey Ellefson]]></dc:creator>
</item>

Wevah commented 2 years ago

I wonder if it's because of the quotes around the word; if so that should be fixed.

brentsimmons commented 2 years ago

Agreed, it’s most likely the quotes.
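
To illustrate why the quotes could matter, here is a rough Python emulation of SQLite's documented FTS3/FTS4 "simple" tokenizer. It is an assumption that the app's FTS4 table uses that default tokenizer; nothing in this thread confirms it. The function name `simple_tokens` is made up for this sketch.

```python
import re
import string

# Assumption: emulates SQLite's FTS3/FTS4 "simple" tokenizer, which treats ASCII
# alphanumerics and every code point >= 128 as term characters, and folds only
# ASCII upper-case letters to lower case.
_TERM = re.compile(r"[0-9A-Za-z\u0080-\U0010FFFF]+")
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def simple_tokens(text):
    """Split text into terms the way the emulated tokenizer would."""
    return [m.group().translate(_ASCII_LOWER) for m in _TERM.finditer(text)]


title = "What Is the New Social Media App \u2018BeReal\u2019 (and Do You Really Want to Be Real)?"
tokens = simple_tokens(title)
```

Under this emulation the curly quotes (U+2018 and U+2019) are not ASCII, so they stay attached to the word. The indexed term is `‘bereal’` rather than `bereal`, which would explain why the plain query finds nothing while `new social media app` still matches.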

Wevah commented 2 years ago

Looks like migrating the search table from FTS4 to FTS5 should fix this (FTS5's tokenizer seems to be properly aware of Unicode character classes). I'm not sure how best to do the migration, though; it may need a complete reindex.
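
A small way to compare the two modules outside the app is Python's built-in `sqlite3`. This is only a sketch: the table name `search`, the single `title` column, and the helper `matching_rowids` are made up for illustration and are not NetNewsWire's actual schema. It also assumes both tables use their default tokenizers.

```python
import sqlite3

TITLE = "What Is the New Social Media App \u2018BeReal\u2019 (and Do You Really Want to Be Real)?"


def matching_rowids(module, query, title=TITLE):
    """Index one title in a throwaway FTS table and return the matching rowids.

    Returns None if this SQLite build does not include the requested module.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # module is "fts4" or "fts5"; the default tokenizer is used in both cases.
        conn.execute(f"CREATE VIRTUAL TABLE search USING {module}(title)")
    except sqlite3.OperationalError:
        conn.close()
        return None
    conn.execute("INSERT INTO search(rowid, title) VALUES (1, ?)", (title,))
    rows = conn.execute(
        "SELECT rowid FROM search WHERE search MATCH ?", (query,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


fts4_hits = matching_rowids("fts4", "bereal")
fts5_hits = matching_rowids("fts5", "bereal")
```

On a build that includes both modules, I would expect `fts4_hits` to be empty and `fts5_hits` to contain the row. Because the two modules are separate virtual table implementations, moving an existing index over would most likely mean creating a new FTS5 table and re-inserting every article, which matches the "complete reindex" concern.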

Wevah commented 2 years ago

Another option would be to strip code points in the Unicode punctuation class before indexing.
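
A minimal sketch of that idea in Python, using the standard `unicodedata` module. The function name `strip_punctuation` is hypothetical, and replacing punctuation with a space instead of deleting it is my own assumption, chosen so that words separated only by punctuation (for example `fads/trends`) are not fused into one term.

```python
import unicodedata


def strip_punctuation(text):
    """Replace every code point in a Unicode punctuation category (P*) with a space."""
    return "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )


title = "What Is the New Social Media App \u2018BeReal\u2019 (and Do You Really Want to Be Real)?"
cleaned = strip_punctuation(title)
words = cleaned.split()
```

The general categories starting with `P` cover the curly quotes (`Pi` and `Pf`) as well as ASCII punctuation, so `BeReal` ends up as a standalone word before it reaches the index. One trade-off is that apostrophes are punctuation too, so a word like `don't` would be split in two. The same normalization would probably need to be applied to search queries as well.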