snarfed / bridgy

📣 Connects your web site to social media. Likes, retweets, mentions, cross-posting, and more...
https://brid.gy
Creative Commons Zero v1.0 Universal

reddit link search is too broad #1025

Open snarfed opened 3 years ago

snarfed commented 3 years ago

our reddit link search currently uses selftext:"DOMAIN" to search for links in post text, but that has false positives. e.g. https://brid.gy/reddit/lgats has domain luke.lol, and https://www.reddit.com/search/?q=selftext%3A%22luke.lol%22 returns posts like https://www.reddit.com/r/summerhousebravo/comments/mdl6v6/ciara_sucks_we_can_say_it/ , which just contains the words "Luke" and "lol" far apart from each other, with no link to the domain.

are there other search operators or syntax that do what we want? double quotes evidently don't. see https://www.reddit.com/wiki/search

from #1003:

Note that there is no guarantee that a selftext: match actually contains a link however the false-positives will be filtered out when searching for links to send the webmention.

...not quite true, since we're promiscuous and send webmentions (wms) to everything. not harmful, just a bit wasteful, and that waste is what we're trying to cut down in #1021.
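
One way to drop these false positives on our side, independent of whatever search operators reddit supports, would be to check each result's selftext for an actual URL on the domain before doing anything else with it. A minimal sketch, assuming we have the post's selftext as a string; `links_to_domain` and `URL_RE` are hypothetical names for illustration, not existing Bridgy code:

```python
import re
from urllib.parse import urlparse

# Rough URL matcher (an assumption, not reddit's own linkifier): stops at
# whitespace, brackets, parens, and quotes so markdown links like
# [text](https://example.com/x) yield just the URL.
URL_RE = re.compile(r"https?://[^\s<>\[\]()\"']+")


def links_to_domain(selftext, domain):
    """Return the URLs in selftext whose host is domain or a subdomain of it."""
    domain = domain.lower()
    matches = []
    for url in URL_RE.findall(selftext):
        # hostname is None for malformed URLs; strip a trailing dot left
        # over from sentence punctuation right after the host.
        host = (urlparse(url).hostname or '').lower().rstrip('.')
        if host == domain or host.endswith('.' + domain):
            matches.append(url)
    return matches
```

A post that only mentions "Luke" and "lol" separately returns an empty list and could be skipped. Note this only catches links with an explicit http:// or https:// scheme; bare mentions like luke.lol/foo would need extra handling.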

edent commented 1 year ago

I'm not sure if it's helpful, but you can search for posts rather than mentions using /domain/

For example, https://www.reddit.com/domain/shkspr.mobi/

snarfed commented 1 year ago

Interesting! Looks like that's a specific part of the web UI though. I'm not seeing much effect on search, eg https://www.reddit.com/search?q=%2Fshkspr.mobi%2F , and we use the search API.
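
For anyone trying other operators, here is a small sketch of how the web search URLs quoted in this thread are built from a raw query string. `search_url` is a hypothetical helper; it targets the web search page linked above, not the search API that Bridgy calls:

```python
from urllib.parse import urlencode


def search_url(query):
    """Build a reddit web search URL for a raw query string."""
    # urlencode percent-encodes ':', '"', and '/' in the query value.
    return 'https://www.reddit.com/search/?' + urlencode({'q': query})
```

With this, `search_url('selftext:"luke.lol"')` gives the same https://www.reddit.com/search/?q=selftext%3A%22luke.lol%22 URL as in the original report, and `search_url('/shkspr.mobi/')` ends in q=%2Fshkspr.mobi%2F.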