pushshift / api

Pushshift API

Subreddit and author restrictions include results beyond an exact name match #144

Open brendon-wong opened 1 year ago

brendon-wong commented 1 year ago

Hello! I was trying to restrict results to a specific subreddit, "web10", and noticed that content from other subreddits whose names contain the text "web10" is also returned. I assume this is not the intended behavior, because it can pull in a lot of unrelated results, especially when the subreddit name is a common word like "science", which appears in the names of many subreddits besides r/science.

In this example (https://api.pushshift.io/reddit/search/submission?subreddit=web10), content from r/web10, r/u_Psychological-Web10, and r/u_ronaldo-web10 appears. r/u_Psychological-Web10 is a subreddit. r/u_ronaldo-web10 is handled differently by Reddit: it displays a page indicating the user has been banned, rather than a page indicating the subreddit doesn't exist, so perhaps it was previously a subreddit.

I mistakenly opened this issue in the pmaw repo, and someone reported an issue with the author restriction as well: https://github.com/mattpodolak/pmaw/issues/60
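
Until this is fixed on the API side, a possible workaround is to filter the results on the client. The snippet below is only a minimal sketch: `filter_exact` is a hypothetical helper, the sample records are made up to mirror the three subreddits above, and it assumes each returned item is a dict with `subreddit` and `author` keys. It compares names case-insensitively, which is also an assumption about the desired matching.

```python
def filter_exact(items, field, value):
    """Keep only items whose `field` equals `value`, ignoring case.

    Hypothetical helper; assumes each item is a dict such as an entry
    in the "data" list of a search response.
    """
    wanted = value.casefold()
    return [
        item for item in items
        if str(item.get(field, "")).casefold() == wanted
    ]


# Made-up sample records shaped like the over-matched results in this issue.
sample = [
    {"subreddit": "web10", "author": "alice"},
    {"subreddit": "u_Psychological-Web10", "author": "Psychological-Web10"},
    {"subreddit": "u_ronaldo-web10", "author": "ronaldo-web10"},
]

exact = filter_exact(sample, "subreddit", "web10")
print([item["subreddit"] for item in exact])  # prints ['web10']
```

The same helper can be applied to the author restriction by passing `"author"` as the field. It only hides the extra results after they are downloaded, so requests still return the partial matches.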