praw-dev / praw

PRAW, an acronym for "Python Reddit API Wrapper", is a Python package that allows for simple access to Reddit's API.
http://praw.readthedocs.io/
BSD 2-Clause "Simplified" License

Lucene (default) search does not return correct results for phrase searching (with double quotes) #837

Closed taylorkline closed 7 years ago

taylorkline commented 7 years ago

Issue Description

The following search produces zero results:

query = "NOT (flair:expired OR flair:meta) \"yunnan sourcing\" OR yunnansourcing OR yunnansourcing.com"
reddit.subreddit("teasales").search(query, sort="new", time_filter="month")

The web-based search page, by contrast, returns 1 result for the same query.

It appears that either PRAW's search or the Reddit API does not handle quoted phrase searches correctly.

System Information

PRAW Version: 5.0.1
Python Version: 3.6
Operating System: Debian Testing

matthew0x40 commented 7 years ago

The API and the web-based search page are most likely returning different results due to them using different search engines.

About a month ago, Reddit launched a new search stack for their web-based search, but the Reddit API is still using the old search stack to ensure backwards compatibility. Currently there's no ETA as to when the new endpoint will be available.

For now, you can try qualifying each term in your search query with an explicit field:

NOT (flair:expired OR flair:meta) author:yunnansourcing OR selftext:\"yunnan sourcing\" OR selftext:\"yunnansourcing.com\" OR url:yunnansourcing

taylorkline commented 7 years ago

You got it @kwwxis!

NOT (flair:expired OR flair:meta) selftext:"yunnan sourcing" OR title:"yunnan sourcing" OR author:"yunnansourcing" OR selftext:"yunnansourcing" OR title:"yunnansourcing" OR site:"yunnansourcing.com" OR selftext:"yunnansourcing.com" OR title:"yunnansourcing.com"

...does the trick for a comprehensive search, returning the one result I expect. Thanks!
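
For anyone who needs to build a similar query for other terms, here is a minimal sketch of generating the field-qualified clauses programmatically. The helper name `fielded_query` and its parameters are hypothetical (they are not part of PRAW); it only assembles the query string, which is then passed to `subreddit.search` as in the original report.

```python
def fielded_query(phrases, fields=("selftext", "title"), excluded_flairs=()):
    """Build a Lucene-style query matching each quoted phrase in each field.

    Hypothetical helper for illustration; not part of PRAW.
    """
    # One clause per (phrase, field) pair, e.g. selftext:"yunnan sourcing"
    clauses = [f'{field}:"{phrase}"' for phrase in phrases for field in fields]
    query = " OR ".join(clauses)
    if excluded_flairs:
        # Prefix a NOT group to filter out posts with unwanted flairs
        exclusion = " OR ".join(f"flair:{flair}" for flair in excluded_flairs)
        query = f"NOT ({exclusion}) {query}"
    return query


query = fielded_query(
    ["yunnan sourcing", "yunnansourcing", "yunnansourcing.com"],
    excluded_flairs=["expired", "meta"],
)

# Usage with an authenticated praw.Reddit instance named `reddit`
# (assumed to exist, as in the original report):
# results = reddit.subreddit("teasales").search(query, sort="new", time_filter="month")
```

The `author:` and `site:` clauses from the query above are not generated by this sketch; they can be appended to the returned string with ` OR ` if needed.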