mattpodolak / pmaw

A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
MIT License
212 stars 28 forks

Score parameters #20

Closed Hastte closed 3 years ago

Hastte commented 3 years ago

Hello. When I try api.search_submissions(subreddit="science", limit=None, score>=1000) I get an error: SyntaxError: positional argument follows keyword argument. Do you know how to fix it? Thanks.

mattpodolak commented 3 years ago

Hi @Hastte, can you try the following: api.search_submissions(subreddit="science", limit=None, score=">1000")

Let me know if this works
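For readers wondering why the string form works where the original call did not, here is a minimal sketch. The `build_params` helper is hypothetical and is not pmaw's implementation; it only illustrates that `score=">1000"` is an ordinary keyword argument carrying a string, whereas `score>=1000` is a comparison expression that Python parses as a positional argument placed after keyword arguments.

```python
def build_params(**filters):
    """Hypothetical helper (not part of pmaw): gather keyword filters into a dict."""
    return {name: str(value) for name, value in filters.items() if value is not None}


# The comparison operator travels inside the string value.
params = build_params(subreddit="science", limit=None, score=">1000")
print(params)

# `score>=1000` is an expression, not a keyword argument, so the call
# is rejected at compile time, before any function is ever invoked.
message = None
try:
    compile('f(subreddit="science", limit=None, score>=1000)', "<example>", "eval")
except SyntaxError as err:
    message = err.msg
print(message)
```

The second part reproduces the reported SyntaxError without needing pmaw installed or network access.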

Hastte commented 3 years ago

Yes, that kind of works, but I get fewer results than usual when using the Pushshift API directly. Edit: when I set the limit to 20k, for example, I get 20 results, and if I set the limit to 10k I get 200 results, BUT 40 of them are above 20k.

mattpodolak commented 3 years ago

Can you share your queries?

I tested the score parameter and got the results I expected. Using Pushshift directly, we can see that 16 results exist (https://api.pushshift.io/reddit/search/submission/?q=dodo%20bird&metadata=true&score=%3E1000).

        "execution_time_milliseconds": 98.48,
        "index": "rs",
        "metadata": "true",
        "q": "dodo bird",
        "ranges": [],
        "results_returned": 16,
        "score": [
            ">1000"
        ],
        "shards": {
            "failed": 0,
            "skipped": 0,
            "successful": 24,
            "total": 24
        },
        "size": 25,
        "sort": "desc",
        "sort_type": "created_utc",
        "timed_out": false,
        "total_results": 16

Using the same query, re-written in pmaw, we get 16 results:

posts = api.search_submissions(q="dodo bird", score=">1000") # reports 16 available results
len(posts) # 16

Hastte commented 3 years ago

[three images]

And now it's the opposite

mattpodolak commented 3 years ago

Your score parameter for Pushshift is malformed: "score>20000" should be "score=>20000".

This is the correct query: https://api.pushshift.io/reddit/search/submission/?subreddit=science&metadata=true&score=%3E20000

It has the same number of results as pmaw, so I will be closing this issue.
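As a rough illustration of the difference between the two query strings, the sketch below builds the corrected URL with Python's standard library and shows what a standard query-string parser makes of the malformed one. The `build_url` helper is an assumption introduced for this example. How the Pushshift server itself handled the malformed pair is not shown in this thread; the sketch only demonstrates that without the `=` there is no `score` key at all.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://api.pushshift.io/reddit/search/submission/"


def build_url(**params):
    """Hypothetical helper: percent-encode the parameters and append them to BASE_URL."""
    return BASE_URL + "?" + urlencode(params)


# Correct form: the key is `score`, the value is `>20000` (encoded as %3E20000).
url = build_url(subreddit="science", metadata="true", score=">20000")
print(url)

# Malformed form from the thread: `score>20000` has no `=`, so it is not a key/value pair.
malformed = BASE_URL + "?subreddit=science&metadata=true&score>20000"
parsed_malformed = parse_qs(urlsplit(malformed).query)
parsed_correct = parse_qs(urlsplit(url).query)

print(parsed_malformed)  # no 'score' key
print(parsed_correct)    # 'score' maps to ['>20000']
```

Letting `urlencode` produce the query string avoids hand-writing the comparator, which is where the missing `=` crept in.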