mattpodolak / pmaw

A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
MIT License
213 stars, 28 forks

Fetching comments never completes #43

Open kumarde opened 2 years ago

kumarde commented 2 years ago

Hi there!

I have a very simple PMAW script:

import sys
from pmaw import PushshiftAPI
import json
import os

api = PushshiftAPI(num_workers=os.cpu_count(), rate_limit=100)

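# Path to a text file containing one comment ID per line
id_f = sys.argv[1]
# Use a context manager so the file handle is closed after reading
with open(id_f, 'r') as f:
    comment_ids = [l.strip() for l in f]
comments_arr = api.search_comments(ids=comment_ids)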
for c in comments_arr:
    print(json.dumps(c))

The script takes in IDs from a file, puts them into a list, and passes them into search_comments.

This script worked fine once, on a large batch of IDs, but for some reason it now never completes, even on a test set of 10 IDs. I'm certain I must be doing something silly, or the API has potentially changed. Could someone point me in the right direction?

Thanks!

mlinegar commented 1 year ago

I'm running into this as well! Did you ever figure out a workaround?

venkatasg commented 1 year ago

You should always check whether the Pushshift API is down first - that turned out to be my problem.
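
One way to do that check before starting a long PMAW run is sketched below. This is only a rough illustration using the standard library, not part of PMAW: the function name api_is_up and the endpoint in PUSHSHIFT_URL are assumptions, so substitute whatever Pushshift URL you actually query. The opener parameter exists only so the check can be exercised without network access.

```python
import urllib.request

# Assumption: illustrative endpoint only; replace with the Pushshift URL you rely on.
PUSHSHIFT_URL = "https://api.pushshift.io/reddit/comment/search?size=1"


def api_is_up(url=PUSHSHIFT_URL, timeout=10, opener=urllib.request.urlopen):
    """Return True if `url` answers with HTTP 200 within `timeout` seconds.

    `api_is_up` is a hypothetical helper, not a PMAW function.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all subclasses of OSError,
        # so connection failures, HTTP error codes and timeouts all land here.
        return False
```

Calling api_is_up() at the top of the script and exiting early when it returns False would at least distinguish "the API is unreachable" from "the search is hanging for some other reason".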