mattpodolak / pmaw

A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
MIT License

Comment and submission search snags #56

Open jarohde opened 1 year ago

jarohde commented 1 year ago

Hello, I've been noticing in the last couple of days that pmaw will frequently snag on batches of either submission or comment searches. It always seems to eventually pull the data, but extracting a small set of ~500 posts can take anywhere between 10 seconds and 5 minutes. Any thoughts on why?

I modified pmaw to print out the URL endpoints to see if a particular search is causing the problem, but clicking on the pushshift links shows the data in the browser incredibly fast.
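For anyone who wants to repeat that comparison outside the browser, here is a minimal sketch that times each printed URL with the standard library and sorts the results slowest first. The names `fetch` and `time_urls` are made up for this illustration and are not part of pmaw. The URLs would be whatever endpoints pmaw printed.

```python
import time
import urllib.request
from typing import Callable, Iterable, List, Tuple


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download a URL body with the standard library only (no pmaw involved)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def time_urls(
    urls: Iterable[str],
    fetch_fn: Callable[[str], bytes] = fetch,  # hypothetical hook, handy for testing
) -> List[Tuple[str, float]]:
    """Fetch each URL once and return (url, seconds) pairs, slowest first."""
    timings = []
    for url in urls:
        start = time.perf_counter()
        fetch_fn(url)
        timings.append((url, time.perf_counter() - start))
    return sorted(timings, key=lambda pair: pair[1], reverse=True)


# Example usage (needs network access; paste in the URLs that pmaw printed):
# for url, seconds in time_urls(printed_urls):
#     print(f"{seconds:6.2f}s  {url}")
```

If every URL comes back quickly here but the same batch stalls inside pmaw, that would suggest the delay comes from how the requests are scheduled or retried rather than from a particular endpoint. That is only a guess.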

I'm attaching a recent example of a pull of 500 posts with the query "test". It took a little over two minutes to pull the data. I also highlighted, with a red arrow, the link that caused pmaw to snag for most of this request. Any thoughts on what might be causing this?
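A rough sketch of how a pull like this can be reproduced and timed is below. `timed_pull` is a hypothetical helper written for this post. It assumes `api` is a `pmaw.PushshiftAPI` instance, whose `search_submissions` method accepts a `q` query string and a `limit`.

```python
import time
from typing import Any, List, Tuple


def timed_pull(api: Any, query: str = "test", limit: int = 500) -> Tuple[List[Any], float]:
    """Run one submission search and return (posts, elapsed_seconds).

    `api` is assumed to expose search_submissions(q=..., limit=...),
    as pmaw's PushshiftAPI does.
    """
    start = time.perf_counter()
    # Materialise the results so the timing covers the whole download
    posts = list(api.search_submissions(q=query, limit=limit))
    return posts, time.perf_counter() - start


# Example usage (needs pmaw installed and network access):
# from pmaw import PushshiftAPI
# posts, seconds = timed_pull(PushshiftAPI(), query="test", limit=500)
# print(f"pulled {len(posts)} submissions in {seconds:.1f}s")
```

Running it several times in a row should show whether the duration varies as widely as described above.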

Thanks for maintaining pmaw -- it's such a great tool!

[Attached screenshot: pmaw_example]