jonlee112 closed this issue 2 years ago
Hi @jonlee112, I think this is related to an issue in the code where a bad request is repeatedly retried. I should be able to push a fix to pmaw addressing this scenario.
@mattpodolak Thank you so much! Would be much appreciated as I could gather a lot more data overnight when I can't be around to monitor the progress.
I've been having the same issue. I left a script running overnight and later realised it had made ~100,000 requests for the same 3 comments.
It can be consistently recreated with:
results = api.search_comments(subreddit='incels', after=1469919600, before=1470006000)
Fixed this issue in 2.1.2.
Many thanks @mattpodolak !
Sometimes a request within a loop of requests ends up going through hundreds (thousands?) of batches trying to locate a single comment. I'd rather just skip that comment and move on to the next request. Is there any way to implement this in the code?
Example output:
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 19 - Batches: 10 - Items Remaining: 1
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 29 - Batches: 20 - Items Remaining: 1
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 39 - Batches: 30 - Items Remaining: 1
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 49 - Batches: 40 - Items Remaining: 1
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 59 - Batches: 50 - Items Remaining: 1
INFO:pmaw.PushshiftAPIBase:Checkpoint:: Success Rate: 100.00% - Requests: 69 - Batches: 60 - Items Remaining: 1
etc.
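For anyone who wants the "skip it and move on" behaviour described above, here is a rough sketch of the general idea: stop after a fixed number of consecutive batches that return nothing new, and report the leftover ids as skipped. This is not pmaw's internal code or its API. `fetch_with_stall_guard`, `fetch_batch`, and `max_stalled_batches` are hypothetical names made up for illustration, and `fetch_batch` stands in for whatever function actually performs one batch of requests.

```python
from typing import Callable, Iterable, List, Set, Tuple


def fetch_with_stall_guard(
    fetch_batch: Callable[[Set[str]], Iterable[str]],
    wanted_ids: Iterable[str],
    max_stalled_batches: int = 5,
) -> Tuple[List[str], Set[str]]:
    """Fetch ids batch by batch, giving up on ids that never come back.

    fetch_batch is a hypothetical callable: it receives the ids that are
    still missing and returns the ids it retrieved in that batch.
    Returns (found_ids, skipped_ids).
    """
    remaining = set(wanted_ids)
    found: List[str] = []
    stalled = 0  # consecutive batches that made no progress

    while remaining and stalled < max_stalled_batches:
        # Ignore anything the batch returns that we did not ask for.
        got = set(fetch_batch(remaining)) & remaining
        if got:
            found.extend(sorted(got))
            remaining -= got
            stalled = 0  # progress was made, so reset the counter
        else:
            stalled += 1

    # Whatever is left could not be located within the allowed batches.
    return found, remaining


if __name__ == "__main__":
    # Stand-in fetcher that can never locate "c3", mimicking the stuck item.
    def fake_fetch(ids: Set[str]) -> List[str]:
        return [i for i in ids if i != "c3"]

    found, skipped = fetch_with_stall_guard(fake_fetch, ["c1", "c2", "c3"], 3)
    print("found:", found)
    print("skipped:", skipped)
```

The counter resets whenever a batch makes progress, so a slow but productive run is never cut short. Only a run of consecutive empty batches, like the "Items Remaining: 1" loop in the log, triggers the skip.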