rhiever closed this issue 11 years ago
there are so many submissions to scrape that the script dies with an HTTP 500 Internal Server Error
That's not the issue. The problem is that reddit currently has some broken comment trees, which result in 500 errors on some submissions. You just need to apply a fix similar to the one I made for subreddit_stats: https://github.com/praw-dev/prawtools/commit/3227acb23435d14540f8bda8e8a60f66980d1b78#L1R207
Probably just put the try/except around the processSubmission call in processSubreddit.
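For reference, a minimal sketch of that wrapper (the function names come from the thread; the listing call and the exception class are assumptions based on the praw 2.x era API, not the script's actual code):

```python
from requests.exceptions import HTTPError  # assumption: praw surfaced these for 500s


def processSubreddit(subreddit):
    for submission in subreddit.get_top_from_all(limit=None):  # assumed listing call
        try:
            processSubmission(submission)
        except HTTPError as exc:
            # reddit's broken comment trees can 500 on individual
            # submissions; skip them rather than killing the whole run
            print('Skipping {0}: {1}'.format(submission.id, exc))
```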
Ahhh, thanks for pointing that out! Will split this into two issues then -- one to fix the bug, one to add the command line param.
Not seeing your comment anymore -- but you only need to wrap processSubmission in the processSubreddit code, as processSubmission called from processUser doesn't actually fetch the submission and thus won't receive the 500 error, unless something is really broken on reddit, in which case the results will likely be wildly incorrect and you'll want to re-run anyway.
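To make that distinction concrete, here's a hypothetical sketch of the two call paths (the fetch_comments flag is invented for illustration, and the praw calls are assumptions about the script, not its actual code):

```python
def processSubmission(submission, fetch_comments=True):
    # Hypothetical signature, shown only to illustrate the two call paths.
    if fetch_comments:
        # Only the processSubreddit path asks for the comment tree; this
        # extra request is where broken comment trees surface as HTTP 500s.
        submission.replace_more_comments()  # praw 2.x call
    print(submission.score, str(submission.author))  # stand-in for real stats


def processUser(user):
    # Listing items already carry the submission data, so no extra
    # request happens here and no 500 can occur.
    for submission in user.get_submitted(limit=None):  # praw 2.x call
        processSubmission(submission, fetch_comments=False)
```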
I removed my comment b/c I saw you already answered it above. :-)
Just curious: why are you making pull requests to merge? It's great to work in a separate branch, but having direct-push access means you can merge directly without ever pushing the temporary branch remotely.
Just practicing working in a separate branch. Probably better practice in the long run?
Makes sense. I just wanted to check if you wanted me to follow the same procedure if/when I make changes. Given your response, it seems skipping the PR step is okay.
Mostly trying to avoid commit issues like we ran into earlier. Those are icky.
It's definitely going to happen from time to time no matter how much branching you do. The key takeaway is that if the merge can't happen automatically, you need to proceed with caution. For instance, you could have replied that the merge wouldn't auto-commit, and I could have updated it so that it would (that's essentially what I did with the branch you reset to). You probably noticed I didn't get it completely correct, as I forgot to propagate the file deletion as part of my merge.
:+1:
Sometimes in huge subreddits, there are so many submissions to scrape that the script dies with an HTTP 500 Internal Server Error.
(see comments below)