carcinocron opened 10 years ago
I could be totally wrong, though. My logs don't show anything. While writing this, I realized the following:
# Download the image
download_from_url(URL, FILEPATH)
# Image downloaded successfully!
print ' Downloaded URL [%s] as [%s].' % (URL.encode('utf-8'), FILENAME.encode('utf-8'))
DOWNLOADED += 1
FILECOUNT += 1
Which means a failed download wouldn't get logged anyway, because the URL is only logged after the download succeeds. I just made the following change:
print ' Attempting to Download URL [%s] as [%s].' % (URL.encode('utf-8'), FILENAME.encode('utf-8'))
# Download the image
download_from_url(URL, FILEPATH)
# Image downloaded successfully!
print ' Downloaded URL [%s] as [%s].' % (URL.encode('utf-8'), FILENAME.encode('utf-8'))
DOWNLOADED += 1
FILECOUNT += 1
Now my logs should record the URL before the download attempt fails (and all the evidence is lost), so hopefully I can follow up with better information.
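Going one step further, the download call itself could be wrapped in a try/except so the log also records why an attempt failed. A minimal sketch, reusing the snippet's own names and assuming download_from_url lets urllib2's HTTPError/URLError propagate:

    from urllib2 import HTTPError, URLError

    print ' Attempting to Download URL [%s] as [%s].' % (URL.encode('utf-8'), FILENAME.encode('utf-8'))
    try:
        # Download the image
        download_from_url(URL, FILEPATH)
    except (HTTPError, URLError) as ERROR:
        # The attempt line above already names the URL;
        # this records why the attempt failed
        print ' Download of URL [%s] failed: %s' % (URL.encode('utf-8'), ERROR)
    else:
        # Image downloaded successfully!
        print ' Downloaded URL [%s] as [%s].' % (URL.encode('utf-8'), FILENAME.encode('utf-8'))
        DOWNLOADED += 1
        FILECOUNT += 1

The else branch keeps the success bookkeeping out of the try block, so only a failure of the download call itself is caught.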
That seems like a reasonable change; I've added it to the script now. Did you get any further tracking down your 404 issue?
I haven't modified the code in any way, and I get this too.
Traceback (most recent call last):
File "/home/username/Apps/RedditImageGrab/redditdownload.py", line 268, in <module>
URLS = extract_urls(ITEM['url'])
File "/home/usernameApps/RedditImageGrab/redditdownload.py", line 197, in extract_urls
urls = process_deviant_url(url)
File "/home/username/Apps/RedditImageGrab/redditdownload.py", line 167, in process_deviant_url
response = urlopen(url)
File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 558, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
I got it from running:
python2 redditdownload.py -sfw FractalPorn /home/username/WALLPAPER -score 50
Fair warning: my version of the script is modified, and these modifications were my first attempt ever at Python, but this stack trace is similar enough to #10 that it probably affects the original code, too.
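For what it's worth, a 403 from urllib2 is often the server rejecting the default Python-urllib User-Agent rather than anything the script did with the URL. A sketch of how process_deviant_url could send a browser-like agent and skip a refused page instead of crashing the whole run (the header value and the fallback of returning the original URL are my assumptions, not the script's current behaviour):

    import urllib2

    def process_deviant_url(url):
        # Some hosts answer urllib2's default 'Python-urllib' agent string
        # with HTTP 403 Forbidden; a browser-like User-Agent usually passes.
        request = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
        try:
            response = urllib2.urlopen(request)
        except urllib2.HTTPError as error:
            # Don't let one refused page abort the whole run; fall back to
            # the submitted URL and let the caller try it directly.
            print ' process_deviant_url got %s for [%s]' % (error, url.encode('utf-8'))
            return [url]
        # ...the function's existing scraping of `response` would continue here...
        return [url]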
I keep getting this error:
I think this is caused by running the code without the --update flag and the script reaching the absolute last entry in the sub's list of posts; it seems to be specifically 404'ing on the URL of the "next page". Other than that, all the images that I would reasonably expect to download successfully seem to be downloading successfully.
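If that theory is right, a fix would be to treat a 404 on the next-page fetch as the end of the listing rather than letting the exception propagate. A rough sketch of the idea; getitems, SUBREDDIT, and LAST are stand-ins for whatever the script actually uses to page through posts:

    import urllib2
    from reddit import getitems  # the script's listing helper (assumed import and signature)

    SUBREDDIT = 'FractalPorn'  # the sub from the command above
    LAST = ''                  # id of the last item seen, '' for the first page

    while True:
        try:
            ITEMS = getitems(SUBREDDIT, LAST)
        except urllib2.HTTPError as ERROR:
            if ERROR.code == 404:
                # Reddit has no "next page" past the sub's last post;
                # treat that as end-of-listing instead of crashing.
                break
            raise  # any other HTTP error is a real problem
        if not ITEMS:
            break  # an empty page also means we're done
        for ITEM in ITEMS:
            pass  # download each ITEM exactly as the script already does
        LAST = ITEMS[-1]['id']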