Closed bamford closed 10 years ago
I've also fixed a couple more bugs I encountered when importing a folder of PDF files from various sources.
Hey, thanks for the improvements to PDF import and HTTP requests. I'll review things in a day or two. (Mostly I need to review why we used httplib
in the first place.) I recall some incompatibilities in urllib
and urllib2
between python 2.6 and 2.7 that I need to check. Thanks again.
Ah, I only tested on Python 2.7, so I guess this might be an issue if you want to keep supporting 2.6 (maybe a good idea for now.)
Hi. httplib
was being used to perform a direct GET
of the URL and read the Location from the returned headers. I modified getRedirect
(3735b2e) to use urllib2
, while making sure to catch URLErrors: using urllib2.urlopen
to fetch an ApJ PDF redirection when you don't have access to the journal explodes with a 403 forbidden error.
This fix is in PR #33 and replaces @bamford commit 0857258. I agree that the other 2 commits should be pulled in. @bamford could you eventually remove the first commit from your PR to make life easier for @jonathansick ? Cheers.
Thanks for the better-behaved version @RuiPereira!
I don't think I can "remove" the first commit from the pull request, so I'll add a commit to revert 0857258.
There you go. @jonathansick should now be able to pull in PR #31 the PR #33 without any problems.
I found that adsbibtex was timing out on the line
due to
httplib
not using my proxy settings.A bit of Googling and I discovered that the same functionality is provided by
urllib2
, which features proxy auto-detection. It now works fine for me behind a proxy, and I've checked that it still determines the correct mirror (which was the only thing using thegetRedirect
function).