benboughton1 closed this issue 8 years ago
I'm also getting a 403 Forbidden since yesterday when trying to log in via Python and wget.
I'm also experiencing this error.
Same error here.
This may be caused by the addition of a CSRF token in the ERS login form. I don't recall that being present before.
Dear all, Thanks for signalling the change in USGS policy. I am on holiday with a poor connection; I can try that next week. Besides, I am not sure I know how to handle such a token in a Python login. Does any of you know? Best regards, Olivier
Can do. I have a working example for no_proxy so I'll change proxy to what I have but can't test it.
Great! Thanks a lot! Please do a pull request when you are ready. I'll try it as soon as possible, and will try to implement the proxy part, at least with CNES's proxy (which is a hard one). Best regards, Olivier
Looking at the form HTML, it appears there are two hidden fields, csrf_token and __ncforminfo. I imagine that both of those would have to be supplied on the form POST.
Although I'm using Java with Apache HttpClient to download the Landsat files, I was facing the same problem there. Here is how I got it running again. It might be helpful for you as well:
The __ncforminfo token is not important; the login works even without posting it. The csrf_token must be read out and submitted again. The important change for me was to send the whole header information again when posting the username and password together with the csrf token. Here is the Java code, for completeness:
HttpClientContext context = HttpClientContext.create();
CookieStore cookieStore = new BasicCookieStore();
context.setCookieStore(cookieStore);
CloseableHttpClient client = HttpClientBuilder.create().build();
HttpGet get = new HttpGet("https://ers.cr.usgs.gov/login/");
get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
HttpResponse response = client.execute(get, context);
Get the information for the csrf token from the response of the GET method.
List<NameValuePair> paramList = new ArrayList<NameValuePair>();
paramList.add(new BasicNameValuePair("username", user));
paramList.add(new BasicNameValuePair("password", pwd));
paramList.add(new BasicNameValuePair("csrf_token", csrf_token));
HttpPost post = new HttpPost("https://ers.cr.usgs.gov/login/");
post.setHeaders(get.getAllHeaders());
UrlEncodedFormEntity urlEncodedFormEntity = new UrlEncodedFormEntity(paramList, "UTF-8");
post.setEntity(urlEncodedFormEntity);
HttpResponse response2 = client.execute(post, context);
This gives me a 302, and I am then ready to download the files. Hope this helps. Good luck, Gunther
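For Python users, the same steps (GET the form, read csrf_token, POST it back with the credentials and the same headers, all through one cookie store) can be sketched with the standard library alone. This is only a minimal sketch in Python 3 syntax, whereas the script discussed here uses Python 2's urllib2. The helper names extract_csrf_token, build_login_post and login are hypothetical, and it assumes the login page still exposes csrf_token as an input with double-quoted name and value attributes in that order.

```python
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

LOGIN_URL = "https://ers.cr.usgs.gov/login/"  # URL quoted in this thread
# Same Accept header as in the Java example; reused for the GET and the POST.
HEADERS = {"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"}


def extract_csrf_token(html):
    """Return the csrf_token value found in the login form, or None."""
    m = re.search(r'<input .*?name="csrf_token".*?value="(.*?)"', html)
    return m.group(1) if m else None


def build_login_post(html, user, pwd):
    """Build the POST request carrying username, password and csrf_token."""
    token = extract_csrf_token(html)
    if token is None:
        raise ValueError("csrf_token not found in login page")
    body = urllib.parse.urlencode(
        {"username": user, "password": pwd, "csrf_token": token}
    ).encode("utf-8")
    # Passing data makes this a POST; the GET headers are sent again.
    return urllib.request.Request(LOGIN_URL, data=body, headers=HEADERS)


def login(user, pwd):
    """GET the form, then POST the credentials through one cookie-aware opener."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    with opener.open(urllib.request.Request(LOGIN_URL, headers=HEADERS)) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    # urllib follows redirects, so the status returned here is the one
    # after the 302, not the 302 itself.
    with opener.open(build_login_post(html, user, pwd)) as resp:
        return opener, resp.getcode()
```

Only login() touches the network; build_login_post() can be checked offline against a saved copy of the login page.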
I did a quick fix in case anyone needs this. I'm sure Olivier will make this much cleaner. I had to pip install BeautifulSoup to parse the html.
If you need this going asap this works for me.
def connect_earthexplorer_no_proxy(usgs):
    # Keep the session cookie between the GET and the POST
    cookies = urllib2.HTTPCookieProcessor()
    opener = urllib2.build_opener(cookies)
    urllib2.install_opener(opener)
    # Read the CSRF token from the hidden field of the login form
    soup = BeautifulSoup(urllib2.urlopen("https://ers.cr.usgs.gov/login").read())
    token = soup.find('input', {'name': 'csrf_token'})
    # Post the credentials together with the token
    params = urllib.urlencode(dict(username=usgs['account'], password=usgs['passwd'], csrf_token=token['value']))
    request = urllib2.Request("https://ers.cr.usgs.gov/login", params, headers={})
    f = urllib2.urlopen(request)
    data = f.read()
    f.close()
    # The login page shows this message again when the login was rejected
    if data.find('You must sign in as a registered user to download data or place orders for USGS EROS products') > 0:
        print "Authentication failed"
        sys.exit(-1)
    return
Thanks a lot Mike. It looks much simpler now, and it works! I did not know this BeautifulSoup library. The only drawback is that we need to install it. Olivier
Here is an alternative using regex (not as robust, but with no external dependency):
import re
...
data = urllib2.urlopen("https://ers.cr.usgs.gov/login").read()
m = re.search(r'<input .*?name="csrf_token".*?value="(.*?)"', data)
if m:
    token = m.group(1)
Another possible alternative that doesn't require external dependencies is https://docs.python.org/2/library/htmlparser.html
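A rough sketch of what that could look like, in Python 3 syntax (there the module is html.parser; the linked Python 2 module is named HTMLParser). The class name HiddenFieldParser and the function hidden_fields are made up for illustration. Unlike a regex, this does not depend on attribute order or quote style.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples; value may be None.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the given HTML text."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

The token would then be hidden_fields(data).get("csrf_token"), where data is the decoded login page.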
Nice using regex! I was going to look into that today.
@mkmitchell I get TypeError: 'module' object is not callable using your code posted above. Any idea how to solve this?
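In Python this error generally means that a module was called as if it were a class. The failing import line is not shown, so this is only a guess: with the BeautifulSoup 3 package, "import BeautifulSoup" binds the module, while "from BeautifulSoup import BeautifulSoup" binds the class (with the newer bs4 package it is "from bs4 import BeautifulSoup"). The same effect can be reproduced with a standard-library module, shown here in Python 3 syntax:

```python
import datetime  # binds the *module* named datetime

try:
    datetime(2016, 7, 1)  # calling the module itself fails
except TypeError as exc:
    message = str(exc)

# Importing the class from the module makes the same call work.
from datetime import datetime as datetime_class

stamp = datetime_class(2016, 7, 1)
```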
I am testing dswanepoel's suggestion, which seems to work well. That will let us avoid BeautifulSoup (no Soup in summer ;) )
I will push the new version soon. I still need to test with the proxy version. Olivier
Done.
Dear Mr. Hagolle, I used the Landsat-8 download script (with a list of products as input) a few months ago without problems. Today, after one or two downloads, I get this error: "CSRF_Token not found". Is it a limitation on the USGS side? Thanks for your help. Christophe
Is anyone else experiencing 'USGS not currently responding to requests'?
When catching the error it is giving me a 403 Forbidden.
My user name and password work in the Earth Explorer web interface when I try to download the exact file using the link LANDSAT-Download generates.
I have logged out of Earth Explorer before using this script as well.