semiautomaticgit / SemiAutomaticClassificationPlugin

https://fromgistors.blogspot.com/p/semi-automatic-classification-plugin.html

access scihub.copernicus.eu/apihub through proxy fails #167

Closed d-lombardi-a closed 9 months ago

d-lombardi-a commented 3 years ago

Trying to search and download scihub.copernicus.eu/apihub products from behind an http proxy fails.

I found a possible issue in https://github.com/semiautomaticgit/SemiAutomaticClassificationPlugin/blob/4d861573d099c09b0d7ff4950419fdcaa40e4390/core/utils.py#L312-L314, where the calls to cfg.urllibSCP.proxyHandler should be replaced with cfg.urllibSCP.request.proxyHandler.
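A quick way to check the attribute path, as a minimal sketch. It assumes cfg.urllibSCP is simply an alias for the standard urllib package (the plugin's import is not shown in this thread, so that is an assumption). Note that the standard-library class is spelled ProxyHandler, with a capital P:

```python
import urllib
import urllib.request

# Assumption: urllibSCP stands in for the plugin's cfg.urllibSCP alias.
urllibSCP = urllib

# In Python 3 the handler classes live in the urllib.request submodule,
# not on the top-level urllib package.
on_package = hasattr(urllibSCP, "ProxyHandler")
on_request = hasattr(urllibSCP.request, "ProxyHandler")

print("urllib.ProxyHandler exists:", on_package)
print("urllib.request.ProxyHandler exists:", on_request)
```

If the alias really is the bare urllib package, the first lookup fails with an AttributeError at call time, which would be consistent with the change proposed above.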

Even after modifying the source code, search and download still fail, with QGIS reporting: 2021-03-22T12:41:15 CRITICAL Error [50] : Internet error:. Capturing the connection attempt with Wireshark shows that the request goes out directly, bypassing the proxy (in the capture screenshot, the source IP is the private IP of my machine and the destination IP is the public IP of scihub.copernicus.eu).
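One way to confirm, without Wireshark, whether a urllib opener really routes traffic through a proxy is to point it at a local stand-in proxy and see what arrives there. The sketch below is illustrative only: FakeProxy and the example.invalid URL are hypothetical, and it exercises plain http only (for https, urllib opens a CONNECT tunnel through the proxy, which this stand-in does not implement).

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib import request

seen = []  # request targets received by the stand-in proxy


class FakeProxy(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real proxy: records the target, answers 200."""

    def do_GET(self):
        seen.append(self.path)
        body = b"via proxy"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the output quiet


# Port 0 lets the OS pick a free port for the local stand-in proxy.
server = HTTPServer(("127.0.0.1", 0), FakeProxy)
threading.Thread(target=server.serve_forever, daemon=True).start()
proxy_url = "http://127.0.0.1:%d" % server.server_port

# Only the ProxyHandler is passed explicitly; build_opener adds the defaults.
opener = request.build_opener(request.ProxyHandler({"http": proxy_url}))
with opener.open("http://example.invalid/apihub/search?q=*", timeout=5) as resp:
    status, body = resp.status, resp.read()

server.shutdown()
server.server_close()
print(status, body, seen)
```

When the request is proxied, the stand-in receives the absolute URL as the request target. If the list stays empty and the call fails with a name-resolution error instead, the opener tried to connect directly.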

I tried to replicate SCP's access and search mechanism with the following custom test script. I found it useful to add an https entry to the ProxyHandler's proxies dictionary, pointing to the same destination proxy, since in my case the same HTTP proxy on port 8080 handles both http and https requests:

from urllib import request, error

# Local variables
sciHubRealm = 'Sentinels Scientific Data Hub Search'
sciHubTopLevelUrl = 'https://scihub.copernicus.eu/'
sciHubUser = '...'
sciHubPassword = '...'
proxyHost = '...'
proxyPort = '...'
proxyUser = '...'
proxyPassword = '...'

# Create the proxy handler (the same proxy serves both http and https)
proxyUrl = proxyHost + ':' + proxyPort
proxyHandler = request.ProxyHandler({'http': proxyUrl, 'https': proxyUrl})

# Create the proxy auth handler
pswMngProxy = request.HTTPPasswordMgrWithDefaultRealm()
pswMngProxy.add_password(None, proxyHost, proxyUser, proxyPassword)
proxy_auth_handler = request.ProxyBasicAuthHandler(pswMngProxy)

# Create the remote host http auth handler
pswMng = request.HTTPPasswordMgrWithDefaultRealm()
pswMng.add_password(sciHubRealm, sciHubTopLevelUrl, sciHubUser, sciHubPassword)
passwordHandler = request.HTTPBasicAuthHandler(pswMng)

# Create the cookie handler
cookieHandler = request.HTTPCookieProcessor()

# Create the opener with the handlers created above
opener = request.build_opener(proxy_auth_handler, proxyHandler, cookieHandler, passwordHandler)

try:
    response = opener.open(request.Request('https://scihub.copernicus.eu/apihub/search?start=0&rows=10&q=*'))
    print(response)
except error.HTTPError as e:
    print(e.code, e.reason, '\n', e.headers)

From the custom script, access to any other http or https web resource returns HTTP 200 OK, but every request to scihub.copernicus.eu/apihub/search returns HTTP error 404 Not Found:

404
Not Found
server: Apache-Coyote/1.1
pragma: no-cache
x-xss-protection: 1; mode=block
x-frame-options: DENY
x-content-type-options: nosniff
content-type: text/html;charset=utf-8
content-language: en
content-length: 1089
vary: Accept-Encoding
date: Mon, 22 Mar 2021 11:10:15 GMT
access-control-allow-credentials: true
access-control-allow-origin: 
connection: close

Surprisingly, the very same requests made from a browser return the expected search results.
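To compare what the script sends with what a browser sends, the standard handlers can be created with debuglevel=1 so that http.client prints the request line, the outgoing headers, and the response headers. This is only a diagnostic sketch: the proxy address proxy.example:8080 is hypothetical, and nothing in this thread establishes that request headers are the cause of the 404.

```python
from urllib import request

# debuglevel=1 makes http.client print the request line, the headers sent,
# and the response headers to stdout when a request is made.
debug_http = request.HTTPHandler(debuglevel=1)
debug_https = request.HTTPSHandler(debuglevel=1)

# Hypothetical proxy address; replace with the real host and port.
proxy = request.ProxyHandler({
    "http": "http://proxy.example:8080",
    "https": "http://proxy.example:8080",
})

# Handler instances passed here replace the defaults of the same class.
opener = request.build_opener(proxy, debug_http, debug_https)

# Unless overridden, urllib identifies itself as "Python-urllib/x.y",
# which is one visible difference from a browser request.
default_agent = dict(opener.addheaders)["User-agent"]
print(default_agent)

# The opener.open(...) call is left out so the sketch runs without network access.
```

Calling opener.open() on the search URL with this opener would then show the exact headers that reach the proxy, which can be compared against the browser's network inspector.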

Any idea what's going wrong? Thanks, and thanks also for the superlative plugin you've developed.

semiautomaticgit commented 3 years ago

Thank you, I'm glad that you appreciate the plugin. The issue is quite strange; I'll look into it.

semiautomaticgit commented 9 months ago

I'm closing this because of the new version 8 of SCP. Please reopen it if it is still relevant in the new version. Thank you!