spatial-data-lab / knime-geospatial-extension

This repository is built for the KNIME-CGA Geospatial Project; its goal is to build Python-based nodes for geospatial analysis in KNIME Analytics Platform.
MIT License

Check and improve proxy support #336

Open koettert opened 7 months ago

koettert commented 7 months ago

The goal of this ticket is to ensure that all geospatial nodes use the newly available proxy settings. With AP-20960, which was released with KNIME AP 5.2, the Python framework sets the configured proxy information via the HTTP_PROXY and HTTPS_PROXY environment variables (via os.environ, not os.setenv(...)). In addition, the proxy settings can be retrieved via knext.get_proxy_settings().

For debugging and testing we can use https://mitmproxy.org/ and ask Adrian for help.

Steps to do:

  1. Install and set up https://mitmproxy.org/ or any other tool to debug proxy usage
  2. Set up the proxy in KNIME AP (needs to be version 5.2 or newer) by opening Preferences->General->Network Connections and then editing the http and https entries
  3. Restart KNIME AP
  4. Execute different Geospatial nodes, especially the OSM nodes, to see whether they use the proxy settings or ignore them
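
For step 4, it may help to first confirm what the node's Python process can actually see, for example from a Python Script node. This is only a sketch using the standard library; `visible_proxies` is a hypothetical helper name and is not part of KNIME or the extension.

```python
import os
import urllib.request


def visible_proxies():
    """Return the proxy-related variables this Python process can see."""
    names = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
             "http_proxy", "https_proxy", "no_proxy")
    # Raw values exactly as they appear in os.environ.
    raw = {name: os.environ[name] for name in names if name in os.environ}
    # What urllib derives from the *_proxy variables, keyed by scheme.
    parsed = urllib.request.getproxies_environment()
    return raw, parsed


raw, parsed = visible_proxies()
print("environment variables:", raw)
print("proxies derived by urllib:", parsed)
```

If both outputs are empty after the restart, the preferences did not reach the Python process, and any node relying on the environment will bypass the proxy.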

koettert commented 6 months ago

Possible solution:

wybert commented 5 months ago

I just ran a test on a Mac M2 with KNIME 5.2.3. Here is what I found:

  1. Our extension does not use the proxies set in the KNIME preferences.
  2. The Python Script node does not use the proxies set in the KNIME preferences.
  3. To make the Python Script node and one of our extension nodes use the proxies set in the KNIME preferences, we need to add code like the following:
    import os
    os.environ['HTTP_PROXY'] = 'http://localhost:8080'
    os.environ['HTTPS_PROXY'] = 'http://localhost:8080'
  4. This works for most of the code, but not for the OSMnx package. I found an issue here, but I couldn't get it to work.
  5. Does os.setenv(...) not work while os.environ does?
  6. OSMnx also has a settings module with requests_kwargs, which can specify the proxies, but I couldn't get that to work either.
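
For testing, the workaround from item 3 could be wrapped so that the variables are only set while a request runs and are restored afterwards. A minimal sketch, assuming the proxy listens on localhost:8080 as in the test above; `proxy_env` is a hypothetical helper, not something the extension provides.

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_env(http_proxy, https_proxy=None):
    """Temporarily set HTTP_PROXY/HTTPS_PROXY, then restore the old values."""
    new_values = {
        "HTTP_PROXY": http_proxy,
        "HTTPS_PROXY": https_proxy or http_proxy,
    }
    # Remember the previous values (None means the variable was unset).
    old_values = {name: os.environ.get(name) for name in new_values}
    os.environ.update(new_values)
    try:
        yield
    finally:
        for name, old in old_values.items():
            if old is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = old


with proxy_env("http://localhost:8080"):
    # Code in this block sees the proxy variables.
    print(os.environ["HTTPS_PROXY"])
```

Restoring the old values keeps a test run from leaking its proxy settings into nodes executed later in the same Python process.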

koettert commented 5 months ago

Once os.environ was set, all nodes that use the requests package used the proxy. But the OSMnx nodes still didn't work, even though the documentation states that OSMnx uses the requests package, which can be configured via requests_kwargs.

@koettert to check with the Python team about: "Does os.setenv(...) not work but os.environ does?"

@wybert please check out the above-mentioned solutions and try to understand, from reading their code and the OSMnx code, why setting the proxy via os.environ doesn't work
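
On the os.setenv(...) question, one thing can be checked in plain Python, independent of what the KNIME framework does internally (which this thread does not establish). The standard os module has no setenv(); the low-level call is os.putenv(), and values set through it do not show up in os.environ. The variable names below are made up for the demonstration.

```python
import os

# The os module has no setenv(); the low-level call is os.putenv().
print(hasattr(os, "setenv"))

# os.putenv() changes the process environment at the C level,
# but it does not update the os.environ mapping.
os.putenv("DEMO_PROXY_PUTENV", "http://localhost:8080")
print(os.environ.get("DEMO_PROXY_PUTENV"))

# Assigning to os.environ updates the mapping (and calls putenv for us).
os.environ["DEMO_PROXY_ENVIRON"] = "http://localhost:8080"
print(os.environ.get("DEMO_PROXY_ENVIRON"))
```

Libraries that look up proxies through os.environ in the same process would therefore only notice the second variant.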

wybert commented 5 months ago

@koettert There was a mistake in my OSMnx test: OSMnx caches the results of queries it has already executed. That's why OSMnx never seemed to use the proxy during my test (it didn't actually make a request at all). After setting ox.settings.use_cache = False, OSMnx uses the proxies set either by

import os
os.environ['HTTP_PROXY'] = 'http://localhost:8080'
os.environ['HTTPS_PROXY'] = 'http://localhost:8080'

or by

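import osmnx as ox

proxies = {
    "http": "http://localhost:8080",
    "https": "http://localhost:8080",
}

# verify=False disables TLS certificate verification; use it only for local debugging.
ox.settings.requests_kwargs = {'proxies': proxies, 'verify': False}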
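
The two variants could be combined by deriving the requests_kwargs dictionary from the environment instead of hard-coding the proxy. The following is only a sketch: `build_requests_kwargs` is a hypothetical helper, and the OSMnx lines are left as comments so the block runs without OSMnx installed.

```python
import os


def build_requests_kwargs(environ=os.environ, verify=True):
    """Build a requests-style kwargs dict from HTTP(S)_PROXY variables."""
    proxies = {}
    for scheme in ("http", "https"):
        # Accept both upper-case and lower-case variable names.
        value = (environ.get(f"{scheme.upper()}_PROXY")
                 or environ.get(f"{scheme}_proxy"))
        if value:
            proxies[scheme] = value
    kwargs = {"verify": verify}
    if proxies:
        kwargs["proxies"] = proxies
    return kwargs


print(build_requests_kwargs())

# Possible use with OSMnx (settings names as used earlier in this thread):
# import osmnx as ox
# ox.settings.use_cache = False  # otherwise cached queries never hit the proxy
# ox.settings.requests_kwargs = build_requests_kwargs()
```

Unlike the snippet above, this sketch keeps certificate verification on by default; pass verify=False only when debugging against a local intercepting proxy.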