apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
37.01k stars 14.27k forks

SSH/SFTP over socks proxy - host_proxy_cmd as extra param #43636

Open mstangler opened 2 days ago

mstangler commented 2 days ago

Description

I am currently evaluating Airflow and building a proof of concept.

Please consider making "host_proxy_cmd" one of the extra parameters, with the current behavior as the fallback when it is not set.

I'm new to the Python/Airflow world, so it's possible I've missed something :) Please be gentle.

Use case/motivation

Using SFTP over a SOCKS proxy is mandatory for me. In my current tool (which does not support proxies either), I use a bash workaround: -o ProxyCommand='ncat --proxy-auth proxy_user:**** --proxy proxy_host:port %h %p'.

sftp -i 'private.key' -o BatchMode=yes -P 22 -o ProxyCommand='ncat --proxy-auth proxy_user:**** --proxy proxy_host:port %h %p' 'sftp_user@sftp_host:'

As far as I can tell from the documentation and the provider source (SFTPHook, SSHHook), there is no exposed parameter for configuring the proxy command. Instead, the hook reads the proxy command internally from the ssh config file "~/.ssh/config" (the "proxycommand" entry) and passes it to the wrapped Paramiko object via paramiko.ProxyCommand(). This approach is not convenient for me: I would rather keep all the configuration in one place for simple CI and avoid underlying OS artifacts as much as possible.
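
To make the requested behavior concrete, here is a minimal sketch of the lookup order, not the provider's actual code. The helper name `resolve_proxy_command` and its arguments are hypothetical, and `host_proxy_cmd` is only the key name proposed in this issue, not an existing Airflow option. It assumes the connection extra arrives as a JSON string and that the value from "~/.ssh/config" has already been read elsewhere.

```python
import json
from typing import Optional


def resolve_proxy_command(
    extra_json: Optional[str],
    ssh_config_proxy_cmd: Optional[str],
    remote_host: str,
    port: int,
) -> Optional[str]:
    """Hypothetical helper: prefer the connection extra, else fall back."""
    extra = json.loads(extra_json) if extra_json else {}
    cmd = extra.get("host_proxy_cmd")  # key name proposed in this issue
    if cmd:
        # The ssh client expands %h and %p itself; a value taken from the
        # extra field would need the same substitution done by hand.
        # (Simplified: other tokens such as %% or %r are not handled.)
        return cmd.replace("%h", remote_host).replace("%p", str(port))
    # Fallback: the current behavior, i.e. whatever ~/.ssh/config provided.
    return ssh_config_proxy_cmd


extra = json.dumps(
    {"host_proxy_cmd": "ncat --proxy-auth proxy_user:**** --proxy proxy_host:port %h %p"}
)
print(resolve_proxy_command(extra, None, "sftp_host", 22))
```

The resulting string is what would then be handed to paramiko.ProxyCommand(), in the same way the value from the ssh config file is handled today.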

Related issues

No response

Are you willing to submit a PR?

Code of Conduct

boring-cyborg[bot] commented 2 days ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.

potiuk commented 2 days ago

I marked it as a good first issue. Hopefully someone will pick it up and implement it, but the fastest way to get it is to contribute it yourself. Otherwise, it will have to wait for a volunteer to implement it.