Closed by anjackson 1 year ago
Using an SSHOperator to run the Docker command is likely the best bet. I've created an ingest user in the docker group on crawler06 for this purpose, but it still needs a password or certificate set up and added to Airflow as a secret.
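For reference, a rough sketch of what the SSHOperator approach could look like. This is not the actual DAG: the image name, DAG id, task id and connection id `crawler06_ingest_ssh` are all placeholders, the real Docker command and its arguments are not reproduced here, and it assumes Airflow 2.4+ with the `apache-airflow-providers-ssh` package installed. The SSH connection would hold the ingest user's key or password, so no credentials appear in the DAG itself.

```python
import shlex
from datetime import datetime

# Placeholder image name (assumption): substitute the real Nominet/HDFS image.
DOCKER_IMAGE = "example/nominet-to-hdfs:latest"


def build_docker_command(image: str) -> str:
    """Build the shell command to run on the remote host, safely quoted."""
    return shlex.join(["docker", "run", "--rm", image])


def make_dag():
    """Create a monthly DAG that runs the Docker command over SSH."""
    # Imported lazily so the helper above can be used without Airflow installed.
    from airflow import DAG
    from airflow.providers.ssh.operators.ssh import SSHOperator

    with DAG(
        dag_id="nominet_monthly_ingest",  # hypothetical name
        schedule="@monthly",
        start_date=datetime(2023, 1, 1),
        catchup=False,
    ) as dag:
        SSHOperator(
            task_id="run_nominet_docker",  # hypothetical name
            # Airflow connection for the ingest user on crawler06 (assumed id).
            ssh_conn_id="crawler06_ingest_ssh",
            command=build_docker_command(DOCKER_IMAGE),
        )
    return dag
```

In a real DAG file you would add `dag = make_dag()` at module level so the scheduler can discover it. Because the command runs on crawler06, it can reach both the outside world and the HDFS API even though Airflow itself runs elsewhere.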
Implemented and working.
This Docker command can be used to get recent Nominet data and upload to HDFS, and should be run in an Airflow DAG monthly:
But this needs to run somewhere that can SFTP to the outside world and talk to the H3 HDFS API, so it needs to run on e.g. a crawler machine rather than directly on the Docker Swarms. It's not clear how best to do it. Some options: