Open epie-godfred opened 9 months ago
I got the same error. This is the cause:
[2023-12-18T10:33:15.402+0000] {operators.py:47} INFO - 2023/12/18 10:33:12 - fake-data-generate-person-record.hpl - ERROR: Unable to run workflow workflowTest. The fake-data-generate-person-record.hpl has an error. The pipeline path config/projects/default/pipelines/fake-data-generate-person-record.hpl is invalid, and will not run successfully.
I think the option 'Export linked resources to the server' in the workflow run configuration isn't sent to the hop-server. So the hop-server looks for that pipeline file in the server folder, where of course it isn't.
Why doesn't HopWorkflowOperator have a workflow run configuration input?
Hello, I think the problem you've encountered is due to the hop server configuration. I've noticed that in the following line:
volumes:
- HOP_CLIENT_HOME_CONFIG:/opt/hop/config
You are using the path /opt/hop/config
; instead, try using the path to the base Hop directory /opt/hop,
as this plugin resolves paths relative to that directory (a quirk of how Apache Hop handles paths).
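A minimal sketch of that suggested change, assuming HOP_CLIENT_HOME_CONFIG stands for the host-side path (as in the compose file above) and a hypothetical hop-server service name:

```yaml
services:
  hop-server:
    volumes:
      # Before: only the config folder was mounted into the container
      # - HOP_CLIENT_HOME_CONFIG:/opt/hop/config
      # After: mount at the base Hop directory so relative paths resolve
      - HOP_CLIENT_HOME_CONFIG:/opt/hop
```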
This just lets me mount the config folder of the HOP_CLIENT installed on my server into the container running the HOP_SERVER, so that both always have the same files. As you rightly stated, this config folder is the one used by the plugin, as described in the article. But as my question hints, the real issue is why the sub-workflows and pipelines are always searched for under the default project directory and not under the provided project_name directory.
I've not tested running sub-workflows and pipelines with this plugin. I'm no Apache Hop expert myself, so the idea never occurred to me. Right now this is probably a limitation of the plugin, but I'll consider it for a future version. Thank you for your feedback!
Excellent discussion.
I'm here in need of help.
I went the other way.
I made some adaptations so that the plugin works correctly.
1 - In the variables file configuration I made the following modification.
Before
After
I used the full path
and my DAG looked like this:
```python
job = HopWorkflowOperator(
    dag=dag,
    task_id='tsk-job-agt-cana-hop',
    workflow='PLANEJAMENTO/JOB_AGT_CANA.hwf',
    project_name='hop_repo',
    environment='hop-repo-prd',
    log_level='Basic'
)
```
To make this work with Docker I used the structure below.
A common network between the two containers
A common volume between the two containers
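A minimal docker-compose sketch of those two points; the service, network, and volume names (airflow, hop-server, hop-net, hop-projects) and image tags are assumptions, not taken from the original setup:

```yaml
networks:
  hop-net: {}            # shared network: containers reach each other by name

volumes:
  hop-projects: {}       # shared volume: both see the same project files

services:
  airflow:
    image: apache/airflow:2.7.3
    networks: [hop-net]
    volumes:
      - hop-projects:/opt/projetos
  hop-server:
    image: apache/hop:2.6.0
    networks: [hop-net]
    volumes:
      - hop-projects:/opt/projetos
```

With both containers on the same network, the Airflow connection can also address the Hop server by its compose service name instead of a gateway IP.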
and finally, I resolved the hop-config.json issue.
before

```json
"projectConfigurations" : [
  { "projectName" : "hop_repo", "projectHome" : "/opt/projetos/hop_repo", "configFilename" : "project-config.json" },
  { "projectName" : "default", "projectHome" : "config/projects/default", "configFilename" : "project-config.json" },
  { "projectName" : "samples", "projectHome" : "config/projects/samples", "configFilename" : "project-config.json" },
  { "projectName" : "live_hop", "projectHome" : "/opt/projetos/live_hop", "configFilename" : "project-config.json" },
  { "projectName" : "UAG", "projectHome" : "/opt/projetos/UAG", "configFilename" : "project-config.json" }
],
"lifecycleEnvironments" : [
  { "name" : "hop-repo-prd", "purpose" : "Production", "projectName" : "hop_repo", "configurationFiles" : [ "/opt/projetos/hop_repo/hop-repo-prd-config.json" ] },
  { "name" : "hop-repo-dev", "purpose" : "Development", "projectName" : "hop_repo", "configurationFiles" : [ "/opt/projetos/hop_repo/hop-repo-dev-config.json" ] },
  { "name" : "hop-live-prd", "purpose" : "Production", "projectName" : "live_hop", "configurationFiles" : [ "/opt/projetos/live_hop/hop-live-prd-config.json" ] },
  { "name" : "hop-live-dev", "purpose" : "Development", "projectName" : "live_hop", "configurationFiles" : [ "/opt/projetos/live_hop/hop-live-dev-config.json" ] },
  { "name" : "UAG-PRD", "purpose" : "Production", "projectName" : "UAG", "configurationFiles" : [ "/opt/projetos/UAG/UAG-PRD-config.json" ] },
  { "name" : "dev", "purpose" : "Development", "projectName" : "live_hop", "configurationFiles" : [ "${PROJECT_HOME}/hop-live-dev-config.json" ] }
],
```
I used a Linux sed command to remove the /opt path prefix.
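For illustration, a sed substitution of that kind; the exact command and the file name (hop-config.json) are assumptions based on the description:

```shell
# Strip the leading /opt/ from every path, making them relative to the
# Hop base directory. Demonstrated on one sample line here; applying it
# in place to the real file would be: sed -i 's|/opt/||g' hop-config.json
echo '"projectHome" : "/opt/projetos/hop_repo",' | sed 's|/opt/||g'
# → "projectHome" : "projetos/hop_repo",
```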
after
```json
"projectConfigurations" : [
  { "projectName" : "hop_repo", "projectHome" : "projetos/hop_repo", "configFilename" : "project-config.json" },
  { "projectName" : "default", "projectHome" : "config/projects/default", "configFilename" : "project-config.json" },
  { "projectName" : "samples", "projectHome" : "config/projects/samples", "configFilename" : "project-config.json" },
  { "projectName" : "live_hop", "projectHome" : "projetos/live_hop", "configFilename" : "project-config.json" },
  { "projectName" : "UAG", "projectHome" : "projetos/UAG", "configFilename" : "project-config.json" }
],
"lifecycleEnvironments" : [
  { "name" : "hop-repo-prd", "purpose" : "Production", "projectName" : "hop_repo", "configurationFiles" : [ "projetos/hop_repo/hop-repo-prd-config.json" ] },
  { "name" : "hop-repo-dev", "purpose" : "Development", "projectName" : "hop_repo", "configurationFiles" : [ "projetos/hop_repo/hop-repo-dev-config.json" ] },
  { "name" : "hop-live-prd", "purpose" : "Production", "projectName" : "live_hop", "configurationFiles" : [ "projetos/live_hop/hop-live-prd-config.json" ] },
  { "name" : "hop-live-dev", "purpose" : "Development", "projectName" : "live_hop", "configurationFiles" : [ "projetos/live_hop/hop-live-dev-config.json" ] },
  { "name" : "UAG-PRD", "purpose" : "Production", "projectName" : "UAG", "configurationFiles" : [ "projetos/UAG/UAG-PRD-config.json" ] },
  { "name" : "dev", "purpose" : "Development", "projectName" : "live_hop", "configurationFiles" : [ "${PROJECT_HOME}/hop-live-dev-config.json" ] }
],
```
I have shell scripts that solve this problem.
tks.
any solution for this?
For now, you will need to follow the file structure.
Check out this section of the README: https://github.com/damavis/airflow-hop-plugin?tab=readme-ov-file#3-hop-directory-structure
Hi, we have been having some issues with the plugin when trying to run workflows created under project names other than the "default" project that ships with Apache Hop, mostly when these workflows have sub-workflows and pipelines. To illustrate the problem I used the provided example workflow and pipeline located here; below are the different configurations:
AIRFLOW
docker-compose.yml
Make sure to replace ${HOP_CLIENT_HOME} with the path to your HOP_CLIENT home.
Airflow - HOP SERVER Connection
DAGS
The DAG used is the same as the DAG provided here; I simply commented out the first two pipelines.
APACHE HOP CLIENT
Installed version 2.6 which can be found here
CONFIG FOLDER DETAILS
hop-config.json
airflow_plugin project folder contents
The contents are exactly those provided here in the repo.
HOP_SERVER
docker-compose.yml
Notice that the Apache Airflow containers' network is shared with the HOP_SERVER container. Therefore, in the Airflow-HOP_SERVER connection, the HOP_SERVER_IP should be the gateway IP of the network shared between the Airflow containers and the HOP_SERVER container. Also remember to change HOP_CLIENT_HOME_CONFIG to the path of your HOP_CLIENT home config folder.
AIRFLOW TASK LOG
dag_id=airflow_plugin_sample_dag_run_id=manual__2023-12-18T10_33_09.262213+00_00_task_id=work_test_attempt=1 (1).log
As can be seen, the parent workflow starts properly; the issue is that the HOP_SERVER looks for the sub-pipelines in the default project rather than the proper project, which in this case is airflow_plugin as specified in the DAG file. How can we resolve this issue?