apache / hop

Hop Orchestration Platform
https://hop.apache.org/
Apache License 2.0

[Bug]: Workflow doesn't even start first action #3765

Open beccon4 opened 3 months ago

beccon4 commented 3 months ago

Apache Hop version?

2.8

Java version?

openjdk 17.0.10 2024-01-16

Operating system

Linux

What happened?

The workflow hangs before it starts its first action. The log contains nothing useful for further analysis: only three entries, which say that the workflow started with the remote engine and then repeat its name.

2024/03/26 13:17:11 - /home/beccon/DWH_Import_Hop/Workflows/WF_Import_Antraege.hwf : WF_Import_Antraege - Executing this workflow using the Remote Workflow Engine with run configuration 'remote vm-dwh-ua run config'
2024/03/26 13:17:12 - /home/beccon/DWH_Import_Hop/Workflows/WF_Import_Antraege.hwf : WF_Import_Antraege - 2024/03/26 13:17:11 - WF_Import_Antraege - Start of workflow execution
2024/03/26 13:17:12 - /home/beccon/DWH_Import_Hop/Workflows/WF_Import_Antraege.hwf : WF_Import_Antraege - 2024/03/26 13:17:11 - WF_Import_Antraege - WF_Import_Antraege

There is no sign that the first action (a pipeline) even tried to start (i.e. no log file is created). The log entry "Starting action [first_pipeline.hpl]" does not appear.

How can I further analyse what goes wrong here?

Issue Priority

Priority: 1

Issue Component

Component: Hop Gui, Component: Hop Run, Component: Infrastructure, Component: Pipelines, Component: Workflows

hansva commented 3 months ago

What does it say when you increase the log level to debug?

beccon4 commented 3 months ago

I've selected the most verbose log level (Row Level, or something like that - I run the German locale, where it is called "Zeilenebene").

It looks like processing doesn't even get to the point where this can have any effect.

hansva commented 3 months ago

Is export linked resources turned on in the remote run configuration? If that is the case I think we will need a reproduction path to further investigate the issue.

beccon4 commented 3 months ago

Yes, Export Linked Resources is checked. However, it makes no difference at all whether I run the workflow remotely or locally. It just hangs without giving any clue.

I created a workflow consisting of quite a few simple pipelines (complex SQL as input, output to a destination table). It worked fine. Then I added a more complex pipeline (containing sorts and a merge) to that workflow, and the workflow started to hang. Even removing the complex pipeline again didn't help. Copying the structure to a new, empty workflow did the job: I could even add the complex pipeline again and it worked. Comparing the XML of the two hwf files doesn't show much difference, except that the order of the pipelines is somewhat scrambled.
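Because the order of the actions differs between the two files, a plain text diff is noisy. Below is a minimal sketch of an order-insensitive comparison. It assumes that an .hwf file stores each action as an `<action>` element with a `<name>` child; that layout is an assumption and should be checked against the actual files. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def actions_by_name(hwf_xml: str) -> dict:
    """Map each action name to a sorted list of (tag, text) pairs for its child elements."""
    root = ET.fromstring(hwf_xml)
    result = {}
    # Assumption: every action is an <action> element with a <name> child.
    for action in root.iter("action"):
        name = (action.findtext("name") or "").strip()
        fields = sorted(
            # Join nested text so that changes inside sub-elements are also noticed.
            (child.tag, " ".join(t.strip() for t in child.itertext() if t.strip()))
            for child in action
        )
        result[name] = fields
    return result


def diff_actions(old_xml: str, new_xml: str) -> dict:
    """Return {action name: (old fields, new fields)} for actions that differ, ignoring order."""
    old, new = actions_by_name(old_xml), actions_by_name(new_xml)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

To use it, read the broken and the working file into strings (for example with `pathlib.Path(...).read_text()`) and print the result of `diff_actions`. An action that exists in only one file shows up with `None` on the other side.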

What can be the cause?

I run Hop on a KVM virtual machine running Xfce4. The server is across a VPN at the client's premises. It works so far. It always works when I start a pipeline directly. What's wrong here?

hansva commented 3 months ago

Then I think you have found the cause: something must have gone wrong in the workflow file, causing it to not load properly. The error must be getting swallowed somewhere in the code. It would be great if you still have the broken workflow, so that we can debug it, see where it goes wrong and try to prevent it.

beccon4 commented 3 months ago

Back from vacation - sorry for the delay. Meanwhile I'm able to reproduce the issue: I configured a scheduled restart in the Start action. Once that is done, the workflow doesn't start. Unchecking the scheduled run option again does not help - the workflow remains broken. Neither restarting the entire computer nor starting the workflow and letting it sit there for days changes anything. Deleting and recreating the Start action does make a difference - the workflow works again.
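To see what the Start action still holds after the option is unchecked, its settings can be dumped straight from the .hwf file. This is only a sketch under assumptions: it assumes actions are `<action>` elements with a `<type>` child, and that the Start action uses the type id `SPECIAL`. Both should be verified against the file at hand. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def start_action_settings(hwf_xml: str, start_type: str = "SPECIAL") -> list:
    """Return one {tag: text} dict per action whose <type> equals start_type."""
    root = ET.fromstring(hwf_xml)
    found = []
    # Assumption: the Start action is an <action> whose <type> is "SPECIAL".
    for action in root.iter("action"):
        if (action.findtext("type") or "").strip() == start_type:
            found.append({child.tag: (child.text or "").strip() for child in action})
    return found
```

Running this on the broken workflow and on one with a freshly created Start action, then comparing the two dictionaries, should show which leftover schedule values differ.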

BTW: the workflow controls as a whole seem a bit flaky. I could not interrupt a running workflow properly - it kept running even though it recognised the interrupt. There is also room for improvement in the schedule configuration itself (e.g. graying out or hiding the schedule fields when scheduling is not activated), to make it more transparent what has been configured.