Netflix / conductor

Conductor is a microservices orchestration engine.
Apache License 2.0

Workflow not going forward after DO_WHILE execution completion #3880

Open appunni-old opened 10 months ago

appunni-old commented 10 months ago

Describe the bug

Details
Conductor version: 3.15.0
Persistence implementation: MySQL
Queue implementation: MySQL
Lock: Redis
Workflow definition:

{
  "createTime": 1701669713675,
  "updateTime": 1701669746737,
  "createdBy": "owner@email.com",
  "updatedBy": "owner@email.com",
  "accessPolicy": {},
  "name": "test_do_while",
  "description": "Workflow details",
  "version": 2,
  "tasks": [
    {
      "name": "default_do_while",
      "taskReferenceName": "task_1__loop_databricks",
      "inputParameters": {},
      "type": "DO_WHILE",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false,
      "loopCondition": "if ($.task_1__loop_databricks['iteration'] < 1) { true; } else { false; }",
      "loopOver": [
        {
          "name": "default_sleep",
          "taskReferenceName": "task_1__wait_databricks",
          "inputParameters": {
            "duration": "20 seconds",
            "tenantId": "csit"
          },
          "type": "WAIT",
          "startDelay": 0,
          "optional": false,
          "asyncComplete": false
        }
      ]
    },
    {
      "name": "default_sleep",
      "taskReferenceName": "task_2__wait_databricks",
      "inputParameters": {
        "duration": "20 seconds",
        "tenantId": "csit"
      },
      "type": "WAIT",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false
    }
  ],
  "inputParameters": [],
  "outputParameters": {},
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "owner@email.com",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

Task definition: System
Event handler definition:

To Reproduce
Steps to reproduce the behavior:

1. Execute the above workflow.
2. Wait 20 seconds for the WAIT task inside the loop to complete.

Expected behavior
The WAIT task outside the DO_WHILE (task_2__wait_databricks) should execute once the loop completes. Instead, the workflow does not move forward.


Additional context
Testing was done after merging #3878.

appunni-old commented 10 months ago

This may be related to #3876. Again, the double underscore in the task reference names may be causing the issue.

appunni-old commented 10 months ago
{
  "createTime": 1701670338519,
  "updateTime": 1701669746737,
  "createdBy": "owner@email.com",
  "updatedBy": "owner@email.com",
  "accessPolicy": {},
  "name": "test_do_while_2",
  "description": "Workflow details",
  "version": 1,
  "tasks": [
    {
      "name": "default_do_while",
      "taskReferenceName": "task_1_loop_databricks",
      "inputParameters": {},
      "type": "DO_WHILE",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false,
      "loopCondition": "if ($.task_1_loop_databricks['iteration'] < 1) { true; } else { false; }",
      "loopOver": [
        {
          "name": "default_sleep",
          "taskReferenceName": "task_1_wait_databricks",
          "inputParameters": {
            "duration": "20 seconds",
            "tenantId": "csit"
          },
          "type": "WAIT",
          "startDelay": 0,
          "optional": false,
          "asyncComplete": false
        }
      ]
    },
    {
      "name": "default_sleep",
      "taskReferenceName": "task_2_wait_databricks",
      "inputParameters": {
        "duration": "20 seconds",
        "tenantId": "csit"
      },
      "type": "WAIT",
      "startDelay": 0,
      "optional": false,
      "asyncComplete": false
    }
  ],
  "inputParameters": [],
  "outputParameters": {},
  "schemaVersion": 2,
  "restartable": true,
  "workflowStatusListenerEnabled": false,
  "ownerEmail": "owner@email.com",
  "timeoutPolicy": "ALERT_ONLY",
  "timeoutSeconds": 0,
  "variables": {},
  "inputTemplate": {}
}

Using this definition, which has single underscores in the task reference names, worked.

appunni-old commented 10 months ago
List<TaskModel> loopOverTaskList =
        workflow.getTasks().stream()
                .filter(
                        t ->
                                TaskUtils.removeIterationFromTaskRefName(
                                                t.getReferenceTaskName())
                                        .equals(taskRefName))
                .collect(Collectors.toList());

This could be the bug: the code uses removeIterationFromTaskRefName to check whether a task belongs to the loop, but that method's implementation is too fragile to handle reference names that already contain a double underscore.

SimpleActionProcessor:132
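
To make the suspicion above concrete, here is a minimal sketch. It assumes the helper strips the iteration suffix by splitting the reference name on "__" and keeping only the first token; that behaviour is an assumption for illustration and has not been checked against the 3.15.0 source here. The class name, stripFirstToken, stripTrailingIteration, and both match helpers are hypothetical and not part of Conductor.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch, not Conductor code.
class IterationSuffixSketch {
    // Assumed delimiter between a loop task's reference name and its iteration number.
    static final String LOOP_TASK_DELIMITER = "__";

    // Assumed fragile behaviour: split on the delimiter and keep the first token.
    static String stripFirstToken(String ref) {
        String[] tokens = ref.split(LOOP_TASK_DELIMITER);
        return tokens.length > 0 ? tokens[0] : ref;
    }

    // Hypothetical stricter variant: remove only a trailing "__<digits>" suffix.
    static String stripTrailingIteration(String ref) {
        return ref.replaceFirst("__\\d+$", "");
    }

    // Same shape as the filter quoted above, applied to plain reference names.
    static List<String> matchFragile(List<String> scheduledRefs, String taskRefName) {
        return scheduledRefs.stream()
                .filter(r -> stripFirstToken(r).equals(taskRefName))
                .collect(Collectors.toList());
    }

    static List<String> matchStrict(List<String> scheduledRefs, String taskRefName) {
        return scheduledRefs.stream()
                .filter(r -> stripTrailingIteration(r).equals(taskRefName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Reference name from the failing workflow, with an iteration suffix appended.
        List<String> doubleUnderscore = List.of("task_1__wait_databricks__1");
        // Reference name from the working workflow, with the same suffix.
        List<String> singleUnderscore = List.of("task_1_wait_databricks__1");

        // Fragile stripping truncates to "task_1", so nothing matches: prints []
        System.out.println(matchFragile(doubleUnderscore, "task_1__wait_databricks"));
        // Single underscores survive the split: prints [task_1_wait_databricks__1]
        System.out.println(matchFragile(singleUnderscore, "task_1_wait_databricks"));
        // Stricter stripping matches the double-underscore name too:
        // prints [task_1__wait_databricks__1]
        System.out.println(matchStrict(doubleUnderscore, "task_1__wait_databricks"));
    }
}
```

If the real helper behaves like stripFirstToken, the filter would find no loop-over tasks for task_1__wait_databricks, which would be consistent with the workflow stalling only when reference names contain a double underscore.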

gajendrangnanasekaran commented 9 months ago

Hi, can you help me set up Conductor with Postgres?

appunni-old commented 9 months ago

Hi @gajendrangnanasekaran, you can use docker-compose to set it up. Clone this repo and run:

docker-compose -f docker/docker-compose-postgres.yaml up -d

To kill the containers, run:

docker-compose -f docker/docker-compose-postgres.yaml kill

If you want a local setup for development, create a new docker-compose file based on this one with the conductor:server service removed, and run the new file with a similar command.

Then start the local server by running the command below:

CONDUCTOR_CONFIG_FILE=docker/server/config/config-postgres-modified.properties ./gradlew bootRun

config-postgres-modified.properties must point to localhost:6432 for Postgres and localhost:9201 for Elasticsearch.

If you want to run the local server in IntelliJ, here is what the run configuration looks like. The second line below the JDK selection is the VM options field.

Screenshot 2023-12-05 at 9 55 10 PM

For MySQL support, I added the following to the dependencies in server/build.gradle:

// community
runtimeOnly("com.netflix.conductor:conductor-mysql-persistence:${revConductor}")

That's all.