uber-go / cadence-client

Framework for authoring workflows and activities running on top of the Cadence orchestration engine.
https://cadenceworkflow.io
MIT License

internal_worker.go:216 unable to verify if domain exist #1307

Open buddhika-ranasinghe opened 6 months ago

buddhika-ranasinghe commented 6 months ago

Describe the bug: I get the error below when running the Hello World code shared in the blog post.

2023-12-27T07:18:13.343+0530    WARN    internal/internal_worker.go:216 unable to verify if domain exist    {"Domain": "test-domain", "TaskList": "test-worker", "WorkerID": "33222@Buddhikas-MacBook-Pro.local@test-worker@23e7e18d-1006-48d6-8e83-143785df851d", "WorkerType": "DecisionWorker", "domain": "test-domain", "error": "code:unavailable message:connection closed"}
go.uber.org/cadence/internal.verifyDomainExist.func1
    /Users/branasinghe/go/pkg/mod/go.uber.org/cadence@v1.2.7/internal/internal_worker.go:216
go.uber.org/cadence/internal/common/backoff.Retry
    /Users/branasinghe/go/pkg/mod/go.uber.org/cadence@v1.2.7/internal/common/backoff/retry.go:101
go.uber.org/cadence/internal.verifyDomainExist
    /Users/branasinghe/go/pkg/mod/go.uber.org/cadence@v1.2.7/internal/internal_worker.go:226
go.uber.org/cadence/internal.(*workflowWorker).Start
    /Users/branasinghe/go/pkg/mod/go.uber.org/cadence@v1.2.7/internal/internal_worker.go:333
go.uber.org/cadence/internal.(*aggregatedWorker).Start
    /Users/branasinghe/go/pkg/mod/go.uber.org/cadence@v1.2.7/internal/internal_worker.go:806
main.startWorker
    /Users/branasinghe/Documents/personal/others/Projects/Cadence/hello-world/main.go:80
main.main
    /Users/branasinghe/Documents/personal/others/Projects/Cadence/hello-world/main.go:29
runtime.main
    /usr/local/go/src/runtime/proc.go:267

To Reproduce: Run the code below.

package main

import (
    "context"
    "net/http"
    "time"

    "go.uber.org/cadence/.gen/go/cadence/workflowserviceclient"
    "go.uber.org/cadence/activity"
    "go.uber.org/cadence/compatibility"
    "go.uber.org/cadence/worker"
    "go.uber.org/cadence/workflow"

    "github.com/uber-go/tally"
    apiv1 "github.com/uber/cadence-idl/go/proto/api/v1"
    "go.uber.org/yarpc"
    "go.uber.org/yarpc/transport/grpc"
    "go.uber.org/zap"
    "go.uber.org/zap/zapcore"
)

var HostPort = "192.168.1.18:7933"
var Domain = "test-domain"
var TaskListName = "test-worker"
var ClientName = "test-worker"
var CadenceService = "cadence-frontend"

func main() {
    startWorker(buildLogger(), buildCadenceClient())
    http.ListenAndServe(":8080", nil)
}

func buildLogger() *zap.Logger {
    config := zap.NewDevelopmentConfig()
    config.Level.SetLevel(zapcore.InfoLevel)

    logger, err := config.Build()
    if err != nil {
        panic("Failed to setup logger")
    }

    return logger
}

func buildCadenceClient() workflowserviceclient.Interface {
    dispatcher := yarpc.NewDispatcher(yarpc.Config{
        Name: ClientName,
        Outbounds: yarpc.Outbounds{
            CadenceService: {Unary: grpc.NewTransport().NewSingleOutbound(HostPort)},
        },
    })
    if err := dispatcher.Start(); err != nil {
        panic("Failed to start dispatcher")
    }

    clientConfig := dispatcher.ClientConfig(CadenceService)

    return compatibility.NewThrift2ProtoAdapter(
        apiv1.NewDomainAPIYARPCClient(clientConfig),
        apiv1.NewWorkflowAPIYARPCClient(clientConfig),
        apiv1.NewWorkerAPIYARPCClient(clientConfig),
        apiv1.NewVisibilityAPIYARPCClient(clientConfig),
    )
}

func startWorker(logger *zap.Logger, service workflowserviceclient.Interface) {
    // TaskListName identifies set of client workflows, activities, and workers.
    // It could be your group or client or application name.
    workerOptions := worker.Options{
        Logger:       logger,
        MetricsScope: tally.NewTestScope(TaskListName, map[string]string{}),
    }

    worker := worker.New(
        service,
        Domain,
        TaskListName,
        workerOptions)
    err := worker.Start()
    if err != nil {
        panic("Failed to start worker")
    }

    logger.Info("Started Worker.", zap.String("worker", TaskListName))
}

func helloWorldWorkflow(ctx workflow.Context, name string) error {
    ao := workflow.ActivityOptions{
        ScheduleToStartTimeout: time.Minute,
        StartToCloseTimeout:    time.Minute,
        HeartbeatTimeout:       time.Second * 20,
    }
    ctx = workflow.WithActivityOptions(ctx, ao)

    logger := workflow.GetLogger(ctx)
    logger.Info("helloworld workflow started")
    var helloworldResult string
    err := workflow.ExecuteActivity(ctx, helloWorldActivity, name).Get(ctx, &helloworldResult)
    if err != nil {
        logger.Error("Activity failed.", zap.Error(err))
        return err
    }

    logger.Info("Workflow completed.", zap.String("Result", helloworldResult))

    return nil
}

func helloWorldActivity(ctx context.Context, name string) (string, error) {
    logger := activity.GetLogger(ctx)
    logger.Info("helloworld activity started")
    return "Hello " + name + "!", nil
}

func init() {
    workflow.Register(helloWorldWorkflow)
    activity.Register(helloWorldActivity)
}

Expected behavior: The worker starts and logs: Started Worker. {"worker": "test-worker"}


Additional context: I am not sure whether this happens because the worker runs on a different node from my Cadence backend. However, I can verify that the domain exists with the command below:

% ../tools/cadence/cadence --address 192.168.1.18:7933 --domain test-domain domain describe
Name: test-domain
UUID: 48dd5446-042e-40ca-9b13-910ad44c23ef
Description: 
OwnerEmail: 
DomainData: map[]
Status: REGISTERED
RetentionInDays: 1
EmitMetrics: true
IsGlobal(XDC)Domain: true
ActiveClusterName: cluster0
Clusters: [cluster0]
HistoryArchivalStatus: DISABLED
VisibilityArchivalStatus: DISABLED

mantas-sidlauskas commented 6 months ago

TL;DR: Replace 192.168.1.18:7933 with 192.168.1.18:7833 in your code.

Your code uses the thrift2proto adapter, a helper for communicating with the Cadence server over proto/gRPC, but the destination port (7933) is the default TChannel port in development.yaml. The default gRPC port is 7833.

The Cadence CLI uses the TChannel transport by default, and the address you passed matches the TChannel port in your configuration, which is why the CLI command succeeds.

❯ ./cadence -h
<...>
   --transport value, -t value              optional argument for transport protocol format, either 'grpc' or 'tchannel'. Defaults to tchannel if not provided [$CADENCE_CLI_TRANSPORT_PROTOCOL]
<...>
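
To make the port mix-up described above harder to repeat, a small guard can be applied to the address before it is handed to the gRPC outbound. The following is only a sketch: grpcHostPort is a hypothetical helper and not part of the Cadence client, and the two port constants assume the development.yaml defaults mentioned in this comment (7933 for TChannel, 7833 for gRPC). A deployment with custom ports would need different values.

```go
package main

import (
	"fmt"
	"net"
)

// Assumed defaults, taken from the comment above (development.yaml).
// Adjust them if your Cadence frontend is configured differently.
const (
	defaultTChannelPort = "7933"
	defaultGRPCPort     = "7833"
)

// grpcHostPort is a hypothetical helper, not a Cadence API. It returns
// hostPort unchanged unless its port is the default TChannel port, in
// which case it substitutes the default gRPC port.
func grpcHostPort(hostPort string) (string, error) {
	host, port, err := net.SplitHostPort(hostPort)
	if err != nil {
		return "", fmt.Errorf("invalid host:port %q: %w", hostPort, err)
	}
	if port == defaultTChannelPort {
		port = defaultGRPCPort
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	addr, err := grpcHostPort("192.168.1.18:7933")
	if err != nil {
		panic(err)
	}
	fmt.Println(addr) // prints 192.168.1.18:7833
}
```

In the program from the report, the result would be passed to grpc.NewTransport().NewSingleOutbound in place of HostPort. Silently rewriting the port is a convenience for local development; returning an error when the TChannel port is detected may be the safer choice elsewhere.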