argoproj / argo-events

Event-driven Automation Framework for Kubernetes
https://argoproj.github.io/argo-events/
Apache License 2.0

EventHub event-source fails while gathering the hub runtime information #874

Open rgbusato opened 4 years ago

rgbusato commented 4 years ago

Describe the bug

I am trying to use the Azure EventHub event-source but cannot get it to work. I've used some of the examples in the repo and have tried many different configurations on the Azure side, with no luck. Based on my initial debugging, the connection does seem to be established, but the event-source fails in the "gathering the hub runtime information..." step. I say that because the open-connections metric in Azure shows a new connection that matches the time window from the event-source logs. So I don't think the problem is with authentication itself, which appears to succeed.

To Reproduce

Steps to reproduce the behavior:

  1. Create an Azure Event Hub
  2. Gather SAS connection string information
  3. Using the EventHub connection string, create two Kubernetes secrets: shared-access-key-name-secret and shared-access-key-secret.
  4. Create an EventSource using the example in the argo-events repo, referencing the Kubernetes secrets created in the previous step:
    apiVersion: argoproj.io/v1alpha1
    kind: EventSource
    metadata:
      name: azure-events-hub
    spec:
      azureEventsHub:
        example:
          # FQDN of the EventsHub namespace you created
          # More info at https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-get-connection-string
          fqdn: my-eventhub-namespace.servicebus.windows.net
          sharedAccessKeyName:
            name: shared-access-key-name-secret
            key: value
          sharedAccessKey:
            name: shared-access-key-secret
            key: value
          # Event Hub path/name
          hubName: my-eventhub-01
  5. Observe the error in the event-source logs:
    {"eventName":"example","eventSourceName":"azure-events-hub","eventSourceType":"azureEventsHub","level":"info","msg":"gathering the hub runtime information...","time":"2020-09-03 20:18:29"}
    {"error":"failed to get the hub runtime information for example: server error link c669c591-026f-4ca6-b23d-45525508bfa0: status code 500 and description: The service was unable to process the request; please retry the operation. For more information on exception types and proper exception handling, please refer to http://go.microsoft.com/fwlink/?LinkId=761101","eventName":"example","eventSourceName":"azure-events-hub","level":"error","msg":"failed to start service.","time":"2020-09-03 20:18:32"}
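
Steps 2 to 4 above depend on pulling the right pieces out of the SAS connection string. As a rough sketch only: the snippet below assumes the usual `Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=...;EntityPath=...` layout (`EntityPath` is only present when the string was copied from a hub-level policy). The function names are made up for illustration and are not part of argo-events or any Azure SDK.

```python
from urllib.parse import urlparse


def parse_connection_string(conn: str) -> dict:
    """Split an Event Hubs SAS connection string into its key/value fields."""
    fields = {}
    for part in conn.strip().split(";"):
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: the key is base64 and may end in '=' padding.
        name, _, value = part.partition("=")
        fields[name] = value
    return fields


def event_source_values(conn: str) -> dict:
    """Map connection-string fields to the values the EventSource and secrets need."""
    fields = parse_connection_string(conn)
    return {
        # "sb://my-ns.servicebus.windows.net/" -> "my-ns.servicebus.windows.net"
        "fqdn": urlparse(fields["Endpoint"]).hostname,
        "sharedAccessKeyName": fields["SharedAccessKeyName"],
        "sharedAccessKey": fields["SharedAccessKey"],
        # Missing when the string comes from a namespace-level policy.
        "hubName": fields.get("EntityPath"),
    }
```

Splitting on the first `=` only is the detail that is easy to get wrong: a naive `split("=")` would truncate a key that ends in padding characters, and the resulting secret would fail authentication.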

Expected behavior

I would expect the EventHub event-source to initialize without errors and start listening for new events. When the event-source detects a new event, it should place it on the eventbus for further processing.

Environment (please complete the following information):

Additional context

I've tested Argo Events with the webhook and kafka event-sources successfully and was able to listen for events and trigger Argo workflows as expected. Based on my initial debugging, the issue seems to come from the EventHub Go SDK that argo-events uses. I've noticed that we are not version-locking the Go SDK for EventHub, so I wonder whether a change in that library broke this argo-events functionality.


Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

rgbusato commented 4 years ago

@VaibhavPage do you think this is an Azure issue? I can help create a support ticket with them if you think that is the problem. If you think it's with the EventHub library, I can also help create an issue there to get this moving.

rgbusato commented 4 years ago

I've noticed that someone recently asked to reopen a related issue https://github.com/argoproj/argo-events/issues/710

rgbusato commented 4 years ago

Is there any way I can turn on additional debug logs to get more details on the exact issue, to see whether it's an argo-events or an Azure problem?

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. It will be closed if no further activity occurs. Thank you for your contributions.

rgbusato commented 3 years ago

Is this fixed now?

whynowy commented 3 years ago

I will set up an Azure Event Hub and look into this.

whynowy commented 3 years ago

@rgbusato - I set up an Event Hub and tested with v1.3.0, it worked successfully. Would you like to try it again?

sky29 commented 3 years ago

I am using argo-events v1.3.0 and trying to integrate azure-event-hub.

However, my eventsource pod is crashing with the error below: "error":"open /tmp/graft.log: no such file or directory"

Logs:

{"level":"info","ts":1619095607.6588032,"logger":"argo-events.eventsource","caller":"cmd/main.go:63","msg":"starting eventsource server","eventSourceName":"azure-events-hub","version":"vv1.3.0+7591146.dirty"}
{"level":"info","ts":1619095607.6589344,"logger":"argo-events.eventsource","caller":"metrics/metrics.go:172","msg":"starting metrics server","eventSourceName":"azure-events-hub"}
{"level":"fatal","ts":1619095607.6648698,"logger":"argo-events.eventsource","caller":"leaderelection/leaderelection.go:111","msg":"failed to new a node","eventSourceName":"azure-events-hub","error":"open /tmp/graft.log: no such file or directory","stacktrace":"github.com/argoproj/argo-events/common/leaderelection.(natsEventBusElector).RunOrDie\n\t/home/runner/work/argo-events/argo-events/common/leaderelection/leaderelection.go:111\ngithub.com/argoproj/argo-events/eventsources.(EventSourceAdaptor).Start\n\t/home/runner/work/argo-events/argo-events/eventsources/eventing.go:293\nmain.main\n\t/home/runner/work/argo-events/argo-events/eventsources/cmd/main.go:65\nruntime.main\n\t/opt/hostedtoolcache/go/1.14.15/x64/src/runtime/proc.go:203"}

It is failing at this stage: https://github.com/argoproj/argo-events/blob/master/common/leaderelection/leaderelection.go

Any idea what is causing this error?

whynowy commented 3 years ago

@sky29 - did you install v1.3.0 by upgrading, or did you start fresh?

Can you check if the EventSource Pod has an emptyDir volume mounted to /tmp?

sky29 commented 3 years ago

@whynowy

I upgraded argo-events from v1.2.2 to v1.3.0. The EventSource Pod did not have an emptyDir volume mounted to /tmp, so I manually edited (patched) the EventSource Kubernetes Deployment.

Finally I am able to run it now.

Here are a few things I did:

=========

The last step is still manual for me, so I am not sure whether this is an upgrade issue or whether a fix needs to be committed (to add an emptyDir at /tmp).
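
The exact edit applied to the Deployment is not shown above, so the following is only a hedged sketch of what adding an emptyDir at /tmp could look like. It works on a Deployment manifest loaded as a plain dict; the volume name `tmp` and the function name are assumptions for illustration, not something argo-events defines.

```python
import copy


def add_tmp_emptydir(deployment: dict, volume_name: str = "tmp") -> dict:
    """Return a copy of a Deployment manifest with an emptyDir mounted at /tmp."""
    patched = copy.deepcopy(deployment)  # leave the caller's manifest untouched
    pod_spec = patched["spec"]["template"]["spec"]

    # Declare the volume once at the pod level.
    volumes = pod_spec.setdefault("volumes", [])
    if not any(v.get("name") == volume_name for v in volumes):
        volumes.append({"name": volume_name, "emptyDir": {}})

    # Mount it at /tmp in every container that has nothing mounted there yet.
    for container in pod_spec.get("containers", []):
        mounts = container.setdefault("volumeMounts", [])
        if not any(m.get("mountPath") == "/tmp" for m in mounts):
            mounts.append({"name": volume_name, "mountPath": "/tmp"})
    return patched
```

A manual patch like this is likely to be overwritten if the controller regenerates the Deployment, which may be why the step has to be repeated by hand.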

sky29 commented 3 years ago

@whynowy

On a similar note, I have a question.

This is what I am trying to do:

Goal: uploading a file to azure-blob-storage should trigger an argo-workflow.

My current solution looks like this: azure-blob-storage (file upload) --> azure-blob-storage event trigger (to azure-event-hub) --> azure-event-hub --> argo-event (with azure-event-hub EventSource) --> argo-workflow

My question is: Argo Events supports a "Webhook EventSource", and azure-blob-storage can also trigger "Webhook Events".

So can we shorten the path here (through the CloudEvents schema or the EventGrid schema), like this: azure-blob-storage (file upload) --> azure-blob-storage event trigger (webhook trigger) --> argo-event (with webhook EventSource) --> argo-workflow

So basically I don't want to use "azure-event-hub" in the middle.

I tried to run this, but it threw a "Handshake Validation error". Microsoft documents a similar scenario with an Azure Function, where the function has to respond with the shared validationCode: https://docs.microsoft.com/en-us/azure/event-grid/receive-events

But I am not sure whether that will work with Argo Events (with the Webhook EventSource). Any idea?

whynowy commented 3 years ago

If the azure-blob-storage event trigger (webhook trigger) does not require any specific response beyond 200 OK, then the webhook event source should work.
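
For context on the handshake mentioned earlier in the thread: with the Event Grid schema, the endpoint first receives a `Microsoft.EventGrid.SubscriptionValidationEvent` and is expected to echo `data.validationCode` back in a `validationResponse` field of the response body, so a bare 200 OK would not complete that step. A minimal sketch of that logic, independent of argo-events and with a made-up function name, might look like this:

```python
VALIDATION_EVENT_TYPE = "Microsoft.EventGrid.SubscriptionValidationEvent"


def validation_response(events: list):
    """Return the handshake reply body for an Event Grid batch, or None.

    `events` is the parsed JSON array posted to the webhook endpoint.
    """
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT_TYPE:
            # Echo the code back; Event Grid compares it to what it sent.
            return {"validationResponse": event["data"]["validationCode"]}
    return None  # ordinary events: no special body needed
```

Whether the receiving endpoint can return such a body is the open question here; if it cannot, a small relay in front of the webhook that answers the handshake and forwards everything else would be one possible workaround.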