aristanetworks / goarista

Fairly general building blocks used in Arista Go code and open-sourced for the benefit of all.
Apache License 2.0
213 stars 68 forks source link

cmd/ockafka doesn't allow to subscribe to specific keys #23

Open tamihiro opened 6 years ago

tamihiro commented 6 years ago

I tried it out, but kafka producer didn't publish a message. Without looking deep into the cause of the glitch, I thought I'd better use newish gnmi code like those for other storage (ocredis, ocsplunk, etc.). So I did, and it looks working. Please review my PR.

7AC commented 6 years ago

Can you elaborate on what doesn't work? We haven't dropped the older protobuf support in OpenConfig or TerminAttr yet hence we haven't updated all the tools yet.

tamihiro commented 6 years ago

Apologies for being unclear, and also saying

kafka producer didn't publish a message.

was misleading. With the current ockafka code in your master branch, Arista switch (vEOS 4.20.1F) establishes connection with the kafka producer, but doesn't send messages back to it at all.

$ ockafka -addrs 192.168.12.1 -kafkaaddrs 192.168.11.13:32768,192.168.11.13:32769 -username admin -password xyz -subscribe /interfaces/interface[name=Ethernet1]/state/counters,/interfaces/interface[name=Ethernet1]/config
I0928 12:47:12.907581   10559 main.go:34] Connected to Kafka brokers at [192.168.11.13:32768 192.168.11.13:32769]
I0928 12:47:12.907640   10559 main.go:56] Initialized Kafka producer for 192.168.12.1
I0928 12:47:12.909059   10559 client.go:46] Connected to 192.168.12.1:6030
I0928 12:47:12.909180   10559 client.go:107] Sending subscribe request: subscribe:<subscription:<path:<element:"" element:"interfaces" element:"interface[name=Ethernet1]" element:"state" element:"counters" > > > 
I0928 12:47:12.909472   10559 client.go:107] Sending subscribe request: subscribe:<subscription:<path:<element:"" element:"interfaces" element:"interface[name=Ethernet1]" element:"config" > > > 

On the switch (192.168.12.1) I issue commands to transmit packets from Et1 or shut down the interface,

vEOS1#show proc | inc OpenConfig
10879  0.1  2.1 ?        Sl   12:14:05 00:00:04 /usr/bin/OpenConfig -grpcaddr=192.168.12.1:6030
15559  0.0  0.0 ?        Ss   12:56:36 00:00:00 grep -aEe OpenConfig
vEOS1#show log
...
Sep 28 12:49:36 vEOS1 ConfigAgent: %SYS-5-CONFIG_E: Enter configuration mode from console by admin on vty3 (192.168.12.249)
Sep 28 12:49:39 vEOS1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet1, changed state to down
Sep 28 12:49:51 vEOS1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet1, changed state to up
Sep 28 12:49:52 vEOS1 ConfigAgent: %SYS-5-CONFIG_I: Configured from console by admin on vty3 (192.168.12.249)

but the switch sends no telemetry messages from port 6030 to the kafka producer, hence no kafka messages from producer to brokers.

With my revision, the switch throws telemetry messages from port 6030 to the producer, which in turn sends kafka messages to brokers.

I0928 14:13:02.856462    6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582845,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/out-octets":"6360047"}}
I0928 14:13:02.856498    6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 52 53 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 111 117 116 45 111 99 116 101 116 115 34 58 34 54 51 54 48 48 52 55 34 125 125] [] {2018-09-28 14:13:02.845216743 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.856587    6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582845,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/out-unicast-pkts":"29168"}}
I0928 14:13:02.856597    6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 52 53 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 111 117 116 45 117 110 105 99 97 115 116 45 112 107 116 115 34 58 34 50 57 49 54 56 34 125 125] [] {2018-09-28 14:13:02.845235652 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.867572    6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582856,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/in-unicast-pkts":"29169"}}
I0928 14:13:02.867589    6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 53 54 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 105 110 45 117 110 105 99 97 115 116 45 112 107 116 115 34 58 34 50 57 49 54 57 34 125 125] [] {2018-09-28 14:13:02.856504306 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.867656    6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582856,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/in-octets":"6360848"}}
I0928 14:13:02.867670    6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 53 54 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 105 110 45 111 99 116 101 116 115 34 58 34 54 51 54 48 56 52 56 34 125 125] [] {2018-09-28 14:13:02.856522943 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}

WRT your comment:

We haven't dropped the older protobuf support in OpenConfig or TerminAttr yet hence we haven't updated all the tools yet.

But you guys not only deprecated occli in favor of gnmi a year ago, but have also updated other commands such as ocprometheus, ocredis, and ocsplunk just recently. Why not updating ockafka as well?

7AC commented 6 years ago

With the current ockafka code in your master branch, Arista switch (vEOS 4.20.1F) establishes connection with the kafka producer, but doesn't send messages back to it at all.

I think the issue is not between ockafka and Kafka, but between ockafka and OpenConfig. It looks like you're hitting a bug that affects the server for the old protobuf, the workaround should be to pop up one level and subscribe to all keys (i.e., not specify a key). Can you try that?

But you guys not only deprecated occli in favor of gnmi a year ago, but have also updated other commands such as ocprometheus, ocredis, and ocsplunk just recently. Why not updating ockafka as well?

We're planning to, just didn't get to it yet. Thank you for your contribution for it, I left you 2 comments on the PR.

tamihiro commented 6 years ago

Yes I meant to say the issue was between ockafka and OpenConfig, and your observation is correct. I tried the following and it worked.

ockafka -addrs 192.168.12.1 -kafkaaddrs 192.168.11.13:32768,192.168.11.13:32769 -username admin -password xyz -subscribe /interfaces

We're planning to, just didn't get to it yet. Thank you for your contribution for it, I left you 2 comments on the PR.

As for your two comments... I saw you had already commited a77bf0c, fb622b9, and ff33da2, with these changes they have all dropped the ability to connect multiple servers. So in my view, having a working gNMI collector for kafka, which is free from the aforementioned protobuf bug, is one valid step, and regaining multiple server connection capability for all telemetry destination adapters by making changes to gnmi client code should be the next step.

I'd appreciate your opinion.

7AC commented 6 years ago

Hi @tamihiro, thank you for doing the migration. Our recommendation is usually to run these tools directly on the switch so we're probably ok losing the -addrs option. We don't have the infrastructure to accept external PRs yet, so I'll port your change internally once it's passing and point to this. Thanks again!

tamihiro commented 6 years ago

@7AC thanks so much for the review! It doesn't matter at all that my PR is not directly accepted, I appreciate you working on it internally.

Gelob commented 6 years ago

I run octsdb on a linux server to get streaming data from the switch running TerminAttr so that I can pass it back directly to InfluxDB via their opentsdb input plugin. Since https://github.com/aristanetworks/goarista/commit/fb622b9b46608fdb39d36447f4d8ef52fe37fc3d#diff-0a86d997d2c66282031326a80d166c4e I can't run 1 single octsdb instance against multiple switches.

Example: -addrs switch1:6042,switch2:6042. -addr doesn't seem to support multiple comma separated values. Which means I need to run a separate octsdb instance for each switch

7AC commented 6 years ago

Hi @Gelob, have you considered running the tool on the device itself? That's usually what we recommend. If you really want multiple addresses feel free to open a separate issue, I will use this one to track migrating ockafka to gNMI.

wladh commented 6 years ago

@Gelob please open another issue, so we can track demand for that, and then we can consider bringing that feature back.

tamihiro commented 6 years ago

Hi @7AC I figured that you guys decided to work on my PR internally a while back. May I ask how it's progressing? Please let me know if you want me to do anything on my side again.

7AC commented 6 years ago

@tamihiro sorry for the delay. The fastest route is definitely you taking addressing the comments on the review and getting CI to pass, then we can merge internally. Second fastest route is if you reach out to your SE and ask for alternatives. We will make the switch to gNMI but I can't guarantee it will be prioritized until we remove support for the old protobuf.

tamihiro commented 6 years ago

@7AC as per your suggestion I have updated producer_test.go and done some more fixes to get travis CI to pass. I don't get why you changed the title of this issue but anyway, please let me know if there's still anything else I need to do before it gets merged.

7AC commented 6 years ago

Thank you for the update, I left you another comment. I renamed the issue because "doesn't work as is" is too broad IMHO, it just doesn't work in the case where you want to retrieve a specific key as opposed to the entire collection.

tamihiro commented 6 years ago

Is your another comment this?

I think we can keep the interface generic the way it was intended, so we don't couple ourselves with the types of messages we're handling.

I'm afraid I don't understand.. could you please elaborate?

I now understand the change in the title of the issue, thanks. As a network operator, I'm seeing the current code, due to the bug you previously mentioned that doesn't allow specifying keys for which the user only needs the values, is IMHO not really usable.

7AC commented 6 years ago

Is your another comment [this]

Sorry I didn't realize I didn't submit the comment, done now it was this one I was referring to:

https://github.com/aristanetworks/goarista/pull/22#discussion_r234832013

I think we can keep the interface generic the way it was intended, so we don't couple ourselves with the types of messages we're handling.

I'm afraid I don't understand.. could you please elaborate?

The interface purposely has proto.Message as opposed to any specific implementation of it (e.g., gnmi) so it doesn't depend on the transport. I'd like to keep it that way.

I now understand the change in the title of the issue, thanks. As a network operator, I'm seeing the current code, due to the bug you previously mentioned that doesn't allow specifying keys for which the user only needs the values, is IMHO not really usable.

I agree there's an issue, just not as broad as "the tool doesn't work". There's a specific feature that's broken and there is a workaround (https://github.com/aristanetworks/goarista/issues/23#issuecomment-425542300), I'm afraid other users could read "doesn't work as is" and think it doesn't start or doesn't get any data.

tamihiro commented 6 years ago

@7AC I'm a little puzzled with your last comment, and admittedly I'm not even rermotely an expert on any of these protocols... pls check my response, and if you could tell me what I'm missing I'd appreciate it.