tamihiro opened this issue 6 years ago
Can you elaborate on what doesn't work? We haven't dropped the older protobuf support in OpenConfig or TerminAttr yet, hence we haven't updated all the tools yet.
Apologies for being unclear. Also, my saying
kafka producer didn't publish a message.
was misleading. With the current ockafka code in your master branch, the Arista switch (vEOS 4.20.1F) establishes a connection with the kafka producer, but doesn't send any messages back to it at all.
$ ockafka -addrs 192.168.12.1 -kafkaaddrs 192.168.11.13:32768,192.168.11.13:32769 -username admin -password xyz -subscribe /interfaces/interface[name=Ethernet1]/state/counters,/interfaces/interface[name=Ethernet1]/config
I0928 12:47:12.907581 10559 main.go:34] Connected to Kafka brokers at [192.168.11.13:32768 192.168.11.13:32769]
I0928 12:47:12.907640 10559 main.go:56] Initialized Kafka producer for 192.168.12.1
I0928 12:47:12.909059 10559 client.go:46] Connected to 192.168.12.1:6030
I0928 12:47:12.909180 10559 client.go:107] Sending subscribe request: subscribe:<subscription:<path:<element:"" element:"interfaces" element:"interface[name=Ethernet1]" element:"state" element:"counters" > > >
I0928 12:47:12.909472 10559 client.go:107] Sending subscribe request: subscribe:<subscription:<path:<element:"" element:"interfaces" element:"interface[name=Ethernet1]" element:"config" > > >
On the switch (192.168.12.1) I issue commands to transmit packets from Et1 or to shut down the interface:
vEOS1#show proc | inc OpenConfig
10879 0.1 2.1 ? Sl 12:14:05 00:00:04 /usr/bin/OpenConfig -grpcaddr=192.168.12.1:6030
15559 0.0 0.0 ? Ss 12:56:36 00:00:00 grep -aEe OpenConfig
vEOS1#show log
...
Sep 28 12:49:36 vEOS1 ConfigAgent: %SYS-5-CONFIG_E: Enter configuration mode from console by admin on vty3 (192.168.12.249)
Sep 28 12:49:39 vEOS1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet1, changed state to down
Sep 28 12:49:51 vEOS1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet1, changed state to up
Sep 28 12:49:52 vEOS1 ConfigAgent: %SYS-5-CONFIG_I: Configured from console by admin on vty3 (192.168.12.249)
However, the switch sends no telemetry messages from port 6030 to the kafka producer, and hence the producer sends no kafka messages to the brokers.
With my revision, the switch streams telemetry messages from port 6030 to the producer, which in turn sends kafka messages to the brokers.
I0928 14:13:02.856462 6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582845,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/out-octets":"6360047"}}
I0928 14:13:02.856498 6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 52 53 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 111 117 116 45 111 99 116 101 116 115 34 58 34 54 51 54 48 48 52 55 34 125 125] [] {2018-09-28 14:13:02.845216743 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.856587 6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582845,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/out-unicast-pkts":"29168"}}
I0928 14:13:02.856597 6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 52 53 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 111 117 116 45 117 110 105 99 97 115 116 45 112 107 116 115 34 58 34 50 57 49 54 56 34 125 125] [] {2018-09-28 14:13:02.845235652 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.867572 6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582856,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/in-unicast-pkts":"29169"}}
I0928 14:13:02.867589 6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 53 54 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 105 110 45 117 110 105 99 97 115 116 45 112 107 116 115 34 58 34 50 57 49 54 57 34 125 125] [] {2018-09-28 14:13:02.856504306 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
I0928 14:13:02.867656 6584 encoder.go:74] kafka: {"path":"/","timestamp":1538111582856,"updates":{"/interfaces/interface[name=Ethernet1]/state/counters/in-octets":"6360848"}}
I0928 14:13:02.867670 6584 producer.go:119] Message produced to Kafka: &{ockafka 192.168.12.1 [123 34 112 97 116 104 34 58 34 47 34 44 34 116 105 109 101 115 116 97 109 112 34 58 49 53 51 56 49 49 49 53 56 50 56 53 54 44 34 117 112 100 97 116 101 115 34 58 123 34 47 105 110 116 101 114 102 97 99 101 115 47 105 110 116 101 114 102 97 99 101 91 110 97 109 101 61 69 116 104 101 114 110 101 116 49 93 47 115 116 97 116 101 47 99 111 117 110 116 101 114 115 47 105 110 45 111 99 116 101 116 115 34 58 34 54 51 54 48 56 52 56 34 125 125] [] {2018-09-28 14:13:02.856522943 +0900 JST 1} 0 0 0001-01-01 00:00:00 +0000 UTC 0 0 <nil>}
WRT your comment:
We haven't dropped the older protobuf support in OpenConfig or TerminAttr yet, hence we haven't updated all the tools yet.
But you guys not only deprecated occli in favor of gnmi a year ago, but also updated other commands such as ocprometheus, ocredis, and ocsplunk just recently. Why not update ockafka as well?
With the current ockafka code in your master branch, the Arista switch (vEOS 4.20.1F) establishes a connection with the kafka producer, but doesn't send any messages back to it at all.
I think the issue is not between ockafka and Kafka, but between ockafka and OpenConfig. It looks like you're hitting a bug that affects the server for the old protobuf; the workaround should be to pop up one level and subscribe to all keys (i.e., not specify a key). Can you try that?
But you guys not only deprecated occli in favor of gnmi a year ago, but also updated other commands such as ocprometheus, ocredis, and ocsplunk just recently. Why not update ockafka as well?
We're planning to, we just haven't gotten to it yet. Thank you for your contribution; I left you two comments on the PR.
Yes, I meant to say the issue was between ockafka and OpenConfig, and your observation is correct. I tried the following and it worked:
ockafka -addrs 192.168.12.1 -kafkaaddrs 192.168.11.13:32768,192.168.11.13:32769 -username admin -password xyz -subscribe /interfaces
We're planning to, we just haven't gotten to it yet. Thank you for your contribution; I left you two comments on the PR.
As for your two comments... I saw you had already committed a77bf0c, fb622b9, and ff33da2; with those changes, the tools have all dropped the ability to connect to multiple servers. So in my view, having a working gNMI collector for kafka that is free from the aforementioned protobuf bug is one valid step, and regaining the multiple-server connection capability for all telemetry destination adapters by changing the gnmi client code should be the next step.
I'd appreciate your opinion.
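To illustrate what I mean by that next step, here is a minimal sketch (in Go, not the actual goarista code) of one collector process fanning out over several gNMI targets, one goroutine per address. subscribeAndPublish and the -addrs flag here are hypothetical placeholders for the real per-target subscribe loop and flag handling.

package main

import (
	"context"
	"flag"
	"log"
	"strings"
	"sync"
)

// subscribeAndPublish is a hypothetical placeholder: a real collector would
// dial the target over gRPC, send a gNMI SubscribeRequest, and forward every
// notification to the shared producer until ctx is cancelled.
func subscribeAndPublish(ctx context.Context, target string) error {
	log.Printf("would subscribe to %s and forward its notifications here", target)
	return ctx.Err()
}

func main() {
	// Hypothetical flag mirroring the old -addrs behavior: a comma-separated target list.
	addrs := flag.String("addrs", "192.168.12.1:6030", "comma-separated gNMI targets")
	flag.Parse()

	ctx := context.Background()
	var wg sync.WaitGroup
	for _, addr := range strings.Split(*addrs, ",") {
		wg.Add(1)
		go func(target string) {
			defer wg.Done()
			// One subscription per target, all feeding the same destination adapter.
			if err := subscribeAndPublish(ctx, target); err != nil {
				log.Printf("%s: %v", target, err)
			}
		}(addr)
	}
	wg.Wait()
}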
Hi @tamihiro, thank you for doing the migration. Our recommendation is usually to run these tools directly on the switch, so we're probably OK losing the -addrs option. We don't have the infrastructure to accept external PRs yet, so I'll port your change internally once it's passing and point to this. Thanks again!
@7AC thanks so much for the review! It doesn't matter at all that my PR is not directly accepted; I appreciate you working on it internally.
I run octsdb on a linux server to get streaming data from the switch running TerminAttr so that I can pass it back directly to InfluxDB via their opentsdb input plugin. Since https://github.com/aristanetworks/goarista/commit/fb622b9b46608fdb39d36447f4d8ef52fe37fc3d#diff-0a86d997d2c66282031326a80d166c4e I can't run a single octsdb instance against multiple switches.
Example: -addrs switch1:6042,switch2:6042. The -addr flag doesn't seem to support multiple comma-separated values, which means I need to run a separate octsdb instance for each switch.
Hi @Gelob, have you considered running the tool on the device itself? That's usually what we recommend. If you really want multiple addresses, feel free to open a separate issue; I will use this one to track migrating ockafka to gNMI.
@Gelob please open another issue, so we can track demand for that, and then we can consider bringing that feature back.
Hi @7AC, I figured that you guys decided to work on my PR internally a while back. May I ask how it's progressing? Please let me know if you'd like me to do anything more on my side.
@tamihiro sorry for the delay. The fastest route is definitely for you to address the comments on the review and get CI to pass; then we can merge internally. The second fastest route is to reach out to your SE and ask for alternatives. We will make the switch to gNMI, but I can't guarantee it will be prioritized until we remove support for the old protobuf.
@7AC as per your suggestion I have updated producer_test.go and made some more fixes to get Travis CI to pass. I don't get why you changed the title of this issue, but anyway, please let me know if there's still anything else I need to do before it gets merged.
Thank you for the update; I left you another comment. I renamed the issue because "doesn't work as is" is too broad IMHO; it just doesn't work in the case where you want to retrieve a specific key as opposed to the entire collection.
Is the other comment you mean this?
I think we can keep the interface generic the way it was intended, so we don't couple ourselves to the types of messages we're handling.
I'm afraid I don't understand... could you please elaborate?
I now understand the change in the title of the issue, thanks. As a network operator, though, I find the current code not really usable IMHO, due to the bug you mentioned earlier that prevents specifying the keys whose values the user actually needs.
Is the other comment you mean [this]
Sorry, I didn't realize I hadn't submitted the comment; it's done now. This is the one I was referring to:
https://github.com/aristanetworks/goarista/pull/22#discussion_r234832013
I think we can keep the interface generic the way it was intended, so we don't couple ourselves to the types of messages we're handling.
I'm afraid I don't understand... could you please elaborate?
The interface purposely has proto.Message as opposed to any specific implementation of it (e.g., gnmi) so it doesn't depend on the transport. I'd like to keep it that way.
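For example, something along these lines keeps the encoder behind proto.Message, so the same code works whether the notifications come from the old OpenConfig protobuf or from gNMI. This is just an illustrative sketch; MessageEncoder and jsonEncoder are not the actual goarista types.

package main

import (
	"fmt"

	"github.com/golang/protobuf/proto"
	pb "github.com/openconfig/gnmi/proto/gnmi"
)

// MessageEncoder is transport-agnostic: callers only ever see proto.Message.
type MessageEncoder interface {
	Encode(proto.Message) ([]byte, error)
}

// jsonEncoder handles concrete message types internally, without leaking
// them into the interface.
type jsonEncoder struct{}

func (jsonEncoder) Encode(m proto.Message) ([]byte, error) {
	switch n := m.(type) {
	case *pb.Notification:
		// Real code would walk n.Update and build the full JSON payload.
		return []byte(fmt.Sprintf(`{"timestamp":%d}`, n.Timestamp)), nil
	default:
		return nil, fmt.Errorf("unsupported message type %T", m)
	}
}

func main() {
	var e MessageEncoder = jsonEncoder{}
	out, err := e.Encode(&pb.Notification{Timestamp: 1538111582845000000})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}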
I now understand the change in the title of the issue, thanks. As a network operator, though, I find the current code not really usable IMHO, due to the bug you mentioned earlier that prevents specifying the keys whose values the user actually needs.
I agree there's an issue, just not as broad as "the tool doesn't work". There's a specific feature that's broken and there is a workaround (https://github.com/aristanetworks/goarista/issues/23#issuecomment-425542300); I'm afraid other users could read "doesn't work as is" and think it doesn't start or doesn't get any data.
@7AC I'm a little puzzled by your last comment, and admittedly I'm not even remotely an expert on any of these protocols... please check my response, and if you could tell me what I'm missing, I'd appreciate it.
I tried it out, but the kafka producer didn't publish a message. Without looking deeply into the cause of the glitch, I thought I'd better use the newer gnmi code like the other storage tools (ocredis, ocsplunk, etc.). So I did, and it looks like it's working. Please review my PR.