doanac opened this issue 5 years ago (status: Open)
With the gateway turned off, things quiet down a little and the log looks like this:
I|2019-05-08 13:03:06,265|o.a.k.c.c.i.ConsumerCoordinator|Setting newly assigned partitions [DeviceUpdateEvent-ce-0] for group device-registry-DeviceUpdateEvent-ce
E|2019-05-08 13:03:06,343|akka.stream.Materializer|[DeviceUpdateEvent.listener] Upstream failed.
com.advancedtelematic.libats.http.Errors$RawError: device doesn't exist
E|2019-05-08 13:03:06,348|c.a.l.m.d.MessageBusListenerActor|Source/Listener died, subscribing again
com.advancedtelematic.libats.http.Errors$RawError: device doesn't exist
I|2019-05-08 13:03:11,365|c.a.l.m.d.MessageBusListenerActor|Subscribing to DeviceUpdateEvent
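To confirm that the listener really is cycling through the same failure, one could tally the records in a longer log capture. The following is a minimal Python sketch, assuming the pipe-delimited `level|timestamp|logger|message` layout seen above. The function name `summarize_listener_failures` and the set of level letters are illustrative assumptions, not part of device-registry.

```python
from collections import Counter

# Assumed single-letter log levels; only I and E appear in the excerpt above.
LEVELS = {"D", "I", "W", "E"}


def summarize_listener_failures(log_text):
    """Count records per (level, logger) and count bare exception lines.

    Lines that do not match `level|timestamp|logger|message` (such as the
    exception lines that follow an error record) are tallied separately.
    """
    records = Counter()
    exceptions = Counter()
    for line in log_text.splitlines():
        parts = line.split("|", 3)
        if len(parts) == 4 and parts[0] in LEVELS:
            level, _timestamp, logger, _message = parts
            records[(level, logger)] += 1
        elif line.strip():
            exceptions[line.strip()] += 1
    return records, exceptions


SAMPLE = """\
I|2019-05-08 13:03:06,265|o.a.k.c.c.i.ConsumerCoordinator|Setting newly assigned partitions [DeviceUpdateEvent-ce-0] for group device-registry-DeviceUpdateEvent-ce
E|2019-05-08 13:03:06,343|akka.stream.Materializer|[DeviceUpdateEvent.listener] Upstream failed.
com.advancedtelematic.libats.http.Errors$RawError: device doesn't exist
E|2019-05-08 13:03:06,348|c.a.l.m.d.MessageBusListenerActor|Source/Listener died, subscribing again
com.advancedtelematic.libats.http.Errors$RawError: device doesn't exist
I|2019-05-08 13:03:11,365|c.a.l.m.d.MessageBusListenerActor|Subscribing to DeviceUpdateEvent
"""

records, exceptions = summarize_listener_failures(SAMPLE)
for key, count in records.most_common():
    print(key, count)
for text, count in exceptions.most_common():
    print(text, count)
```

Run against a longer capture, a steadily growing count for the `MessageBusListenerActor` error record alongside a single repeated exception text would support the "same message over and over" reading.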
Turning off all services (director, treehub, tuf-keyserver, tuf-reposerver and the *-daemon) does not fix things, so it seems like something is stuck in Kafka?
Out of desperation, I did the following:
kafka-topics --delete --topic DeviceUpdateEvent-ce --zookeeper zookeeper
The device-registry seems to be happy again.
Found on version: device-registry:0.3.0-12-ga1ea92b
We have at least one device in our OTA Connect deployment whose status seems to be flapping between UpToDate and Outdated.
When looking at the logs, I see the same messages repeating over and over. I have a feeling something here is causing the device status to look incorrect, but I can't tell for sure. It almost seems like there's a message in the queue that we can't process and keep hitting over and over again.
Any ideas how we can get unstuck?
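The suspected failure mode, an event that fails before its offset is committed and is therefore redelivered on every resubscribe, can be illustrated with a small simulation. This is a sketch under stated assumptions, not how libats or device-registry is actually implemented: `run_listener`, `handle`, `KNOWN_DEVICES`, and the `max_attempts` limit are all hypothetical names introduced for illustration.

```python
def run_listener(messages, handle, resubscribes, max_attempts=None):
    """Simulate a listener that commits an offset only after `handle` succeeds.

    Each failure ends the current subscription; the next one resumes at the
    last committed offset. If `max_attempts` is set, a message that has failed
    that many times is skipped (its offset is committed without processing).
    Returns (committed_offset, skipped_messages).
    """
    committed = 0
    attempts = {}
    skipped = []
    for _ in range(resubscribes):
        while committed < len(messages):
            msg = messages[committed]
            try:
                handle(msg)
            except LookupError:
                attempts[committed] = attempts.get(committed, 0) + 1
                if max_attempts is not None and attempts[committed] >= max_attempts:
                    skipped.append(msg)
                    committed += 1  # give up on this message and move on
                    continue
                break  # listener dies; the next subscription retries this offset
            committed += 1
        if committed == len(messages):
            break
    return committed, skipped


# Hypothetical registry contents and event stream.
KNOWN_DEVICES = {"device-a", "device-b"}


def handle(device_id):
    if device_id not in KNOWN_DEVICES:
        raise LookupError("device doesn't exist")


events = ["device-a", "device-gone", "device-b"]
stuck = run_listener(events, handle, resubscribes=5)
unstuck = run_listener(events, handle, resubscribes=5, max_attempts=3)
print(stuck)    # (1, [])
print(unstuck)  # (3, ['device-gone'])
```

Without a retry limit, the committed offset never moves past the failing event, which is consistent with the repeating log pattern. With a limit, the event is set aside and later events get processed. Whether device-registry offers such a limit is not something this thread establishes.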