bdarwin closed 6 years ago
I don't understand how the closed cache and the network issue are related. Could you please provide more details (logs, steps to reproduce)? Ideally, a reproducer.
It so happened that during a network outage I saw this message in one of the node logs, and the cache was closed. Maybe it's not related to the network at all. But what does this exception mean? Why is the node stopping if it can't get a topology update? I will get a sample soon.
I think the node was stopped due to network segmentation. You can turn on DEBUG-level logging for the org.apache.ignite.spi.discovery.tcp package and grep for the SEGMENTED pattern.
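If grep is not handy on the box, the same check can be sketched in a few lines of plain Java. This is only an illustration: the class name SegmentationLogScan and the method findSegmented are made up for this example, and it assumes the relevant log lines contain the literal text SEGMENTED, as suggested above. It does not use any Ignite API.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ignite: a plain-Java stand-in for `grep SEGMENTED`.
public class SegmentationLogScan {

    // Returns every log line that contains the literal text "SEGMENTED".
    static List<String> findSegmented(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains("SEGMENTED")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // The log file path is supplied by the caller; no default location is assumed.
            System.err.println("usage: java SegmentationLogScan <path-to-node-log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String hit : findSegmented(lines)) {
            System.out.println(hit);
        }
    }
}
```

Run it against the node log after enabling DEBUG for the discovery package. If nothing is printed, segmentation is probably not the cause and the full stack trace would be the next thing to look at.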
When there is a network issue and a node can't reach another node for quite some time, I see the error below and the cache is closed.
2017-02-22 04:14:58.922 vert.x-worker-thread-1 INFO ignite.IgniteClusterManager - javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to wait for topology update, cache (or node) is stopping.
Once the cache is closed, the node is effectively dead: it can't get the vertx sub maps and can't communicate anymore, so I have to restart the node.
Is there something obvious I am missing here, or is this how it works?
I found the open tickets below. They are not exactly what I am looking for, but they sound similar.
https://issues.apache.org/jira/plugins/servlet/mobile#issue/IGNITE-2766
https://issues.apache.org/jira/plugins/servlet/mobile#issue/IGNITE-3616
This can be easily reproduced by blocking any event bus thread for, say, a few minutes.