The AliasResolver includes a mergedHosts field which keeps track of every host mapping it has seen in its history. By design, it currently has no way of removing elements once they have been added. This is so that hosts which reappear after a long absence are still resolved correctly.
I'm currently working on decoupling the alias resolution from any class that holds the full state of the graph, so it's not trivial to just prune old mergedHosts contents when hosts are fully removed from the graph. It could be possible to add a feedback loop from the later pruning step. This would allow us to intercept vertex removal events and keep the contents of mergedHosts reasonably in line with what the graph currently holds, but it could also discard alias data that would still be useful.
It's likely that we'll have to find a tradeoff, such as dropping host mappings after a certain amount of inactivity (in time or measurement count).
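As a rough illustration of the inactivity-based tradeoff, here is a minimal sketch in plain Java. It is not the AliasResolver's actual code: `ExpiringAliasMap`, its methods, and the `maxIdle` parameter are all hypothetical names, and the `now` argument is an abstract tick that could stand for either a timestamp or a measurement count.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for mergedHosts that forgets mappings after inactivity. */
class ExpiringAliasMap {
    /** A merged alias plus the tick at which the host was last seen. */
    private static final class Entry {
        final String alias;
        long lastSeen;

        Entry(String alias, long lastSeen) {
            this.alias = alias;
            this.lastSeen = lastSeen;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long maxIdle; // assumed threshold, in the same unit as "now"

    ExpiringAliasMap(long maxIdle) {
        this.maxIdle = maxIdle;
    }

    /** Records (or overwrites) the alias for a host and marks it as seen now. */
    void put(String host, String alias, long now) {
        entries.put(host, new Entry(alias, now));
    }

    /** Looks up a host; a successful lookup counts as activity and refreshes it. */
    Optional<String> resolve(String host, long now) {
        Entry e = entries.get(host);
        if (e == null) {
            return Optional.empty();
        }
        e.lastSeen = now;
        return Optional.of(e.alias);
    }

    /** Drops every mapping idle for more than maxIdle; returns how many were removed. */
    int evictIdle(long now) {
        int before = entries.size();
        entries.values().removeIf(e -> now - e.lastSeen > maxIdle);
        return before - entries.size();
    }

    int size() {
        return entries.size();
    }
}
```

Calling `evictIdle` periodically (for example, once per batch of measurements) would bound the size of the map, at the cost of losing aliases for hosts that stay silent for longer than `maxIdle`.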
We could perhaps change how we use CheckpointedFunction with the AliasResolver to take advantage of Keyed State TTLs, so that the timeout behaviour is handled directly by Flink.