Closed ameihm0912 closed 4 years ago
Just to make sure I understand: is the state data that stays around for the lifetime of the pipeline just sitting in memory taking up space, or are there other potential problems with it not expiring?
The state is cleared when a window closes, but we are using global windows here, which do not close until the pipeline exits, so the state just keeps growing and uses resources that are never reclaimed. Clearing it when we can is the main goal of this change. As for problems beyond resource usage, I think there might be some, but I am not sure. The state keys influence shuffle behavior leading up to the step, so I suspect a large number of unused state keys could also affect how that shuffle occurs.
AlertSuppressor and its derived classes currently use global window state, but the state values are never cleared. Add a timer hook to clear old state data so it doesn't hang around for the lifetime of the pipeline.
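To illustrate the idea of expiring per-key state, here is a minimal sketch in plain Java. It does not use the Beam API and is not the code from this change: the class `ExpiringSuppressionState`, its methods, and the TTL value are all hypothetical. The `expire` method stands in for what a timer callback would do in the real pipeline (in Beam, a timer declared with `@TimerId` and handled in an `@OnTimer` method).

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of per-key suppression state with expiry.
// Plain Java only; it imitates the behavior, not the Beam state/timer API.
class ExpiringSuppressionState {
    // Maps a suppression key to the time (in millis) at which its state expires.
    private final Map<String, Long> deadlines = new HashMap<>();
    private final long ttlMillis;

    ExpiringSuppressionState(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns true if an alert for this key should be emitted, false if it is
    // suppressed because live (unexpired) state already exists for the key.
    boolean shouldEmit(String key, long nowMillis) {
        Long deadline = deadlines.get(key);
        if (deadline != null && nowMillis < deadline) {
            return false;
        }
        // No live state: record it and remember when it should be cleared.
        deadlines.put(key, nowMillis + ttlMillis);
        return true;
    }

    // Plays the role of the timer hook: removes every entry whose deadline
    // has passed and returns how many entries were cleared.
    int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() <= nowMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Number of keys currently holding state.
    int size() {
        return deadlines.size();
    }

    public static void main(String[] args) {
        ExpiringSuppressionState state = new ExpiringSuppressionState(1000L);
        System.out.println("emit first alert: " + state.shouldEmit("host-a", 0L));
        System.out.println("emit duplicate:   " + state.shouldEmit("host-a", 500L));
        System.out.println("entries cleared:  " + state.expire(1000L));
        System.out.println("keys remaining:   " + state.size());
    }
}
```

Without the `expire` step the map only ever grows, which is the situation described above for state held in a global window.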