I realize that I'm actually doing the wrong thing here: I'm comparing `NodeChangeEvent` instances for equality, but the comparison needs to be on the edge, key, etc. I'll fix that.
I fixed the implementation of `NodeActor.deduplicateEvents` so that it reproduces the original behaviour accurately. This is more verbose than the original, but I think it reads more like a specification.
This should definitely have some unit testing!
There is a much smaller change that might improve performance without any logical change to the code at all. The sets `es`, `ps` and `mih` are all immutable `Set`s. Each time we `+=` an element, this creates a new `Set` instance. Even if the immutable set implementation is clever about sharing structure with the existing set, it is not going to be as efficient as adding elements to a mutable set. Since these sets definitely do not escape the scope, using mutable sets in this context is justified. I would do that, along with pulling this out into a separate method so that it can be unit tested.
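Roughly what I have in mind, as a minimal sketch (a generic element type stands in for `HalfEdge` / `Symbol`; this is illustrative, not the actual Quine code):

```scala
import scala.collection.mutable

// Immutable Set behind a var: every += allocates a new Set instance.
def countDistinctImmutable[A](xs: Seq[A]): Int = {
  var seen: Set[A] = Set.empty
  xs.foreach(x => seen += x) // desugars to seen = seen + x
  seen.size
}

// Locally-scoped mutable.Set: updated in place, never escapes the method.
def countDistinctMutable[A](xs: Seq[A]): Int = {
  val seen = mutable.Set.empty[A]
  xs.foreach(seen += _) // in-place update, no new Set per element
  seen.size
}
```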
In trying to understand the heap allocations in this code, I noticed something that I would like to drill into at some point. The parameter in question is declared as `events: Seq[NodeChangeEvent]`. I had inferred (by looking at all the calls to this method) that the argument would actually be a `List`.

Pattern matching on `events` as a way of iterating over it would perform well if it is a `List`, and less well if it is a `Vector`. In general, understanding the heap allocations requires knowing what the concrete implementation is. I'm sure there are other factors that come into play (e.g., a `Vector` co-locates better in cache, with less pointer chasing).

Is that an argument for making a parameter like `events` specifically a `List` or a `Vector`, at least in a hot code path like this? You could probably optimize it either way, but is using `Seq` a barrier to reasoning about performance?
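To make the `List` vs `Vector` point concrete, here is a minimal sketch of the iteration-by-pattern-matching style (generic over `Int`, not the actual event types):

```scala
import scala.annotation.tailrec

// Iterating a Seq by pattern matching. With a List, `head +: tail`
// deconstructs a cons cell, so each step is cheap and allocates no new
// collection. With a Vector, each `tail` builds a new Vector instance,
// so the same loop does noticeably more allocation per step.
@tailrec
def sumEvents(xs: Seq[Int], acc: Int = 0): Int = xs match {
  case head +: tail => sumEvents(tail, acc + head)
  case _            => acc
}

sumEvents(List(1, 2, 3))   // cheap per step
sumEvents(Vector(1, 2, 3)) // each step allocates a new Vector for the tail
```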
Thanks for all the comments. I really only intended this PR for the purpose of discussion. Maybe there is something in it, but I don't think it is justified at this point.
I'm going to close this PR and do the much simpler change (using mutable sets and adding unit tests).
I started over and extracted a `dedupEvents` function out of the body of `NodeActor.processEvents`, this time keeping the existing algorithm (i.e., using `Set`s instead of searching through later events). The only change that I made was to eliminate the two reverses.
```scala
import scala.annotation.tailrec

def dedupEvents(
  events: Seq[NodeChangeEvent],
  eventTime: () => EventTime,
  hasEffect: NodeChangeEvent => Boolean
): Seq[WithTime] = {
  // State accumulated while scanning from the latest event backwards.
  var es: Set[HalfEdge] = Set.empty
  var ps: Set[Symbol] = Set.empty
  var ft: Option[QuineId] = None
  var mih: Set[QuineId] = Set.empty

  // True if a later event (already seen in the backwards scan) overrides this one.
  def anOverridingEventOccursLater(event: NodeChangeEvent): Boolean = event match {
    case EdgeAdded(ha)         => if (es.contains(ha)) true else { es += ha; false }
    case EdgeRemoved(ha)       => if (es.contains(ha)) true else { es += ha; false }
    case PropertySet(k, _)     => if (ps.contains(k)) true else { ps += k; false }
    case PropertyRemoved(k, _) => if (ps.contains(k)) true else { ps += k; false }
    // Merge events are tracked but never dropped (matches the original behaviour).
    case MergedIntoOther(id)   => if (ft.isDefined) false else { ft = Some(id); false }
    case MergedHere(o)         => if (mih.contains(o)) false else { mih += o; false }
  }

  // Indexing backwards over an IndexedSeq takes the place of the first reverse.
  val indexableEvents = events.toIndexedSeq

  @tailrec
  def accumulate(r: List[WithTime] = Nil, i: Int = indexableEvents.length - 1): Seq[WithTime] =
    if (i < 0)
      r
    else {
      val e = indexableEvents(i)
      if (!anOverridingEventOccursLater(e) && hasEffect(e))
        accumulate(r = WithTime(e, eventTime()) :: r, i = i - 1)
      else
        accumulate(r = r, i = i - 1)
    }

  accumulate()
}
```
For a large number of events, converting to an `IndexedSeq` makes a difference, and for a small number of events I couldn't see any difference. That `.toIndexedSeq` is approximately the equivalent of the first reverse, so this way of doing it achieves the original goal in the TODO of eliminating one of the reverses.
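Side by side, as a small sketch of the two traversal styles:

```scala
val events = List("a", "b", "c")

// Original approach: reverse first (allocates a reversed List), then iterate.
events.reverse.foreach(println)

// This PR's approach: convert once to an IndexedSeq, then index backwards.
// The single toIndexedSeq allocation stands in for the first reverse.
val indexed = events.toIndexedSeq
(indexed.length - 1 to 0 by -1).foreach(i => println(indexed(i)))
```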
I didn't compare this implementation to the original, but I don't see any indication that there is a compelling performance improvement by eliminating one reverse. This seems like premature optimization.
A better motivation for pulling this function out of `NodeActor` (and into a companion object) is that it enables unit testing.
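As a hypothetical sketch of the kind of test that becomes possible (using a stand-in event type and a simplified dedup, not Quine's real `NodeChangeEvent` hierarchy):

```scala
import org.scalatest.funsuite.AnyFunSuite

// Stand-in event type, for illustration only.
sealed trait Ev
final case class PutProp(key: String, value: Int) extends Ev
final case class RemoveProp(key: String) extends Ev

class DedupSpec extends AnyFunSuite {
  // Simplified stand-in for the extracted function: keep only the last event per key.
  private def dedup(events: Seq[Ev]): Seq[Ev] = {
    val keyOf: Ev => String = { case PutProp(k, _) => k; case RemoveProp(k) => k }
    events.zipWithIndex.collect {
      case (e, i) if !events.drop(i + 1).exists(later => keyOf(later) == keyOf(e)) => e
    }
  }

  test("a later event for the same key overrides an earlier one") {
    assert(dedup(Seq(PutProp("a", 1), PutProp("a", 2))) == Seq(PutProp("a", 2)))
  }

  test("events for distinct keys are all kept, in order") {
    assert(dedup(Seq(PutProp("a", 1), RemoveProp("b"))) == Seq(PutProp("a", 1), RemoveProp("b")))
  }
}
```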
This change extracts a function for deduplicating events out of `NodeActor.processEvents`.

Since this code is likely to be called frequently, minimizing heap allocations could have performance benefits. This implementation attempts to minimize allocations by:

- Not building the `Set`s that track state. As a consequence, deduplication needs to do a linear search through the later events. This could be problematic for very large sequences of events, but the sequence would have to be very large for it to be significant.
- Returning a `mutable.ArrayBuffer` as a `Seq` without copying. Since this is only used internally to `NodeActor`, there is no possibility for this mutable state to escape, so there is no need to do a defensive copy.
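A minimal sketch of both ideas together (generic names, not the actual implementation; the return type is `scala.collection.Seq` because `mutable.ArrayBuffer` does not conform to `immutable.Seq`, which is the default `Seq` alias on Scala 2.13):

```scala
import scala.collection.mutable

// Hypothetical sketch (generic names) of the two ideas above.
def dedup[A](events: Seq[A], overriddenBy: (A, A) => Boolean): scala.collection.Seq[A] = {
  val indexed = events.toIndexedSeq
  val result  = mutable.ArrayBuffer.empty[A]
  for (i <- indexed.indices) {
    val e = indexed(i)
    // Linear search instead of tracking Sets: is e overridden by a later event?
    val overridden = (i + 1 until indexed.length).exists(j => overriddenBy(e, indexed(j)))
    if (!overridden) result += e
  }
  result // returned as a Seq without a defensive copy; it never escapes as mutable state
}
```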