Open yuriykulikov opened 4 years ago
Hello,
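retainAll calls Collection.contains() on its argument for every element of the receiver. contains() is O(1) for hash sets, O(log n) for tree sets and O(n) for lists, so retaining against a list costs O(n * m) overall.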
So, to be honest:
I was surprised that (0..147853).toList().intersect((0..147853).toList()) takes only milliseconds
I was not surprised that (0..147853).toList().intersect((0..147853).toPersistentList()) takes minutes.
But the implementation of MutableCollection.retainAll(elements: Iterable) tries to be smart: in some cases, 'elements' is converted to a set and retainAll is applied using this set. It explains why the test case with two lists is so fast.
This behavior is handled by the following code from Iterables.kt:
/** Returns true when it's safe to convert this collection to a set without changing contains method behavior. */
private fun <T> Collection<T>.safeToConvertToSet() = size > 2 && this is ArrayList

/** Converts this collection to a set, when it's worth so and it doesn't change contains method behavior. */
internal fun <T> Iterable<T>.convertToSetForSetOperationWith(source: Iterable<T>): Collection<T> =
    when (this) {
        is Set -> this
        is Collection ->
            when {
                source is Collection && source.size < 2 -> this
                else -> if (this.safeToConvertToSet()) toHashSet() else this
            }
        else -> toHashSet()
    }
When 'this' is a persistent list, it is a Collection but not an ArrayList, so safeToConvertToSet() returns false and the list is passed to retainAll as-is, without being converted to a hash set.
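To make the cost visible without pulling in kotlinx.collections.immutable, here is a minimal sketch that uses a hypothetical CountingList as a stand-in for a persistent list: it is a List, it is not an ArrayList, and it counts how often contains() is invoked. The class and the intersectAndCount helper are my own names for illustration, not part of any library, and the sketch assumes Kotlin on the JVM.

```kotlin
// Hypothetical stand-in for a persistent list: a read-only List that is
// not an ArrayList and that counts how often contains() is called.
class CountingList<T>(private val backing: List<T>) : AbstractList<T>() {
    var containsCalls = 0
        private set

    override val size: Int get() = backing.size
    override fun get(index: Int): T = backing[index]

    override fun contains(element: T): Boolean {
        containsCalls++
        return backing.contains(element) // linear scan over the backing list
    }
}

// Returns (size of the intersection, number of contains() calls on the argument).
fun intersectAndCount(n: Int): Pair<Int, Int> {
    val source = (0 until n).toList()
    val other = CountingList((0 until n).toList())
    val result = source.intersect(other)
    return result.size to other.containsCalls
}

fun main() {
    val (size, calls) = intersectAndCount(2_000)
    println("intersection size = $size, contains() calls = $calls")
}
```

Every contains() call here is a linear scan, so the total work grows roughly with the product of the two sizes. That would be consistent with the minutes observed for 147854 elements.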
This is only an analysis, I don't have any solution for now.
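Until this is addressed in the stdlib, a caller-side stopgap might be to convert the argument to a hash set explicitly before calling intersect. The sketch below assumes the elements have ordinary equals()/hashCode() semantics. intersectViaSet is a hypothetical helper name, and java.util.LinkedList only stands in for "a List that is not an ArrayList".

```kotlin
import java.util.LinkedList

// Hypothetical helper: forces the argument into a HashSet so that each
// contains() lookup made by retainAll is O(1) on average.
fun <T> Iterable<T>.intersectViaSet(other: Iterable<T>): Set<T> =
    this.intersect(other.toHashSet())

fun main() {
    val source = (0..20_000).toList()
    // LinkedList stands in for any List that is not an ArrayList.
    val other: List<Int> = LinkedList((0..20_000 step 2).toList())
    val result = source.intersectViaSet(other)
    println(result.size)
}
```

This is not safe for every collection: if the argument defines contains() differently from equals()/hashCode(), converting it to a hash set changes the result. The doc comment on safeToConvertToSet() suggests this is why the stdlib only converts an ArrayList.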
Iterable.intersect(other: Iterable) takes a very long time to complete when called with a PersistentList as a parameter. Same function works faster with other iterables like List and Set. It is minutes with PersistentList and milliseconds with List.
I couldn't find the exact reason for that, but it seems that Collection.retainAll does something with the persistent list which takes ages to complete.
Here are some examples: