I encountered unexpected growth in memory usage while working on an library that uses PersistentList. I eventually realized that it is because I had many distinct instances of an empty PersistentList.
I traced down the problem, starting with toPersistentList(), which I was using to convert all inputs before creating assembling it into my data model. The toPersistentList() function takes the empty persistent list and adds all elements of this to the empty list. When you add elements to a SmallPersistentVector, it checks whether the combined size is less than or equal to MAX_BUFFER_SIZE and if it is, it unconditionally creates a new SmallPersistentVector instance.
This means that no-ops on small lists (such as adding two empty lists) always results in a new instance being created.
I verified that the problem only exists for SmallPersistentVector. PersistentVector uses PersistentVectorBuilder, which already has checks for adding an empty collection.
What I am changing
I have added a check to fun addAll(elements: Collection<E>): PersistentList<E> and fun addAll(index: Int, c: Collection<E>): PersistentList<E> to return early if the collection of elements to be added is empty.
I have updated the tests to include missing no-op cases related to adding and removing empty collections in both ImmutableListTest and ImmutableSetTest, and I have added tests for toPersistentMap(), toPersistentSet(), and toPersistentList() to ensure that when the receiver is an empty map/collection, they always return the same instance of persistent map/set/list.
Problem
I encountered unexpected growth in memory usage while working on an library that uses
PersistentList
. I eventually realized that it is because I had many distinct instances of an emptyPersistentList
.I traced down the problem, starting with
toPersistentList()
, which I was using to convert all inputs before creating assembling it into my data model. ThetoPersistentList()
function takes the empty persistent list and adds all elements ofthis
to the empty list. When you add elements to aSmallPersistentVector
, it checks whether the combined size is less than or equal toMAX_BUFFER_SIZE
and if it is, it unconditionally creates a newSmallPersistentVector
instance.https://github.com/Kotlin/kotlinx.collections.immutable/blob/7fb0d74ea87d1f52e8d30d39d4df066a57f51e50/core/commonMain/src/implementations/immutableList/SmallPersistentVector.kt#L38-L49
This means that no-ops on small lists (such as adding two empty lists) always results in a new instance being created.
I verified that the problem only exists for
SmallPersistentVector
.PersistentVector
usesPersistentVectorBuilder
, which already has checks for adding an empty collection.What I am changing
I have added a check to
fun addAll(elements: Collection<E>): PersistentList<E>
andfun addAll(index: Int, c: Collection<E>): PersistentList<E>
to return early if the collection of elements to be added is empty.I have updated the tests to include missing no-op cases related to adding and removing empty collections in both
ImmutableListTest
andImmutableSetTest
, and I have added tests fortoPersistentMap()
,toPersistentSet()
, andtoPersistentList()
to ensure that when the receiver is an empty map/collection, they always return the same instance of persistent map/set/list.