In ObjectCache, ClearSerializationCache() uses Clear() on the serialization cache Dictionary. This is an O(n) operation, where n is the capacity of the Dictionary. ClearSerializationCache() is called near the end of Serialize().
Using a CerasSerializer instance to serialize an object with many references may cause the capacity of the Dictionary to increase significantly. If we then use the same CerasSerializer instance to serialize objects with very few references, the serialization processing time can be significantly slower than expected due to the larger capacity of the Dictionary affecting Clear().
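To isolate the effect described above from the rest of the serializer, here is a minimal micro-benchmark sketch. It assumes the cache behaves like a plain `Dictionary<object, int>` (the actual key and value types in ObjectCache may differ), and the names `ClearCostDemo` and `TimeAddAndClear` are made up for illustration. One entry is added before each Clear() so that the clear is never skipped for an empty dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ClearCostDemo
{
    // Adds a single entry and clears the dictionary, `iterations` times.
    // Returns the elapsed time in milliseconds.
    public static long TimeAddAndClear(Dictionary<object, int> cache, int iterations)
    {
        var key = new object();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            cache.Add(key, 0);
            cache.Clear(); // cost depends on the capacity, not only on Count
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 10000;

        // A dictionary that never grew beyond its default capacity.
        var small = new Dictionary<object, int>();

        // A dictionary that was grown once (simulating an object with many
        // references) and then cleared; Clear() keeps the enlarged capacity.
        var grown = new Dictionary<object, int>();
        for (int i = 0; i < 100000; i++)
            grown.Add(new object(), i);
        grown.Clear();

        Console.WriteLine("small: " + TimeAddAndClear(small, iterations) + " ms");
        Console.WriteLine("grown: " + TimeAddAndClear(grown, iterations) + " ms");
    }
}
```

Absolute numbers will vary by machine and runtime, so none are quoted here; the point is only the relative difference between the two dictionaries.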
Simply replacing Clear() with creating a new Dictionary instance has a negative impact on the serialization processing time for repeatedly serializing objects with very few references. However, using Clear() when the Count of the Dictionary is less than or equal to the initial capacity of the Dictionary and creating a new Dictionary instance when the Count is greater seems to work well. This has the effect of keeping a consistent Dictionary capacity between calls to Serialize().
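The approach described above could look roughly like the following. This is a sketch, not the actual ObjectCache code: the class name `SerializationCacheSketch`, the `Register` method, the `Dictionary<object, int>` type and the initial capacity of 64 are all assumptions made for the example.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the serialization side of ObjectCache.
public sealed class SerializationCacheSketch
{
    // Assumed value; the initial capacity Ceras actually uses may differ.
    private const int InitialCapacity = 64;

    private Dictionary<object, int> _cache =
        new Dictionary<object, int>(InitialCapacity);

    public int Count
    {
        get { return _cache.Count; }
    }

    // Records an object the first time it is seen and assigns it the next id.
    public void Register(object value)
    {
        if (!_cache.ContainsKey(value))
            _cache.Add(value, _cache.Count);
    }

    public void ClearSerializationCache()
    {
        if (_cache.Count <= InitialCapacity)
        {
            // The dictionary did not need to grow past the requested
            // capacity, so clearing in place is cheap.
            _cache.Clear();
        }
        else
        {
            // The dictionary grew; drop it so the enlarged capacity does
            // not slow down Clear() on every later call.
            _cache = new Dictionary<object, int>(InitialCapacity);
        }
    }
}
```

Comparing against Count rather than the real capacity is deliberate in this sketch: `Dictionary` does not expose its capacity on all target frameworks, and Count exceeding the initial capacity is a cheap signal that a resize has probably happened.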
Edit: I've run a few tests. Here, "Current" refers to the current method of using Clear() and "Proposed" refers to the method described above. The source code has been attached.
Test 1: 1,000,000 calls to Serialize() for an object with few references.
Current: 4763ms
Proposed: 4793ms
Test 2: 1,000 calls to Serialize() for an object with many references.
Current: 5812ms
Proposed: 6785ms
Test 3: 1,000,000 calls to Serialize() for an object with few references after a call to Serialize() for an object with many references.
Current: 13952ms
Proposed: 4778ms
The proposed method does fix the issue (Test 3 runs roughly 2.9 times faster), but it slows down repeated serialization of objects with many references (Test 2 is about 17% slower), because the enlarged Dictionary is discarded and has to grow again on every call. A configuration option to preserve the object cache capacity between calls to Serialize() by always using Clear() would provide a way of optimizing for this use case.
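One possible shape for such an option is sketched here. The names `CacheOptionsSketch`, `PreserveCacheCapacity` and `CacheReset.Reset` are hypothetical and are not existing Ceras settings or APIs; the sketch only shows how a flag could choose between the two behaviours.

```csharp
using System.Collections.Generic;

// Hypothetical options type; not part of the Ceras configuration today.
public sealed class CacheOptionsSketch
{
    // When true, always Clear() in place and keep whatever capacity the
    // dictionary has grown to (best for repeatedly serializing large graphs).
    public bool PreserveCacheCapacity { get; set; }
}

public static class CacheReset
{
    // Returns the dictionary to use for the next Serialize() call:
    // either the same instance, cleared, or a fresh small one.
    public static Dictionary<TKey, TValue> Reset<TKey, TValue>(
        Dictionary<TKey, TValue> cache,
        int initialCapacity,
        CacheOptionsSketch options)
    {
        if (options.PreserveCacheCapacity || cache.Count <= initialCapacity)
        {
            cache.Clear();
            return cache;
        }

        // The cache grew and the caller did not ask to keep the capacity.
        return new Dictionary<TKey, TValue>(initialCapacity);
    }
}
```

With the flag off this behaves like the proposed method; with it on it behaves like the current one, so users who mostly serialize large object graphs would keep the Test 2 timings.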
Form1.vb.txt