Open MkazemAkhgary opened 5 years ago
... for many small immutable objects
How small? Would they be better implemented as struct
s? (In which case you wouldn't need a pool)
@Clockwork-Muse Thanks. I thought about that too but they are not fixed size. length can vary from 3 bytes to more so immutable array is used plus an integer that is pre-computed hashcode of that immutable array. how ever smaller objects will repeat a lot whereas larger objects are unlikely to be equal.
If I understand right, you want List<WeakReference<T>>
but the WeakReference<T>
type has too much overhead in memory, and since WeakReference<T>
does not hold any extra state (at least on the managed side) you would like a way to indicate that everything in an array should be treated as weakly referenced? Yes, there is no way to do that I think. It would need a fair bit of runtime work.
In your scenario, could you group them in smaller pools, so you could have several WeakReference<List<T>>
- when none of the entries in that pool have a strong reference, they and the list would be collected.
@danmosemsft Nice idea. never thought about it. seems like the best solution with current limitations. I think grouping them by fraction of the hashcode would work.
ConcurrentDictionary<int, WeakReference<ConcurrentDictionary<T, T>>>
where dictionary is mapped by 1 or 2 bytes of the hashcode of T
. I would dedicate 1 bit of the hashcode to denote if T
is interned or not. (and i must exclude this bit from GetHashCode as well)
finalizer of T
will then check if finalizing object is the interned one or not. if it is interned, it will then remove itself from the pool.
You might have to deal with fragmentation somehow - do periodic passes and coalesce them. Depending on what you're doing, you might have generations, and do the passes over the groups that you handed out longest ago.
Another idea is a bounded set that are strongly referenced, and handed out in preference to the weakly referenced ones. Those at least would be stored more compactly.
I'm making all this up though, hard to say much without detail of your usage patterns and all kinds of things.
What about using GCHandle
with GCHandleType.Weak
instead of WeakReference<T>
?
Unfortunately there is no way to achieve this. smaller WeakReference<ConcurrentDictionary<T, T>>
pools always get cleared. It actually makes sense because the collection has only a weak reference.
I also tried GCHandle.Alloc
with GCHandleType.Weak
and target collection gets cleared even without calling Free
.
I don't know if WeakTrackResurrection
would help if I somehow implement object resurrection via a custom collection finalizer.
@MkazemAkhgary I meant having a collection of GCHandle
s instead of a collection of WeakReference
s.
This is my current implementation that doesn't work. Target of GCHandle
becomes null before calling Free
internal sealed class ObjectPool<T> where T : class
{
private readonly ConcurrentDictionary<int, GCHandle> _pool;
private readonly IEqualityComparer<T> _equalityComparer;
private readonly object _sync;
public ObjectPool(IEqualityComparer<T> equlityComparer = null)
{
_pool = new ConcurrentDictionary<int, GCHandle>();
_equalityComparer = equlityComparer;
_sync = new object();
}
private int GetKey(T item)
{
// combine 4 bytes to single byte.
var hash = item.GetHashCode();
return (((hash & 0xFF) ^ ((hash >> 16) & 0xFF)) ^ (((hash >> 8) & 0xFF) ^ ((hash >> 24) & 0xFF)));
}
private bool TryGetTarget(GCHandle handle, out ConcurrentDictionary<T, T> target)
{
lock (_sync)
{
target = null;
if (handle.IsAllocated)
{
target = (ConcurrentDictionary<T, T>)handle.Target;
Debug.Assert(target != null); // this will fail !
return true;
}
return false;
}
}
/// <summary>
/// interns the given object and returns the interned reference.
/// </summary>
public T Intern(T item)
{
var handle =_pool.GetOrAdd(GetKey(item),
_ => GCHandle.Alloc(new ConcurrentDictionary<T, T>(_equalityComparer), GCHandleType.Weak));
return TryGetTarget(handle, out var target)
? target.GetOrAdd(item, x => x)
: Intern(item); // try to intern again.
}
/// <summary>
/// release the given object if its interned.
/// </summary>
public void Release(T item)
{
var key = GetKey(item);
if(_pool.TryGetValue(key, out var handle)
&& TryGetTarget(handle, out var target)
&& target.TryGetValue(item, out var x)
&& ReferenceEquals(item, x)
&& target.TryRemove(item, out _)
&& target.Count == 0)
{
lock (_sync)
{
_pool.TryRemove(key, out _);
handle.Free();
}
}
}
}
And this is the class that is being interned. finalizer of these objects are called with GCHandleType.Weak
mode but fail miserably :(
public abstract class Message : IEquatable<Message>
{
/// <summary>
/// Gets entire content of message as immutable array.
/// </summary>
public ImmutableArray<byte> RawData { get; }
/// <summary>
/// precomputed hash code for immutable array. LSB is reserved for intern flag.
/// if LSB is 1, object is most likely interned, if LSB is 0 object is definitely not interned.
/// </summary>
private int _hashCode;
private protected Message(ImmutableArray<byte> rawData)
{
RawData = rawData;
// calculate hashcode.
_hashCode = ObjectExtensions.CombineHashCodes(
GetType().GetHashCode(), // this will help increase distribution.
ImmutableArrayEqualityComparer<byte>.Default.GetHashCode(RawData));
// to increase distribution XOR MSB with LSB before reserving LSB for intern flag.
_hashCode = (_hashCode & ~1) ^ (_hashCode << 31);
}
~Message()
{
// if interned, clear flag and release this reference.
if((Interlocked.CompareExchange(ref _hashCode, _hashCode & ~1, _hashCode | 1) & 1) == 1)
{
MessageFactory.Release(this);
}
}
public Message Intern()
{
var x = MessageFactory.Intern(this);
if (x == this) _hashCode |= 1; // set intern flag
return x;
}
// ... equality comparers
public override int GetHashCode()
{
// copy MSB to LSB because LSB is reserved for intern flag.
return unchecked((_hashCode & ~1) | (int)((uint)_hashCode >> 31));
}
}
Ok. I managed to get this working with this class
private sealed class X : ConcurrentDictionary<T, T>
{
public bool resurrect = true;
public X(IEqualityComparer<T> e) : base(e)
{
}
~X()
{
if(resurrect) GC.ReRegisterForFinalize(this);
}
}
And with GCHandleType.WeakTrackResurrection
.
Before i release the handle, i will set resurrect
to false.
How ever 1 byte for grouping isn't good enough for my usecase because there will be maximum of 256 collections and there are about 10 thousand interned objects. I guess grouping by 10 bit would be good.
With object resurrection i can really have one WeakReference<Collection<T>>
collection and it does exactly what I wanted. although not by the same meaning, here items of the collection have weak reference, but the collection is forced to be alive. assuming that true
is passed to the WeakReference
constructor.
The only problem with this is the collection goes in LOH very quickly and GC wont collect it anytime soon. it wont collect the containing items either until GC starts to collect generation 2.
Splitting the pool into smaller pools will indeed help to solve this problem and get rid of items more rapidly. but it will complicate the process very much.
I also tried
GCHandle.Alloc
withGCHandleType.Weak
and target collection gets cleared even without callingFree
.
This is indeed the (poorly documented) behavior of GCHandleType.Weak
. The main problem with rolling your own weak reference approach with GCHandle
is that there's no obvious opportunity for you to free unwanted handles themselves, especially absent any way to be notified when a respective Target
gets zeroed out. The latter happens elsewhere, actually, so not a single bit changes in the 32- or 64-bit value-type GCHandle
type you're looking at. Perversely enough, it follows that unless your code wants to periodically visit all the handles you've scattered about "round-robin style" to check up on them, you won't discover a vacuous handle until... the exact moment that you actually need to once again restore its usefulness.
So again, yes, don't let the zeroing fool you, even when they are created in GCHandleType.Weak
mode, every GCHandle
should indeed have Free()
explicitly called on it--if you're able to figure out how to programmatically decide that it won't become relevant or useful again before some certain period of time, that is.
The upside is that you can re-use a zeroed entry, by setting some new Object
reference into its Target
, thus recycling an existing handle for some savings perhaps.
But back to the downside, however, there was not enough thought put into the concurrency requirements for conveniently deploying this re-use scenario. Basically there's no (squeaky-clean, universally lauded and sanctioned) way to either (a.) atomically swap the entire GCHandle
using Interlocked.CompareExchange
, or (b.) atomically swap just the Target
on an existing GCHandle
, in-situ. The race that's implied by the lack of help here is quite a wet-blanket on what should be a very simple lock-free recycling pattern, because in the end it becomes simpler to just use a multi-phase race of first zeroing-out any orphaned GCHandle
instance you encounter, and then immediately race again to install an update. Boo.
I mention the above for the benefit of the esteemed rule-followers, because in fact the following trivial IL has suited all my uses perfectly enough:
.method public static GCHandle CmpXchg(GCHandle& addr, GCHandle val, GCHandle expect) aggressiveinlining
{
ldarg addr
ldarg val
ldarg expect
call native int Interlocked::CompareExchange(native int&, native int, native int)
ret
}
What I need:
There is no way to have something like
WeakReference<List<T>>
where the list itself has strong reference but all of its content have weak references.This would be useful for having many small objects with weak references.
List<WeakReference<T>>
provides little to no benefit and may even have bad outcomes since objects are too small and weak reference may take same or more space than the target itself.In short, I want a list with strong reference on it self but weak reference for its contents.
What I'm trying to do in case you are wondering:
I want to implement flyweight pattern for many small immutable objects.
for interning objects, an object it is retrieved or added to the dictionary (using
ConcurrentDictionary.GetOrAdd
). but the problem is that these objects never get cleared.I wish i could have a collection like
WeakReference<ConcurrentDictionary<K, V>>
where the dictionary itself has strong reference but its content (keys/values) have weak references so that entries get cleared by GC when there is no strong reference to those interned objects.If there was a way to tell GC to not count references as strong references in a particular array, problems would be solved. for example it could be denoted by an attribute.
Or better handled by special type.