Closed ryanmolden closed 3 years ago
Also add this to the 'difference' list, not sure who is right here, CLRMD or SOS (method table is 0x2B5DE20C, same dump)
CLRMD: 0x2B5DE20C = System.Linq.Enumerable+\<ReverseIterator>d75
It looks like it is the auto-created object for a yield enumerable inside the static Enumerable.ReverseIterator
Here's a standalone repro:
using Microsoft.Diagnostics.Runtime;
using Microsoft.Diagnostics.Runtime.DacInterface;
using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using System.IO;
using System.Linq;
using System.Text;
class Program
{
static void Main(string[] _)
{
using DataTarget dt = DataTarget.LoadDump(@"C:\work\vs\VsLowMemDump4.dmp");
using ClrRuntime runtime = dt.ClrVersions.Single().CreateRuntime();
using SOSDac dac = runtime.DacLibrary.SOSDacInterface;
HashSet<string> seen = new HashSet<string>();
using var output = File.CreateText("output.txt");
foreach (ClrModule module in runtime.EnumerateModules())
{
foreach ((ulong mt, int _) in module.EnumerateTypeDefToMethodTableMap())
{
string name = dac.GetMethodTableName(mt);
if (name is null)
continue;
if (!seen.Contains(name))
{
string result = FixGenerics(name);
output.WriteLine(result);
}
}
}
}
private static void FixGenerics(StringBuilder result, string name, int start, int len, ref int maxDepth, out int finish)
{
if (--maxDepth < 0)
{
finish = 0;
return;
}
int i = start;
while (i < len)
{
if (name[i] == '`')
while (i < len && name[i] != '[')
i++;
if (name[i] == ',')
{
finish = i;
return;
}
if (name[i] == '[')
{
int end = FindEnd(name, i);
if (IsArraySubstring(name, i, end))
{
result.Append(name, i, end - i + 1);
i = end + 1;
}
else
{
result.Append('<');
int curr = i;
do
{
FixGenerics(result, name, curr + 2, end - 1, ref maxDepth, out int currEnd);
if (maxDepth < 0)
{
finish = 0;
return;
}
curr = FindEnd(name, currEnd) + 1;
if (curr >= end)
break;
if (name[curr] == ',')
result.Append(", ");
}
while (curr < end);
result.Append('>');
i = curr + 1;
}
}
else
{
result.Append(name[i]);
i++;
}
}
finish = i;
}
private static int FindEnd(string name, int start)
{
int parenCount = 1;
for (int i = start + 1; i < name.Length; i++)
{
if (name[i] == '[')
parenCount++;
if (name[i] == ']' && --parenCount == 0)
{
return i;
}
}
return -1;
}
private static bool IsArraySubstring(string name, int start, int end)
{
start++;
end--;
while (start < end)
if (name[start++] != ',')
return false;
return true;
}
[return: NotNullIfNotNull("name")]
public static string? FixGenerics(string? name)
{
if (name == null || !name.Contains("[[", StringComparison.Ordinal))
return name;
int maxDepth = 64;
StringBuilder sb = new StringBuilder();
FixGenerics(sb, name, 0, name.Length, ref maxDepth, out _);
if (maxDepth < 0)
return null;
return sb.ToString();
}
}
So I have something that seems to work, it fixes things like this (old output)
System.Collections.Generic.Dictionary<System.String, System.Int64>[]
And correctly outputs this
System.Collections.Generic.Dictionary<System.String, System.Int64>+Entry[]
But I also noticed that sometimes the DAC doesn't give us generic params for generic types, it results in things like this (old output)
NuGet.Shared.EqualityUtility+<>c__DisplayClass3_0`2
This is a lambda generated within a generic method which has two two type params, so the `2 at the end is right, but the DAC doesn't give any type params.
Currently my parser outputs this
NuGet.Shared.EqualityUtility+<>c__DisplayClass3_0<??, ??>
The <??,??> is kind of arbitrary, its just an attempt and not 'losing' the genericness/arity of the type, even though we have no params to fill in.
I am not 100% on this, technically if anyone has finicky parsing logic around the names CLRMD returns (not unlikely) it seems this would break that.
Thoughts @leculver ? Should I just leave these as is (i.e.: NuGet.Shared.EqualityUtility+<>c__DisplayClass3_0`2)?
Go ahead and post the PR even if it's not ready to merge. (Mark as Draft if it's not ready.)
NuGet.Shared.EqualityUtility+<>c__DisplayClass3_0<??, ??>
I like the concept of this, I'm not sure about using ??
specifically. Can you define that ??
as a constant to easily change it?
Actually, I think using numbered arguments would be better: T1
, T2
, etc... It at least holds the intent without being something that breaks the naming scheme of CLR.
Yeah it is already defined as a constant, subbing in the T1,T2 stuff should be easy enough, let me do that.
Examples of differences
Input: Roslyn.Utilities.OrderedMultiDictionary`2
OLD: Roslyn.Utilities.OrderedMultiDictionary`2
NEW: Roslyn.Utilities.OrderedMultiDictionary<T1, T2>
Input: System.Lazy`1+Boxed[[Microsoft.CodeAnalysis.Options.IDocumentOptionsProviderFactory, Microsoft.CodeAnalysis.Workspaces]]
OLD: System.Lazy
NEW: System.Lazy
Input: Microsoft.CodeAnalysis.Shared.Collections.IntervalTree`1+Node[[Microsoft.CodeAnalysis.Formatting.SuppressWrappingData, Microsoft.CodeAnalysis.Workspaces]][]
OLD: Microsoft.CodeAnalysis.Shared.Collections.IntervalTree
NEW: Microsoft.CodeAnalysis.Shared.Collections.IntervalTree
Input: System.Collections.Generic.Dictionary2[[Microsoft.VisualStudio.Threading.WeakKeyDictionary
2+WeakReference`1[[Microsoft.VisualStudio.Threading.IJoinableTaskDependent, Microsoft.VisualStudio.Threading],[System.Int32, mscorlib],[Microsoft.VisualStudio.Threading.IJoinableTaskDependent, Microsoft.VisualStudio.Threading]], Microsoft.VisualStudio.Threading],[System.Int32, mscorlib]]
OLD: System.Collections.Generic.Dictionary<Microsoft.VisualStudio.Threading.WeakKeyDictionary<Microsoft.VisualStudio.Threading.IJoinableTaskDependent, System.Int32, Microsoft.VisualStudio.Threading.IJoinableTaskDependent>, System.Int32>
NEW:
System.Collections.Generic.Dictionary<Microsoft.VisualStudio.Threading.WeakKeyDictionary<Microsoft.VisualStudio.Threading.IJoinableTaskDependent, System.Int32>+WeakReference
Input: Microsoft.CodeAnalysis.PooledObjects.ObjectPool1[[System.Collections.Generic.Stack
1[[System.ValueTuple2[[Microsoft.CodeAnalysis.Shared.Collections.IntervalTree
1+Node[[Microsoft.CodeAnalysis.Editor.Shared.Tagging.TagSpanIntervalTree`1+TagNode[[Microsoft.VisualStudio.Text.Tagging.IErrorTag, Microsoft.VisualStudio.Text.UI]], Microsoft.CodeAnalysis.EditorFeatures]], Microsoft.CodeAnalysis.Workspaces],[System.Boolean, mscorlib]], mscorlib]], System]]
OLD: Microsoft.CodeAnalysis.PooledObjects.ObjectPool<System.Collections.Generic.Stack<System.ValueTuple<Microsoft.CodeAnalysis.Shared.Collections.IntervalTree<Microsoft.CodeAnalysis.Editor.Shared.Tagging.TagSpanIntervalTree
NEW: Microsoft.CodeAnalysis.PooledObjects.ObjectPool<System.Collections.Generic.Stack<System.ValueTuple<Microsoft.CodeAnalysis.Shared.Collections.IntervalTree<Microsoft.CodeAnalysis.Editor.Shared.Tagging.TagSpanIntervalTree
Looks great, ship it!
Just ran into this issue; I bet this fix will solve it for me, thank you @ryanmolden for working on it!
Let me describe what I'm seeing in case anyone in the future runs into the problem in the same way.
In my case, I was iterating runtime.Heap.EnumerateObjects()
, and if objects had the type name System.Collections.Concurrent.ConcurrentDictionary<MyTypeA, MyTypeB>
, I'd try to print all the "MyTypeA" key objects. To do this, I'd dive into the internals of ConcurrentDictionary like so:
// type of _tables is "Tables", a nested class
ClrObject tables = clrObject.ReadObjectField("_tables");
// type of _buckets is "Node?[]", and Node is a nested class
ClrObject buckets = tables.ReadObjectField("_buckets");
ClrArray bucketsAsArray = buckets.AsArray();
for (int i = 0; i < bucketsAsArray.Length; i++)
{
ClrObject node = bucketsAsArray.GetObjectValue(i);
if (!node.IsNull)
{
// print MyTypeA, which for me is at node.ReadValueTypeField("_key")
}
}
But I was getting an ArgumentException that _tables was not a field of ConcurrentDictionary. Huh?? So I printed out the ClrObject.Fields
, and it told me this:
_value, _next, _hashcode, _key
or occasionally this:
_buckets, _locks, _countPerLock, _fastModBucketsMultiplier
These are actually the fields of the nested classes Node and Tables, respectively. (It would also tell me the correct fields sometimes).
This was run on a program with a large heap and lots of ConcurrentDictionaries. I tried to repro this behavior in a small program and here's what I found:
Constructor names are not broken, so I can differentiate between ConcurrentDictionary and Node by checking the methods. For example, the here are the two constructor signatures:
// ConcurrentDictionary
"System.Collections.Concurrent.ConcurrentDictionary`2[[ObjectSizePlayground.ThreeIntStruct, ObjectSizePlayground],[System.__Canon, System.Private.CoreLib]]..ctor()"
// Node
"System.Collections.Concurrent.ConcurrentDictionary`2+Tables[[MyAssembly.MyTypeA, MyAssembly],[System.__Canon, System.Private.CoreLib]]..ctor(Node<MyAssembly.MyTypeA,System.__Canon>[], System.Object[], Int32[])"
Or you can just put a try/catch around ReadObjectField.
@erikness Does the problem still reproduce on the newest version of ClrMD with this issue fixed?
Oops, didn't realize you had already pushed out a fix. Just tested it, looks like it works!
Output of last WriteLine is true that all three distinct ClrType (distinct by method table) name's match, but looking at the same method tables in SOS using !dumpmt they do not match.
I will send you an email with the dump location for the dump used in the repro above.