Open jaredpar opened 6 months ago
- `WriteBytes` should have been public (typo in the proposal)
- `MaxChunkSize` should be a ctor-set field. It got moved to a ctor parameter only for the byte[]-accepting protected ctor
- `int?=null` to `int=0` as zero has no alternative meaning
- `BeforeSwap` => `OnLinking`
public partial class BlobBuilder
{
    protected byte[] Buffer { get; set; }
    public int Capacity { get; set; }
    protected BlobBuilder(byte[] buffer, int maxChunkSize = 0);
    protected virtual void OnLinking(BlobBuilder other);
    protected virtual void SetCapacity(int capacity);
    public void WriteBytes(ReadOnlySpan<byte> buffer);
}
public partial class DebugDirectoryBuilder
{
    public DebugDirectoryBuilder(BlobBuilder blobBuilder);
}
public partial class ManagedPEBuilder
{
    protected virtual BlobBuilder CreateBlobBuilder(int minimumSize = 0);
}
public partial class MetadataBuilder
{
+   [EditorBrowsable(EditorBrowsableState.Never)]
    public MetadataBuilder(
-       int userStringHeapStartOffset = 0,
+       int userStringHeapStartOffset,
-       int stringHeapStartOffset = 0,
+       int stringHeapStartOffset,
-       int blobHeapStartOffset = 0,
+       int blobHeapStartOffset,
-       int guidHeapStartOffset = 0);
+       int guidHeapStartOffset);
+   public MetadataBuilder(
+       int userStringHeapStartOffset = 0,
+       int stringHeapStartOffset = 0,
+       int blobHeapStartOffset = 0,
+       int guidHeapStartOffset = 0,
+       Func<int, BlobBuilder>? createBlobBuilderFunc = null);
}
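As a rough illustration of what the `OnLinking` hook (renamed from `BeforeSwap`) could be used for, here is a minimal, self-contained sketch. `ToyBuilder` and `PooledToyBuilder` are hypothetical stand-ins rather than the real `BlobBuilder`, and the sketch assumes only that the hook is invoked with the other instance before two builders are linked.

```csharp
using System;

// Hypothetical stand-in for BlobBuilder; it models only the linking hook.
public class ToyBuilder
{
    // Assumed to run before `other` is linked into this builder's chain.
    protected virtual void OnLinking(ToyBuilder other) { }

    public void LinkSuffix(ToyBuilder suffix)
    {
        if (suffix is null) throw new ArgumentNullException(nameof(suffix));
        OnLinking(suffix);
        // A real builder would splice chunks here; omitted in this sketch.
    }
}

// Hypothetical pooled builder that refuses to be chained with other types.
public sealed class PooledToyBuilder : ToyBuilder
{
    protected override void OnLinking(ToyBuilder other)
    {
        if (other is not PooledToyBuilder)
        {
            throw new InvalidOperationException(
                $"Cannot link {nameof(PooledToyBuilder)} with {other.GetType().Name}.");
        }
    }
}
```

Linking two `PooledToyBuilder` instances succeeds, while linking a `PooledToyBuilder` with a plain `ToyBuilder` throws, so a mixed chain is caught at the point of linking instead of being created silently.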
@jaredpar were you planning on porting your branch to runtime or were you expecting the runtime team to do this?
Moving to v10 as we have reached the "feature complete" milestone.
Background and motivation
The `BlobBuilder` type is a mix between:
- `StringBuilder`
- `BlobBuilder` (with pooling)

In its current configuration it doesn't fully achieve either of these goals due to the following reasons:
- `BlobBuilder` has no enforced maximum internal chunk size. Instead, during write operations it has a much simpler strategy: use the rest of the current `BlobBuilder`, then allocate a single `BlobBuilder` to hold the rest. That results in lots of LOH allocations during build.
- `System.Reflection.Metadata` has no mechanism for consumers to provide derived `BlobBuilder` instances and instead allocates `BlobBuilder` types directly. This subverts attempts by consumers to pool allocations.
- `LinkSuffix / LinkPrefix` APIs can end up silently mixing the types of `BlobBuilder` instances in a chain. That makes advanced caching like pooling array allocations impossible because types with different caching strategies get silently inserted into the chain. When these insertions happen the `byte[]` underlying the instances are swapped.
- `byte[]` allocation which prevents these from being pooled. Only the `BlobBuilder` instances can be pooled, which means their underlying `byte[]` is inefficiently managed because it can't be re-used when the containing `BlobBuilder` is at rest. This is in contrast to `StringBuilder` which leverages the `ArrayPool<char>` for allocations.
- `byte[]` when a `BlobBuilder` instance from a pool is re-used. Can lead to difficult issues like 99244.

The below proposed changes are meant to address these problems such that consumers of `System.Reflection.Metadata` can do the following:
- `BlobBuilder` instances used in an emit pass.
- `byte[]` in the `BlobBuilder`.
- `BlobBuilder` instances are linked with `BlobBuilder` instances of a different type.

Using the below changes I've been able to significantly improve the allocation profile of VBCSCompiler. For building a solution the scale of compilers.slnf (~500 compilation events, large, small and medium projects) I've been able to remove ~200MB of LOH for `byte[]` and reduce GC pause time by 1.5%.

API Proposal
API Usage
Can see a full implementation of a PooledBlobBuilder. That branch contains the other changes necessary to use this new API.
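For readers without that branch at hand, the general idea of pooling the underlying `byte[]` can be sketched independently of `System.Reflection.Metadata`. This is not the `PooledBlobBuilder` implementation referenced above: `ToyPooledChunk` and its members are hypothetical, and the only real API used is `ArrayPool<byte>`. Clearing the array on return is a choice made for this sketch.

```csharp
using System;
using System.Buffers;

// Hypothetical illustration: a chunk whose byte[] is rented from
// ArrayPool<byte> and handed back once the chunk is at rest.
public sealed class ToyPooledChunk : IDisposable
{
    private byte[] _buffer;
    private bool _disposed;

    public int Length { get; private set; }
    public int Capacity => _buffer.Length;

    public ToyPooledChunk(int minimumSize)
    {
        // Rent may return an array larger than requested.
        _buffer = ArrayPool<byte>.Shared.Rent(minimumSize);
    }

    public void WriteBytes(ReadOnlySpan<byte> bytes)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(ToyPooledChunk));
        if (bytes.Length > _buffer.Length - Length)
            throw new InvalidOperationException("Chunk is full.");
        bytes.CopyTo(_buffer.AsSpan(Length));
        Length += bytes.Length;
    }

    public byte[] ToArray()
    {
        if (_disposed) throw new ObjectDisposedException(nameof(ToyPooledChunk));
        return _buffer.AsSpan(0, Length).ToArray();
    }

    public void Dispose()
    {
        if (_disposed) return;
        // clearArray: true so stale bytes are not visible to the next renter.
        ArrayPool<byte>.Shared.Return(_buffer, clearArray: true);
        _buffer = Array.Empty<byte>();
        Length = 0;
        _disposed = true;
    }
}
```

The point of the sketch is that the array outlives the wrapper: once `Dispose` runs, the same `byte[]` can serve another renter instead of sitting unused inside a pooled builder object.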
Alternative Designs
One alternative design is to limit the ability to control the underlying `byte[]` allocation and have consumers focus on pooling `BlobBuilders` only. That will provide some benefit but it is inefficient. It means that a large number of `byte[]` are unused in the pooled `BlobBuilder` instances and hence other parts of the program end up allocating them instead.
Risks
There are a few risks to consider:
- `BlobBuilder`, `ManagedPEBuilder`, etc ... These changes are careful to ensure that those consumers are not impacted by these changes. The behavior of the existing code only changes when the new hooks are used in derived types.
- `BlobBuilders` are allocated. That would mean `LinkSuffix / LinkPrefix` are called with differing types thus limiting potential gains. In my local tests I hooked `BeforeSwap` such that it fails when linked with different types. Was able to successfully rebuild Roslyn with these changes so I'm confident these hooks are thorough.
- `BlobBuilder.MaxChunkSize` does significantly increase the number of allocated `BlobBuilder` during emit. That will require changes to pooling strategies if leveraged. However the new APIs give consumers the flexibility to pursue several strategies here.
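
To make the `BlobBuilder.MaxChunkSize` trade-off concrete, the sketch below compares the two write strategies in simplified form: fill the rest of the current chunk and allocate one chunk for the remainder, versus the same with a cap on each new chunk. `ChunkPlanner`, its method names, and the sizes used are hypothetical. The model ignores any minimum chunk size the real type may apply, and the 85,000-byte figure is the default .NET large object heap threshold.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of chunk allocation; not the real BlobBuilder logic.
public static class ChunkPlanner
{
    // Default .NET large object heap threshold, in bytes.
    public const int LohThreshold = 85_000;

    // Uncapped: fill the rest of the current chunk, then allocate a
    // single chunk for everything that is left.
    public static List<int> PlanUncapped(int writeSize, int freeInCurrent)
    {
        var chunks = new List<int>();
        int remaining = Math.Max(0, writeSize - freeInCurrent);
        if (remaining > 0) chunks.Add(remaining);
        return chunks;
    }

    // Capped: same as above, but no new chunk exceeds maxChunkSize.
    public static List<int> PlanCapped(int writeSize, int freeInCurrent, int maxChunkSize)
    {
        if (maxChunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxChunkSize));
        var chunks = new List<int>();
        int remaining = Math.Max(0, writeSize - freeInCurrent);
        while (remaining > 0)
        {
            int size = Math.Min(remaining, maxChunkSize);
            chunks.Add(size);
            remaining -= size;
        }
        return chunks;
    }
}
```

For a 200,000-byte write with 1,000 bytes free in the current chunk, `PlanUncapped` yields one 199,000-byte chunk, which is above `LohThreshold`. With a hypothetical cap of 65,536 bytes, `PlanCapped` yields four chunks of 65,536, 65,536, 65,536 and 2,392 bytes: none reaches the LOH, but there are four allocations to pool instead of one.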