Open teo-tsirpanis opened 1 year ago
Tagging subscribers to this area: @dotnet/area-system-reflection-metadata See info in area-owners.md if you want to be subscribed.
Author: | teo-tsirpanis |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Reflection.Metadata` |
Milestone: | - |
I propose additional span overloads for methods that work with memory buffers
@teo-tsirpanis can you provider an example of the caller using buffers (not backed by a simple array) where these APIs are advantageous? Thanks
One use case would be https://github.com/dotnet/runtime/pull/84580#discussion_r1169306517.
@teo-tsirpanis the API proposals for BlobBuilder, BlobContentId, BlobWriter and MetadataBuilder looks good to me, for others more info needed, I left some questions as a comment. In general, a usage scenarios for them would be helpful for review. Also, I would prefer to split the proposals for BlobReader
and MetadataReader
into another issue if that makes sense to you
public struct BlobReader
{
public byte[] ReadBytes(int byteCount)
public void ReadBytes(int byteCount, byte[] buffer, int bufferOffset)
+ public void ReadBytes(Span<byte> buffer);
public string ReadUTF8(int byteCount)
+ public void ReadUTF8(int byteCount, Span<char> buffer); // Why this passes byteCount but ReadBytes(Span<byte> buffer) not?
+ public bool TryReadUTF8(int byteCount, Span<char> buffer, out int charsWritten); // A uage scenarios will be helpful
public string ReadUTF16(int byteCount)
+ public void ReadUTF16(int byteCount, Span<char> buffer);
+ public bool TryReadUTF16(int byteCount, Span<char> buffer, out int charsWritten); // Usage scenarios will be helpful
}
public struct MetadataReader
{
+ public int GetBlobLength(BlobHandle handle); // Why this needed? I guess for determining span size, anyway a user scenario will be useful
public byte[] GetBlobBytes(BlobHandle handle);
+ public void GetBlobBytes(BlobHandle handle, Span<byte> buffer, out int bytesWritten); // why do you need bytesWritten?
public string GetString(StringHandle handle);
+ public int GetStringLength(StringHandle handle); // usage scenarios
+ public void GetStringBytes(StringHandle handle, Span<char> buffer, out int bytesWritten);
+ public int GetStringLength(NamespaceDefinitionHandle handle);
+ public void GetStringBytes(NamespaceDefinitionHandle handle, Span<char> buffer, out int bytesWritten);
+ public int GetStringLength(DocumentNameBlobHandle handle);
+ public void GetStringBytes(DocumentNameBlobHandle handle, Span<char> buffer, out int bytesWritten);
+ public int GetUserStringLength(UserStringHandle handle);
+ public void GetUserStringBytes(UserStringHandle handle, Span<char> buffer, out int bytesWritten);
// Or what about we add the following APIs instead of the above?
// The memory the span points to can be freed under our feet but
// this is an existing problem with many SRM APIs that safely wrap pointers.
// The library is expert-level either way.
+ public ReadOnlySpan<byte> GetBlobSpan(BlobHandle handle);
+ public ReadOnlySpan<byte> GetRawStringBytes(StringHandle handle);
+ public ReadOnlySpan<byte> GetRawStringBytes(NamespaceDefinitionHandle handle);
+ public ReadOnlySpan<byte> GetRawStringBytes(DocumentNameBlobHandle handle);
+ public ReadOnlySpan<byte> GetRawUserStringBytes(UserStringHandle handle);
}
public sealed class MetadataBuilder
{
public BlobHandle GetOrAddBlob(byte[] value)
// These APIs will not allocate if the blob/string already exists.
+ public BlobHandle GetOrAddBlob(ReadOnlySpan<byte> value);
public BlobHandle GetOrAddBlobUTF8(string value, bool allowUnpairedSurrogates = true);
+ public BlobHandle GetOrAddBlobUTF8(ReadOnlySpan<byte> value); // should it have allowUnpairedSurrogates parameter too?
public System.Reflection.Metadata.BlobHandle GetOrAddBlobUTF16(string value);
+ public BlobHandle GetOrAddBlobUTF16(ReadOnlySpan<byte> value);
public BlobHandle GetOrAddDocumentName(string value);
+ public BlobHandle GetOrAddDocumentName(ReadOnlySpan<char> value);
public StringHandle GetOrAddString(string value);
+ public StringHandle GetOrAddString(ReadOnlySpan<char> value);
public UserStringHandle GetOrAddUserString(string value);
+ public UserStringHandle GetOrAddUserString(ReadOnlySpan<char> value);
}
Adding future milestone for now, I don't think all of these APIs ready for 8.0, we could change milestone if BlobReader
and MetadataReader
proposals moved to an another issue/proposal
Missing the simplest one :)
public sealed class MetadataReader
{
public unsafe byte* MetadataPointer { get; }
public int MetadataLength { get; }
+ public ReadOnlySpan<byte> MetadataSpan { get; }
}
Not sure if this is that useful given the whole type is unsafe, but perhaps:
public unsafe struct BlobReader
{
+ public BlobReader(ReadOnlySpan<byte> buffer);
}
ILCompiler and its components could benefit from more (RO)Span<T>
overloads in the S.R.M. In many places ILC has to go through various hoops to cache values that it reads from metadata because lack of modern Span APIs.
@buyaa-n
public int GetBlobLength(BlobHandle handle); // Why this needed? I guess for determining span size, anyway a user scenario will be useful
My thought was that it would allow the users to get the buffer's size and prepare an appropriately sized buffer to pass to GetBlob
.
public void ReadUTF8(int byteCount, Span<char> buffer); // Why this passes byteCount but ReadBytes(Span<byte> buffer) not?
public bool TryReadUTF8(int byteCount, Span<char> buffer, out int charsWritten); // A uage scenarios will be helpful
On second thought byteCount
is not needed; it can be inferred from the span. I updated the proposal to fix this and other mistakes. I also am not sure if the BlobReader.TryRead***
methods are needed.
public void GetStringChars(NamespaceDefinitionHandle handle, Span<char> buffer, out int bytesWritten);
The out parameter should be named charsWritten
instead. For this, and all other char-based spans with out length parameters.
Updated, thanks.
IMO I prefer the single TryGetString
method pattern I proposed in #103169 over two methods (GetStringLength
and GetStringChars
). Besides being one method instead of two, it means I can reuse a buffer for many get-string calls without fear of it throwing an exception just because the length is too short. With the two method pattern, I must make an extra GetStringLength
call before trying to obtain the string just to avoid the exception.
Background and motivation
System.Reflection.Metadata
is a pretty perormance-oriented library but its API lacks methods that accept spans. I propose additional span overloads for methods that work with memory buffers. Some of these APIs can now be implemented efficiently thanks to the work started in #81059.API Proposal
API Usage
The proposed APIs correspond to existing APIs that work with strings and (immutable) byte arrays. Their usage will be similar.
Alternative Designs
Do nothing and either let users implement the span APIs on top of pointers if they are available, or accept the memory allocations if they aren't.
There are also more possible APIs to be spanified (such as methods in the
System.Reflection.Metadata.Ecma335.***Encoder
s), but the proposed APIs are the most foundational; the rest can be implemented by user code on top of these with little effort.Risks
No response