dotnet / runtime


System.Text.Unicode.Utf8Utility.TranscodeToUtf16 return value #74114

Closed emceelovin closed 1 year ago

emceelovin commented 1 year ago

Description

I've obviously called this function through reflection by emitting a wrapper DynamicMethod, as it's private, but I feel it returns the incorrect OperationStatus if it reads outputCharsRemaining System.Char values when the input buffer is larger than the output destination, which happens with my buffering technique.

From Utf8Utility.Transcoding.cs:

public static OperationStatus TranscodeToUtf16(...)
{
    ...

    if (outputCharsRemaining == 0)
    {
        goto OutputBufferTooSmall;
    }

    ...
}

Shouldn't this be OperationStatus.Done? It read all it was told to read. It's still working beautifully for what I need. This function is blazingly fast, but I have to handle my "done-ness" differently due to this.

Reproduction Steps

Call TranscodeToUtf16 with an input buffer of UTF-8 bytes to be transcoded to UTF-16, writing into a char buffer (in my case a Span<char> obtained via the SpanAction of string.Create) whose size is actually larger than what's needed to hold all of the UTF-8 bytes.

Expected behavior

OperationStatus.Done to be returned.

Actual behavior

OperationStatus.DestinationTooSmall is returned instead.

Known Workarounds

I check whether pOutputBufferRemaining < pOutputBufferEnd to determine whether I need to move to the next buffer in my buffering strategy or whether I'm "done." My destination will never be too small, since string.Create allocates a full-length string object for me.

dotnet-issue-labeler[bot] commented 1 year ago

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-text-encoding
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: emceelovin
Assignees: -
Labels: `area-System.Text.Encoding`, `untriaged`
Milestone: -

stephentoub commented 1 year ago

> I've obviously called this function through reflection by emitting a wrapper DynamicMethod as it's private

Please don't. Private reflection into core library internals is highly unsupported and can easily break from version to version, patch to patch, etc.

> I feel it returns the incorrect OperationStatus if it reads outputCharsRemaining System.Char values when the input buffer is larger than the output destination due to my buffering technique.

It's correct for the design of this implementation detail and how it's used by its internal consumers. The intent is that the method transcode all of the supplied input; if it can't, then it can't return Done because it's not.
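
For anyone who wants this behavior without private reflection, here is a minimal sketch using the public System.Text.Unicode.Utf8.ToUtf16 method, which reports an OperationStatus in the same way. It assumes .NET Core 3.0 or later and a C# compiler that accepts top-level statements. The helper TranscodeInChunks, its chunkSize parameter, and the sample input are hypothetical names made up for illustration; they are not part of .NET or of the code discussed above.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Unicode;

// Hypothetical helper (not part of .NET): pushes UTF-8 input through a
// fixed-size char buffer, treating DestinationTooSmall as "more input left".
static string TranscodeInChunks(ReadOnlySpan<byte> utf8, int chunkSize)
{
    var result = new StringBuilder();
    Span<char> chunk = new char[chunkSize];

    while (true)
    {
        // With the default arguments, invalid sequences are replaced with
        // U+FFFD and the input is treated as the final block.
        OperationStatus status = Utf8.ToUtf16(utf8, chunk, out int bytesRead, out int charsWritten);

        // Keep whatever was produced and skip the bytes that were consumed.
        ReadOnlySpan<char> written = chunk.Slice(0, charsWritten);
        result.Append(written);
        utf8 = utf8.Slice(bytesRead);

        // Done means all of the supplied input was transcoded.
        if (status == OperationStatus.Done)
        {
            return result.ToString();
        }

        // Stop on any other status, or if the buffer is too small to make
        // progress at all (for example a 1-char buffer and a surrogate pair).
        if (status != OperationStatus.DestinationTooSmall || bytesRead == 0)
        {
            throw new InvalidOperationException($"Cannot make progress: {status}");
        }
    }
}

byte[] input = Encoding.UTF8.GetBytes("hello, world");

// A 4-char destination cannot hold all 12 input bytes, so this call does not
// report Done even though the destination was filled.
Span<char> small = new char[4];
OperationStatus first = Utf8.ToUtf16(input, small, out int read, out int wrote);
Console.WriteLine($"{first}: read {read} bytes, wrote {wrote} chars");

// Looping until Done recovers the full string.
Console.WriteLine(TranscodeInChunks(input, 4));
```

In this sketch "done-ness" comes from the returned status alone: a full destination with input still pending is reported as DestinationTooSmall, and the caller decides whether to continue with another buffer.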

emceelovin commented 1 year ago

...