dotnet / aspnetcore

ASP.NET Core is a cross-platform .NET framework for building modern cloud-based web applications on Windows, Mac, or Linux.
https://asp.net
MIT License
35.44k stars 10.01k forks source link

JsonSerializer byte[] to stream nested base64 serialization #49868

Open maciekDXC opened 1 year ago

maciekDXC commented 1 year ago

Description

I have a MVC app in which I am using action filters - I have observed that the byte[] result of a controller method is serialized twice (base64 inside base64) using JsonSerializer overload (introduced in .net 6.0) which serializes byte[] and passes it to a stream. Please see a short NUnit test example below (I am using non-async method in my unit tests while MVC is using async one, but this issue seem to be applicable to all overloads of JsonSerializer which handle serialization from byte[] to stream) - the unit tests are replicable on unit test projects using .net 6.0+ (including 8.0 preview).

Reproduction Steps

using Newtonsoft.Json;
using System;
using System.IO;
using System.Text.Json;
using NUnit.Framework;

namespace Tests.Tests
{
    public class Tests
    {

        [Test]
        public void Confirm_no_double_serializing_NewtonsoftJson()
        {
            var input = "dGVzdA==";
            byte[] byteObj = Convert.FromBase64String(input);

            var output = JsonConvert.SerializeObject(byteObj);

            var areEqual = output.Contains(input);
            Assert.IsTrue(areEqual);
        }

        [Test]
        public void Confirm_no_double_serializing_JsonSerializer_non_stream()
        {
            var input = "dGVzdA==";
            byte[] byteObj = Convert.FromBase64String(input);

            var output = System.Text.Json.JsonSerializer.Serialize(byteObj, typeof(byte[]));

            var areEqual = output.Contains(input);
            Assert.IsTrue(areEqual);
        }

        [Test]
        public void Confirm_no_double_serializing_JsonSerializer_stream()
        {
            var input = "dGVzdA==";
            byte[] byteObj = Convert.FromBase64String(input);

            MemoryStream stream = new MemoryStream();
            JsonSerializerOptions SerializerOptions = new JsonSerializerOptions();
            System.Text.Json.JsonSerializer.Serialize(stream, byteObj, SerializerOptions);
            var output = Convert.ToBase64String(stream.ToArray());

            var areEqual = output.Contains(input);
            Assert.IsTrue(areEqual);
        }
    }
}

Expected behavior

Input is a base64 string "dGVzdA==" converted to byte[] which is then serialized Deserialized output of above serialization should also be the same base64 string ("dGVzdA==")

Actual behavior

Input is a base64 string "dGVzdA==" converted to byte[] which is then serialized Deserialized output of above serialization is "ImRHVnpkQT09Ig==" (which is an encoded form of our input ("dGVzdA=="))

Regression?

The issue does not exist in .NET 5.0 and below as this version does not contain JsonSerializer overloads for serializing and copying to a stream.

Known Workarounds

No response

Configuration

Which version of .NET is the code running on? 8.0 SDK but is applicable to all .NET 6.0+ What OS and version, and what distro if applicable? Windows 10 What is the architecture (x64, x86, ARM, ARM64)? Windows Laptop - x64 Do you know whether it is specific to that configuration? It is specific to .NET 6.0+ frameworks

Other information

Nunit tests have been provided for an easy reproduction.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis See info in area-owners.md if you want to be subscribed.

Issue Details
### Description I have a MVC app in which I am using action filters - I have observed that the byte[] result of a controller method is serialized twice (base64 inside base64) using JsonSerializer overload (introduced in .net 6.0) which serializes byte[] and passes it to a stream. Please see a short NUnit test example below (I am using non-async method in my unit tests while MVC is using async one, but this issue seem to be applicable to all overloads of JsonSerializer which handle serialization from byte[] to stream) - the unit tests are replicable on unit test projects using .net 6.0+ (including 8.0 preview). - Confirm_no_double_serializing_NewtonsoftJson - pass - example how NewtonsoftJson serializes byte[] data - the output is the same base64 string as input - Confirm_no_double_serializing_JsonSerializer_non_stream - pass - example how JsonSerializer serializes byte[] data - the output is the same base64 string as input - Confirm_no_double_serializing_JsonSerializer_stream - fail- example how JsonSerializer serializes byte[] data when asked to copy it directly to a stream rather than returning raw serialization output (this version is used by MVC) - the output is a base64 string encoded twice - it matches input only after double decoding ### Reproduction Steps using Newtonsoft.Json; using System; using System.IO; using System.Text.Json; using NUnit.Framework; namespace Tests.Tests { public class Tests { [Test] public void Confirm_no_double_serializing_NewtonsoftJson() { var input = "dGVzdA=="; byte[] byteObj = Convert.FromBase64String(input); var output = JsonConvert.SerializeObject(byteObj); var areEqual = output.Contains(input); Assert.IsTrue(areEqual); } [Test] public void Confirm_no_double_serializing_JsonSerializer_non_stream() { var input = "dGVzdA=="; byte[] byteObj = Convert.FromBase64String(input); var output = System.Text.Json.JsonSerializer.Serialize(byteObj, typeof(byte[])); var areEqual = output.Contains(input); Assert.IsTrue(areEqual); } [Test] public void Confirm_no_double_serializing_JsonSerializer_stream() { var input = "dGVzdA=="; byte[] byteObj = Convert.FromBase64String(input); MemoryStream stream = new MemoryStream(); JsonSerializerOptions SerializerOptions = new JsonSerializerOptions(); System.Text.Json.JsonSerializer.Serialize(stream, byteObj, SerializerOptions); var output = Convert.ToBase64String(stream.ToArray()); var areEqual = output.Contains(input); Assert.IsTrue(areEqual); } } } ### Expected behavior Input is a base64 string "dGVzdA==" converted to byte[] which is then serialized Deserialized output of above serialization should also be the same base64 string ("dGVzdA==") ### Actual behavior Input is a base64 string "dGVzdA==" converted to byte[] which is then serialized Deserialized output of above serialization is "ImRHVnpkQT09Ig==" (which is an encoded form of our input ("dGVzdA==")) ### Regression? The issue does not exist in .NET 5.0 and below as this version does not contain JsonSerializer overloads for serializing and copying to a stream. ### Known Workarounds _No response_ ### Configuration Which version of .NET is the code running on? 8.0 SDK but is applicable to all .NET 6.0+ What OS and version, and what distro if applicable? Windows 10 What is the architecture (x64, x86, ARM, ARM64)? Windows Laptop - x64 Do you know whether it is specific to that configuration? It is specific to .NET 6.0+ frameworks ### Other information Nunit tests have been provided for an easy reproduction.
Author: maciekDXC
Assignees: -
Labels: `area-System.Text.Json`
Milestone: -
eiriktsarpalis commented 1 year ago

Actual behavior

Deserialized output of above serialization is "ImRHVnpkQT09Ig==" (which is an encoded form of our input ("dGVzdA=="))

I'm not sure why this is unexpected, since your reproduction is literally creating a Base64 string of the UTF-8 JSON encoding in this line:

var output = Convert.ToBase64String(stream.ToArray());

I'm guessing you wanted to do something like this instead?

var output = Encoding.UTF8.GetString(stream.ToArray());

ghost commented 1 year ago

This issue has been marked needs-author-action and may be missing some important information.

maciekDXC commented 1 year ago

@eiriktsarpalis apologies for the confusion - I think I rushed my bug report a bit. I spend more effort on this - I still think there is a defect somewhere around serialization of byte array, I am now more leaning towards this being a MVC defect rather than JsonSerializer (i.e. MVC is using JsonSerializer to move byte[] action result to response stream and this is the root cause I think), but I still need to wrap my head around it. I have created a better unit test method which better describes the actual problem statement that I observe. Below, MVCControllerMethodResult variable is a byte array - it simulates a byte[] result from my MVC controller method. Then, MVC internally is using JsonSerializer method which accepts byte[], serializes and moves it to output stream - in my method below I have broken this operation into two separate steps (serialize, then move to stream) for easier debugging. When we read from that response stream (in my real application, I try to capture controller result from a response stream in my ResultActionFilter method) - we can see that binary data in the stream differs from what we originally used as an input. As I said originally, I am still trying to wrap my head around this because for some reason browsers handle this without issues - so it may be that I am missing something here, but a fact is that my MVC controller returns byte[] with certain values, yet in result action filter, response stream byte array is different - this itself proves there is a defect somewhere here, I think...

[Test]
public void MVCResult()
{
    byte[] MVCControllerMethodResult = Convert.FromBase64String("dGVzdA==");
    var serialized = System.Text.Json.JsonSerializer.Serialize(MVCControllerMethodResult, typeof(byte[]));
    MemoryStream MVCOutputBody = new MemoryStream();
    var writer = new StreamWriter(MVCOutputBody);
    writer.Write(serialized);
    writer.Flush();
    MVCOutputBody.Position = 0;
    var output = MVCOutputBody.ToArray();
    Assert.IsTrue(MVCControllerMethodResult == output);
}
eiriktsarpalis commented 1 year ago

Can you share a minimal reproducing MVC app?

maciekDXC commented 1 year ago

@eiriktsarpalis below is a minimal reproducing MVC app. To reproduce it:

  1. In VS create new project -> "ASP .NET Core Empty" (.net 7.0)
  2. Override Program.cs with my first code block
  3. Create a controller class, paste my second code block

Description of behavior: After running the app, you can open two endpoints: http://localhost:5148/bytearray/testbrowser returns the byte[] directly to browser without using action filters. http://localhost:5148/bytearray/testactionfilter will throw an exception as it's the same method as testbrowser, difference is that in action filter I check if the response stream equals to return value of controller method, if not I throw exception.

Description of codebase: ByteTestController contains two streams

ByteArrayTestAttribute contains two methods:

My 2 cents on this: The condition if (snapshot != BYTE_RESULT_TEST.Bytes) definitely proves that, at least at that point in time of execution, the response body is not the same as our result of controllers' method execution - in my opinion, this is wrong. That said, as the browser is still somehow handling this properly, I think I may be missing something in here - i.e. this may as well turn out to be my misunderstanding of how things work.

using ByteArrTest.Controllers;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddScoped<ByteArrayTestAttribute>();
var app = builder.Build();

app.MapControllers();

app.Run();
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.Filters;

namespace ByteArrTest.Controllers
{
    [ApiController]
    [Route("[controller]/[action]")]
    public class ByteArray : ByteTestController
    {
        public ByteArray() { }

        [HttpGet]
        public byte[] testBrowser()
        {
            return BYTE_RESULT_TEST.Bytes;
        }

        [HttpGet]
        [ServiceFilter(typeof(ByteArrayTestAttribute))]
        public byte[] testActionFilter()
        {
            return BYTE_RESULT_TEST.Bytes;
        }
    }

    public static class BYTE_RESULT_TEST
    {
        public static byte[] Bytes = Convert.FromBase64String("dGVzdA==");

    }

    public class ByteTestController : ControllerBase
    {
        public MemoryStream? OverriddenStream { get; set; }
        public Stream? OriginalStream { get; set; }
    }

    public class ByteArrayTestAttribute : ActionFilterAttribute, IAsyncActionFilter
    {
        public ByteArrayTestAttribute()
        {
        }

        public override async Task OnResultExecutionAsync(ResultExecutingContext context, ResultExecutionDelegate next)
        {
            OnResultExecuting(context);
            ResultExecutedContext resultExecutedContext = await next();
            var byteTestController = context.Controller as ByteTestController;
            var snapshot = byteTestController.OverriddenStream.ToArray();
            if (snapshot != BYTE_RESULT_TEST.Bytes)
            {
                throw new Exception("Byte input and output are different!");
            }

            byteTestController.HttpContext.Response.Body = byteTestController.OriginalStream;
            if (byteTestController.HttpContext.Response.Body != null)
            {
                await byteTestController.HttpContext.Response.Body.WriteAsync(snapshot, 0, snapshot.Length);
            }

            OnResultExecuted(resultExecutedContext);
        }

        public override async Task OnActionExecutionAsync(ActionExecutingContext context, ActionExecutionDelegate next)
        {
            var byteTestController = context.Controller as ByteTestController;
            byteTestController.OverriddenStream = new MemoryStream();
            byteTestController.OriginalStream = context.HttpContext.Response.Body;
            context.HttpContext.Response.Body = byteTestController.OverriddenStream;
            await next();
        }
    }
}
eiriktsarpalis commented 1 year ago

Transfering to Aspnet for additional triaging.

feiyun0112 commented 1 year ago
if (System.Text.Encoding.UTF8.GetString(snapshot) != System.Text.Json.JsonSerializer.Serialize(BYTE_RESULT_TEST.Bytes))
            {
                throw new Exception("Byte input and output are different!");
            }