imperugo / StackExchange.Redis.Extensions

MIT License
612 stars 178 forks source link

SystemTextJsonSerializer behavior changed in 8.0 and results in data loss after upgrade #540

Closed vaidotas-ars closed 11 months ago

vaidotas-ars commented 1 year ago

The Serialize function of SystemTextJsonSerializer in commit 31141f9 changed the way it gets type of passed data. Which changed the way some data is serialized. Let's say we have an interface with class:

interface InterfaceA<KeyType>
{
    KeyType Key { get; }
}
class A : InterfaceA<string>
{
  public string Key { get; set; }
  public string Data  { get; set; }
}
// Usage
class ACacheService
{
    public byte[] SerializeWithKey<KeyType>(InterfaceA<KeyType> data)
    {
        SystemTextJsonSerializer serializer = new();
        return serializer.Serialize(data);
    }
    public byte[] SerializeAny<T>(T data)
    {
        SystemTextJsonSerializer serializer = new();
        return serializer.Serialize(data);
    }
    public void TestCache()
    {
        A data = new A { Key="AB", Data="DC123" };
        var withKey = SerializeWithKey(data);
        var any = SerializeAny(data);
    }
}

Before above mentioned change SerializeWithKey and SerializeAny output was the same: {"Key":"AB", "Data":"DC123"}. After the commit output is different:

This regresion results in data loss as now it serializes only data of base class or interface, but not whole object instance data.

The documentation about this change impact is missing.

LeaFrock commented 1 year ago

I agree that it's a break change.

Before,

public byte[] Serialize<T>(T? item)
    {
        if (item == null)
            return Array.Empty<byte>();

        var type = item.GetType();
        return JsonSerializer.SerializeToUtf8Bytes(item, type, Options(type));
    }

Now,

 public byte[] Serialize<T>(T? item)
    {
        return item == null
            ? Array.Empty<byte>()
            : JsonSerializer.SerializeToUtf8Bytes<T>(item, Options(typeof(T)));
    }

GetType() will always return the actual type of an object, but typeof(T) won't.

imperugo commented 1 year ago

Hi, that's interesting poi, let me take a look better to the code. The point now, if I rollback the .GetType vs typeof(T) is that I'm going to introduce another breaking change for all people who are using without problems

LeaFrock commented 1 year ago

Serialize<Derived> will serialize the whole properties of an actual object, while Serialize<BaseOrInterface> will serialize the partial properties only in the base class or interface. Therefore, whether people are affected depends on whether people input an abstract/base object for serializing.

introduce another breaking change for all people who are using without problems

The people using without problems shall have a very low risk after rolling back (they may just find that the serialization outputs more properties). Even though they might have in some specific scenarios, they still have a workaround that mark the unexpected properties as JsonIgnore.

And IMO, not only should you roll back the codes, but also

imperugo commented 1 year ago

Hi @LeaFrock I'm not convinced deprecate / unlist the current version. The v8.x has been downloaded by almost 700 people and no-one opened an issue about that before of you. I'm not saying that I don't want to fix that, I'm just saying that probably unlist the package is too much.

About the fix could u open a pr so I can merge and deploy it very soon?

vaidotas-ars commented 1 year ago

Hi @imperugo,

My take was download the 8.x check it fails my tests rollback to previous and wait for a fix, after raising this issue. I agree with @LeaFrock, that if someone does not have proper testing in place upgrade would result in loss of data what is critical bug and packages with this does not deserve to be listed.

LeaFrock commented 1 year ago

I've made a min repro.

    public async Task<bool> SchoolEquals()
    {
        var student = new Student()
        {
            Id = 1,
            Name = "Peter",
            School = "Yale"
        };
        var cacheKey = $"User:{student.Id}";
       // For most scenarios, we use `AddAsync` directly without pointing out the type as `User`.
       // But it's still possible since developers may do some encapsulation with class-base/interface over `AddAsync`.
        await _redisDb.AddAsync<User>(cacheKey, student, TimeSpan.FromSeconds(10));
        var student2 = await _redisDb.GetAsync<Student>(cacheKey);
        return student.School == student2!.School;
    }

    public class User
    {
        public int Id { get; set; }

        public string Name { get; set; } = string.Empty;
    }

    public class Student : User
    {
        public string School { get; set; } = string.Empty;
    }
LeaFrock commented 1 year ago

About the fix could u open a pr so I can merge and deploy it very soon?

Hi @imperugo , I fork your repo and try to make a PR.

The key point is this line in SystemTextJsonSerializer,

    /// <inheritdoc/>
    public byte[] Serialize<T>(T? item)
    {
        return item == null
            ? Array.Empty<byte>()
-           : JsonSerializer.SerializeToUtf8Bytes<T>(item, Options(typeof(T)));
+           : JsonSerializer.SerializeToUtf8Bytes(item, Options(item.GetType()));
    }

But my problem is that I cannot pass all unit tests whether I do this change or not. For example,

Add_IncrementItemt_To_Sorted_Set_Async throws an exception which says StackExchange.Redis.RedisServerException : WRONGTYPE Operation against a key holding the wrong kind of value.

ListGetFromRight_With_An_Existing_Key_Should_Return_Null_If_List_Is_Empty_Async says System.Text.Json.JsonException : The JSON value could not be converted to StackExchange.Redis.Extensions.Tests.Helpers.TestClass`1[System.String]. Path: $ | LineNumber: 0 | BytePositionInLine: 126..

I've made sure that my redis configuration is all right (I use the db 10 instead of 0, but it shouldn't matter). The fix is easy but the tests bite me 🤣 .

Now we have two ways,

imperugo commented 11 months ago

I just update the library to the latest version od System.Text.Json and not the Options(typeof(T)) is not needed anymore, because in the latest version you can specify more than one JsonSerializerContext into the Serialization Configuration.

It will be available on nuget today or next week.

The code now looke like this

    public byte[] Serialize<T>(T? item)
    {
        return item == null
            ? Array.Empty<byte>()
            : JsonSerializer.SerializeToUtf8Bytes<T>(item, defaultSerializer);
    }

Let me know if it looks good to you