[API Proposal]: WrappedValueAttribute describing a type that wraps another value

jarle-moe commented 5 months ago

Background and motivation

Often it's useful to design datatypes that acts as a wrapper for an enclosed value

Strongly typed identifiers for instance is not directly supported in .NET (asterisk) and not because the languages are not capable, but because there is no way for tools to see that one type primarily acts as a wrapper for a different one

In the case of implicit coercion, there's no way to tell the IDE that it should suggest the enclosed value rather than the declaring type

This prevents programmers from writing simple types representing a concrete wrapped value, which causes problems, duplicate code or boilerplate for the following :

Mapping between values and objects (property type UserId to Guid for instance)
IntelliSense will suggest the declaring type rather than the wrapped type even if there's an implicit cast operator
Makes it difficult to use strongly typed identifiers where it's valuable (EntityFramework)
Serialization requires extra code often a single line that just defers serialization to the enclosed type

IDE's seem to respect the enclosing type of Lazy<T> and Nullable<T> however this does not seem to be supported by a generally available mechanism

Considering the obviousness of this suggestion, I'll assume that it has already been discussed somewhere, however I could not find any issues or public conversations regarding it. Under fear of being called oblivious, I'm giving it a shot

API Proposal

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = false, Inherited = true)]
public class WrappedValueAttribute : Attribute
{
    public WrappedValueAttribute(string propertyName)
    {            
        PropertyName = propertyName;            
    }        
    public string PropertyName { get; }
}

The attribute describes a single property that informs supporting tools which property it should use to describe the wrapped type. From this they can infer suggestions for design-time, as well as applicable implementations for serialization, mapping and navigation without requiring programmers to write code for every type

For instance, if the JSON serializer can take a look at this attribute, it can see that what it should do is serialize and deserialize the property it points to, and not the enclosing type, without requiring explicit implementation from the programmer

EntityFramework could also utilize this to do automatic mapping between a concrete supported type and a custom type

API Usage

Example 1: strong identifiers

[WrappedValue(nameof(Id))]
public readonly record struct UserId(Guid Id) : IEquatable<UserId>
{
}

In this instance, it would allow tools like EntityFramework and Mapster/AutoMapper to recognize this as a strongly typed identifier, and automatically figure that it should map it to and from Guid by inspecting the target property and constructor

Similarly applies to serializers

Example 2: Class acting as a wrapper, implicit coercion is the API designers intent rather than a constructor

[WrappedValue(nameof(Value))]
public class SomeData<T>
{
    private SomeData(T value) { Value = value; }
    public T Value { get; }
    public Lazy<T> StoredValue { get; } = new Lazy(() => ..);
    public bool IsChanged() => StoredValue.Value.Equals(Value);

    public static explicit operator T(SomeData<T> v) => v.Value;
    public static implicit operator SomeData<T>(T v) => new SomeData(v);
}

In example 2; let's say I have this class :

public class MyClass
{
    public SomeData<SomeWrappedType> MyProp { get; set; }
}

When I write new MyClass { MyProp = IntelliSense should suggest SomeWrappedType and not SomeData<SomeWrappedType> since SomeData<T> is described as a wrapper, and provides an implicit cast operator for the enclosed type

Alternative Designs

An alternative approach is IWrappedValue<T> however this may forcefully guide the implementation of other's code, as well as prevent type variance

public interface IWrappedValue<T>
{
    T Value { get; set; }
}

Or

public interface IWrappedValue<T, TSelf> where TSelf : IWrappedValue<T, TSelf>
{
    public static abstract explicit operator T(TSelf self);
    public static abstract implicit operator TSelf(T v);
}

Risks

Third-party support may vary, and consumers may expect it to work regardless of use-case
Introduces additional work for analyzers which may affect design-time performance, compilation time and complexity of supporting tools
May introduce a performance penalty in certain scenarios (LINQ) since there is a level of indirection
May not be applicable for all use-cases, for instance will not support transform

svick commented 5 months ago

If I understand it correctly, this attribute wouldn't be useful by itself, but only if other parts of the ecosystem understood it.

You alluded to some examples, but could you make it clearer what the tools and libraries that you expect to understand this attribute are and how would they handle it? Also, are those tools and libraries on board with making these changes?

jarle-moe commented 5 months ago

You alluded to some examples, but could you make it clearer what the tools and libraries that you expect to understand this attribute are and how would they handle it?

Certainly

So take JsonSerializer for example. In the cases I outlined by default it will output this :

{
    "userId" : { "Id" : "1234" }
}

While for something like a strongly typed identifier, what you actually want is this :

{
    "userId" : "1234"
}

In order to do this now, a custom JsonConverter is needed which basically is just this one line in override for Write : JsonSerializer.Serialize(writer, value.Id, options);

Another case is Mapster and AutoMapper. Currently, you need to write two lines per type; MapWith<MyType, long>(n => n.Id) and another one in the reverse direction. Otherwise these values will be (silently) ignored

EntityFramework is another example; types need to be the database-supported native types - but there are plenty of discussions around on strongly typed identifiers, so I think it would do well there as well.

For the applicable tools; they'd find the attribute, read the property name, get metadata for the property and do reflection, emit, code-generation, analysis or whatever to figure out how to treat it. For a serializer, serialize only the target property. For a mapper, copy only the value for the target property. For an ORM, use the target property as the value.

Same goes for IDE's and LSP's; as mentioned I've noticed that Visual Studio and Rider both pay special attention to the type parameter for Lazy<T> and Nullable<T>, including understanding the implicit coercion, but trying myself to get them to do the same for my defined types nothing happens leading me to believe there's no general approach for this

Hope this makes it a bit clearer

Also, are those tools and libraries on board with making these changes?

No, this is just my personal projects. I kept running into these situations several times, found no generalized solution for it and thought maybe I'd give a shot at making a proposal. But I should've, they obviously know quite a bit more about this than me in hindsight. Sorry, first time giving this a go

jarle-moe commented 5 months ago

I can try to reach out to some tomorrow and see if they think it's a good idea and find out to which degree it actually solves the problem I thought it would

jarle-moe commented 5 months ago

In retrospect I realize my approach was a bit naive here.. Gathering some more input around. I've reached out to a few third-party library implementers as well to see if they think it's a good idea, or a horrifically terrible idea never to be brought up again lest Cthulhu... well you know how it goes

@StephenMolloy @HongGit @jeffhandley @dotnet/area-system-text-json ?

GeirGrusom commented 5 months ago

You alluded to some examples, but could you make it clearer what the tools and libraries that you expect to understand this attribute are and how would they handle it? Also, are those tools and libraries on board with making these changes?

Consider this class. It's supposed to be stored in EntityFramework using a REST API and additionally with GraphQL.

class User(int Id, int GroupId, int AddressId);

The obvious bad thing about this is that you can easily mistake GroupId and AddressId, and the error might even not cause a constraint failure in the database. A solution to this is to declare each of these types to enforce strong typing:

class User(UserId id, GroupId group, AddressId id);

This solves the problem because you can no longer assign the wrong value because it would be a compile time error.

Basically each of these types are a single value (for example an int) contained inside a readonly struct.

However when you go to store this in the database it will fail at runtime. EntityFramework does not know how to handle these types. The intention however is that these types are constraints for a different type. When it comes to serialization, or storage, they should be handled exactly like the type inside it, but EntityFramework doesn't know, so you have set up conversion functions or types for them. This is nothing more than tedious boilerplate code. Forgetting to do so is a runtime error, so depending on your test regime you might not pick up the error immediately.

One way to fix this in EF:

modelBuilder.Entity<User>(x => 
{
 x.Property(x => x.Id).HasConversion(x => (int)x, x => (UserId)x)
 .Property(x => x.Group).HasConversion(x => (int)x, x => (GroupId)x)
 .Property(x => x.Address).HasConversion(x => (int)x, x => (AddressId)x);
}

Then you get a solution that you're happy with after writing a bunch of boilerplate EntityFramework code, but when it comes to your REST API you're stuck again. System.Text.Json doesn't know how to serialize this type. But the solution is the same as in EF: more boilerplate converters, and runtime errors until it's fixed for every type.

Then you notice that your GraphQL is broken in the exact same way. Even more of almost the exact same boilerplate again.

The solution for all these components are essentially the same: write a converter that simply calls the cast operators.

I think this solution could work, though it requires a collaboration. Ideally I think that this would best be solved by the language rather than the runtime. What I basically want is a compile time difference between the same underlying type and the domain they're used for, but making an obvious way for serializers to recognize the underlying type would work.

GeirGrusom commented 5 months ago

I think I prefer the interface solution with operators over the attribute solution since the internal field type can remain an implementation detail.

dotnet-policy-service[bot] commented 5 months ago

Tagging subscribers to this area: @dotnet/area-system-runtime See info in area-owners.md if you want to be subscribed.

jarle-moe commented 5 months ago

After a bit of tinkering, I'm thinking that the IWrapper<T, TSelf> implementation is overall a better and simpler solution The caveats are that it can't reasonably support ref struct type arguments and it forces invariance of the type parameter, and I supposed it can't be async, but I don't think those concerns outweigh the benefits especially for the intended use-case

lostmsu commented 4 months ago

I don't like the interface approach, because it does not make a clear distinction between read-only (can only be set in constructor), read-write, and init-only properties.

For attribute approach the parameter can be optional if there's only one public property.

My primary use case is JSON serialization, but I agree that it would be useful in EF as well.

GeirGrusom commented 4 months ago

I have a similar project to yours, https://github.com/GeirGrusom/IdGenerator, which is why I'm also interested in this :)

I don't like the interface approach, because it does not make a clear distinction between read-only (can only be set in constructor), read-write, and init-only properties.

I'm not sure why it should matter here? For your library you would just have to implement explicit cast operators. Why is whether it uses constructors, init properties or whatever else relevant?

dotnet / runtime