dotnet / csharplang

The official repo for the design of the C# programming language

[Issue]: Generic Type Specialization Not Working in C# #8115

Closed AlexeyZBest closed 2 weeks ago

AlexeyZBest commented 2 weeks ago
public class Graph<T_type>
    where T_type : Vertex
{
    public T_type my_vertex;
    public Graph(T_type vertex)
    {
        vertex.New_Copy(out my_vertex);
    }
}

public class Vertex
{
    public void New_Copy(out Vertex vertex)
    {
        vertex = new Vertex();
    }
}
public class Vertex_Pro : Vertex
{
    public void New_Copy(out Vertex_Pro vertex_pro)
    {
        vertex_pro =  new Vertex_Pro();
    }
}

Why not make it so that during the compilation of type specification Graph, the correct function address would be inserted in place of vertex.New_Copy(out my_vertex)?

CyrusNajmabadi commented 2 weeks ago

Closing out as by design.

Why not make it so that during the compilation of type specification Graph, the correct function address would be inserted in place of vertex.New_Copy(out my_vertex)?

Because the body of the method is analyzed statically at compile time. So your call to New_Copy can only statically match against Vertex.New_Copy.

If you want virtual dispatch here, you can either make the method virtual or use static interface dispatch, like so:

public class Graph<TVertex>
    where TVertex : IVertex<TVertex>
{
    public TVertex my_vertex;
    public Graph(TVertex vertex)
    {
        var copy = TVertex.Copy(vertex);
    }
}

public interface IVertex<TVertex> where TVertex : IVertex<TVertex>
{
    public static abstract TVertex Copy(TVertex vertex);
}

public class Vertex : IVertex<Vertex>
{
    public static Vertex Copy(Vertex vertex)
    {
        return new Vertex();
    }
}

public class Vertex_Pro : Vertex, IVertex<Vertex_Pro>
{
    public static Vertex_Pro Copy(Vertex_Pro vertex)
    {
        return new Vertex_Pro();
    }
}
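
For comparison, the first option mentioned above (a virtual method) could look roughly like the sketch below. It is not code from this thread: it assumes New_Copy is reshaped to return the copy instead of using an out parameter, and the cast in the constructor is needed because the virtual method is only declared to return Vertex.

```csharp
using System;

var graph = new Graph<Vertex_Pro>(new Vertex_Pro());
Console.WriteLine(graph.my_vertex.GetType().Name);

public class Vertex
{
    // Assumed signature: returns the copy rather than using an out parameter.
    public virtual Vertex New_Copy() => new Vertex();
}

public class Vertex_Pro : Vertex
{
    // An override is related to Vertex.New_Copy, so virtual dispatch finds it.
    public override Vertex New_Copy() => new Vertex_Pro();
}

public class Graph<TVertex>
    where TVertex : Vertex
{
    public TVertex my_vertex;

    public Graph(TVertex vertex)
    {
        // Statically this binds to Vertex.New_Copy; the override is picked at run time.
        // The cast is checked at run time and throws if an override returns the wrong type.
        my_vertex = (TVertex)vertex.New_Copy();
    }
}
```

The static abstract version avoids that cast, because Copy is declared to return TVertex.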
AlexeyZBest commented 2 weeks ago

You don't understand. During compilation, instantiation of the class Graph occurs. Once the class is instantiated, the compiler knows the type of vertex and the type of my_vertex in the line vertex.New_Copy(out my_vertex);. After that, the compiler can determine which function needs to be inserted. This can be implemented. It's just that you haven't implemented it yet.

AlexeyZBest commented 2 weeks ago

Closing out as by design.

Why not make it so that during the compilation of type specification Graph, the correct function address would be inserted in place of vertex.New_Copy(out my_vertex)?

Because the body of the method is analyzed statically at compile time. So your call to New_Copy can only statically match against Vertex.New_Copy.

If you want virtual dispatch here, you can either make the method virtual or use static interface dispatch, like so:

public class Graph<TVertex>
    where TVertex : IVertex<TVertex>
{
    public TVertex my_vertex;
    public Graph(TVertex vertex)
    {
        var copy = TVertex.Copy(vertex);
    }
}

public interface IVertex<TVertex> where TVertex : IVertex<TVertex>
{
    public static abstract TVertex Copy(TVertex vertex);
}

public class Vertex : IVertex<Vertex>
{
    public static Vertex Copy(Vertex vertex)
    {
        return new Vertex();
    }
}

public class Vertex_Pro : Vertex, IVertex<Vertex_Pro>
{
    public static Vertex_Pro Copy(Vertex_Pro vertex)
    {
        return new Vertex_Pro();
    }
}

I understand correctly. Are you here to develop the C# language? Then please implement this feature. It is possible to insert a method reference at the compilation stage in the context I provided.

HaloFour commented 2 weeks ago

The New_Copy method on Vertex_Pro is completely unrelated to New_Copy on Vertex. The C# compiler and runtime only know of the Vertex constraint, so they can only assume that New_Copy(out Vertex) exists. Neither has any idea (nor cares) that a class derived from Vertex might also have a method called New_Copy. C# generics are not like C++ templates; they are not expanded in source.
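
A small sketch can make this binding visible. It reuses the Vertex and Vertex_Pro shapes from the original post, but the string return values and the Probe helper are hypothetical additions used only to report which method ran.

```csharp
using System;

Console.WriteLine(Probe.Call(new Vertex_Pro()));

public class Vertex
{
    public string New_Copy(out Vertex vertex)
    {
        vertex = new Vertex();
        return "Vertex.New_Copy";
    }
}

public class Vertex_Pro : Vertex
{
    // A separate overload: same name, different parameter type, no relation to the base method.
    public string New_Copy(out Vertex_Pro vertex_pro)
    {
        vertex_pro = new Vertex_Pro();
        return "Vertex_Pro.New_Copy";
    }
}

public static class Probe
{
    // Hypothetical helper: inside the generic body only members of the
    // constraint (Vertex) are visible, so the call binds to Vertex.New_Copy.
    public static string Call<T>(T vertex) where T : Vertex
    {
        string which = vertex.New_Copy(out Vertex copy);
        return $"{which} produced {copy.GetType().Name}";
    }
}
```

Even though T is Vertex_Pro at the call, the generic body reports Vertex.New_Copy and produces a plain Vertex, because the overload on Vertex_Pro is never considered.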

AlexeyZBest commented 2 weeks ago

The New_Copy method on Vertex_Pro is completely unrelated to New_Copy on Vertex. The C# compiler and runtime only know of the Vertex constraint, so they can only assume that New_Copy(out Vertex) exists. Neither has any idea (nor cares) that a class derived from Vertex might also have a method called New_Copy. C# generics are not like C++ templates; they are not expanded in source.

That's exactly the problem. Are you involved in the development of C# here? Make it so that the compiler can determine the address of the static method at compile time. In this context, it is possible to determine which static method needs to be inserted; it's just that C# cannot do this yet.

HaloFour commented 2 weeks ago

It's very intentional that C# generics don't work anything like C++ templates. C# is not going to be changed so that generics will work like macros. Invoking a member of a generic constraint will only support virtual dispatch through that member.

AlexeyZBest commented 2 weeks ago

Macros are not needed here. Here's how it should work. Let's say the compiler sees the line Graph graf; it begins to instantiate the class Graph and encounters the line vertex.New_Copy(out my_vertex);. At the time of instantiation, the compiler knows the type of my_vertex and the type of vertex. It understands that it needs to insert the static function public void New_Copy(out Vertex_Pro vertex_pro) { vertex_pro = new Vertex_Pro(); }. And macros are not needed at all here.

AlexeyZBest commented 2 weeks ago

It's very intentional that C# generics don't work anything like C++ templates. C# is not going to be changed so that generics will work like macros. Invoking a member of a generic constraint will only support virtual dispatch through that member.

I'm sorry. I am communicating through a translator, and when I called the function static, I meant a function whose address is determined at compile time.

HaloFour commented 2 weeks ago

The generic instantiation doesn't happen at compile-time, it happens at runtime. Having the compiler do it at compile-time and emit additional source is what makes C++ templates effectively macros, and C# generics work very differently.

AlexeyZBest commented 2 weeks ago

The generic instantiation doesn't happen at compile-time, it happens at runtime. Having the compiler do it at compile-time and emit additional source is what makes C++ templates effectively macros, and C# generics work very differently.

Well, okay. During the instantiation of a specific class at runtime, it is possible to determine which function address should be inserted. After all, at the time of instantiation, all types are known and C# can determine exactly which function to insert. I'm right, aren't I?

HaloFour commented 2 weeks ago

No, the only method that the runtime is aware of is Vertex.New_Copy, which it will call via virtual dispatch. If you were to override that method from Vertex_Pro then yes, that method will be invoked. The fact that Vertex_Pro has a different and unrelated method overload also called New_Copy doesn't come into consideration. The name isn't important.

AlexeyZBest commented 2 weeks ago

No, the only method that the runtime is aware of is Vertex.New_Copy, which it will call via virtual dispatch. If you were to override that method from Vertex_Pro then yes, that method will be invoked. The fact that Vertex_Pro has a different and unrelated method overload also called New_Copy doesn't come into consideration. The name isn't important.

Is C# being developed here, changes being made to C#, it being refined and evolved? Are there people here who speak Russian?

CyrusNajmabadi commented 2 weeks ago

You don't understand. During compilation, instantiation of the class Graph occurs. Once the class is instantiated, the compiler knows the type of vertex and the type of my_vertex in the line vertex.New_Copy(out my_vertex);. After that, the compiler can determine which function needs to be inserted.

No. It can't. Generics are not templates. The body code for them is not determined at the callsite, but at the declaration site. As such, the determination of this is made from the original code, statically.

If you want a different call to be made, you need to involve virtuals, and i showed how to do this. Note: this is simply virtual at the type system level. At the runtime level, it will be a direct call.

CyrusNajmabadi commented 2 weeks ago

That's exactly the problem. Are you involved in the development of C# here? Write so that the compiler could determine the address for the static method at the compilation stage.

We have. I showed the feature that exactly solves this. That's why i closed this out. There is a solution, and it is one the runtime and language both developed together to exactly solve this problem space.

CyrusNajmabadi commented 2 weeks ago

Let's say the compiler sees the line Graph graf; it begins to instantiate the class Graph and encounters the line vertex.New_Copy(out my_vertex); at the time of instantiation,

The compiler has no idea what the impl of public Graph(T_type vertex) is whatsoever. in .net we distribute metadata for code as opaque signatures in 'ref assemblies'. The compiler literally only sees public Graph(T_type vertex). Furthermore, the compiler does not specialize a generic at the callsite. Again, these are generics, they are not templates or macros. There is no 'code' for the compiler to examine at the callsite that it could even inject there.

That's why we solved this with the approach i listed already. The original code explicitly calls out that it would like the particular method stubbed in, and the runtime does that at instantiation time.

CyrusNajmabadi commented 2 weeks ago

Well, okay. During the instantiation of a specific class at runtime, it is possible to determine which function address should be inserted.

This is exactly what the code i showed you does. This feature exists. You just write it in the fashion i presented :)

Is C# being developed here, changes being made to C#, it being refined and evolved?

Yes. And i'm one of the lang designers. We continue to refine and evolve the language. However, we don't start with "let's make this code legal", we start with "what is the scenario we want to make possible". "Specialized dispatch to particular methods based on types" is the scenario your use case falls under. And we have a solution for that. We like that solution. it works, and fits very well into the models of the language, compiler and runtime. As it subsumes your use case, and does so well (along with many others), it is the solution we will stick with.

We do not develop new features to support use cases that already have solutions just because someone isn't willing to use them :)

AlexeyZBest commented 2 weeks ago

You don't understand. During compilation, instantiation of the class Graph occurs. Once the class is instantiated, the compiler knows the type of vertex and the type of my_vertex in the line vertex.New_Copy(out my_vertex);. After that, the compiler can determine which function needs to be inserted.

No. It can't. Generics are not templates. The body code for them is not determined at the callsite, but at the declaration site. As such, the determination of this is made from the original code, statically.

If you want a different call to be made, you need to involve virtuals, and i showed how to do this. Note: this is simply virtual at the type system level. At the runtime level, it will be a direct call.

Right at the call site, it should be determined which function is to be called. It just needs to be implemented.

AlexeyZBest commented 2 weeks ago

We have. I showed the feature that exactly solves this. That's why i closed this out. There is a solution, and it is one the runtime and language both developed together to exactly solve this problem space.

The code that you provided shows an error. There is a version mismatch.

CyrusNajmabadi commented 2 weeks ago

it builds fine for me:

Build started at 7:26 PM...
1>------ Build started: Project: ConsoleApp1, Configuration: Debug Any CPU ------
1>D:\source\ConsoleApp1\ConsoleApp1\Program.cs(5,12,5,17): warning CS8618: Non-nullable field 'my_vertex' must contain a non-null value when exiting constructor. Consider adding the 'required' modifier or declaring the field as nullable.
1>ConsoleApp1 -> D:\source\ConsoleApp1\ConsoleApp1\bin\Debug\net8.0\ConsoleApp1.dll

What error are you getting?

CyrusNajmabadi commented 2 weeks ago

Right at the call site, it should be determined which function is to be called

It can't be. All the call-site sees is public Graph(T_type vertex). There's no information about the body at all.

AlexeyZBest commented 2 weeks ago

The compiler has no idea what the impl of public Graph(T_type vertex) is whatsoever. in .net we distribute metadata for code as opaque signatures in 'ref assemblies'. The compiler literally only sees public Graph(T_type vertex). Furthermore, the compiler does not specialize a generic at the callsite. Again, these are generics, they are not templates or macros. There is no 'code' for the compiler to examine at the callsite that it could even inject there.

That's why we solved this with the approach i listed already. The original code explicitly calls out that it would like the particular method stubbed in, and the runtime does that at instantiation time.

You've implemented everything in a complex way. At the time of instantiation, it's possible to determine the function that should be called in the specific class.

AlexeyZBest commented 2 weeks ago

it builds fine for me:

Build started at 7:26 PM...
1>------ Build started: Project: ConsoleApp1, Configuration: Debug Any CPU ------
1>D:\source\ConsoleApp1\ConsoleApp1\Program.cs(5,12,5,17): warning CS8618: Non-nullable field 'my_vertex' must contain a non-null value when exiting constructor. Consider adding the 'required' modifier or declaring the field as nullable.
1>ConsoleApp1 -> D:\source\ConsoleApp1\ConsoleApp1\bin\Debug\net8.0\ConsoleApp1.dll

What error are you getting?

Do you know Russian?

CyrusNajmabadi commented 2 weeks ago

At the time of instantiation, it's possible to determine the function that should be called in the specific class.

How? The code that was compiled says "Call Vertex.New_Copy". It was already determined. That cannot change (it would literally break all programs depending on this). Vertex_Pro.New_Copy is an entirely unrelated method. It has no relation with Vertex.New_Copy. None whatsoever.

CyrusNajmabadi commented 2 weeks ago

Do you know Russian?

No. But the code compiles just fine. You can see that here as well: https://sharplab.io/#v2:CYLg1APgAgzABFATHA4gJwIYAcAWAeAFQDUBTNAFxIA8A+AWACg5m4B3HMkuYsyquEHACSpCtUKi+9BgG9GLBPB5j+AWwCeAfQBuvagG55LWKky4AFMr5xdKgJRHmcpgpbaMaOAGMA9lnVwALzcktQAdADCfurmtnx2hi7MAL6MqQyMJgCWAHaUaABmGF5cInpUEuU0bBxoXFbUAsKhFQ20jM4KJlAAjABscBgARgDO5Jhe5CHlcFH+li025QlpjGsM3ciLgmUqeC3SncbwvQOLczGLcdQOSXBHrggA7HA5JKxwLeYrd+npmSctuVNAAFNA+JotAA0zXK+2BYJ8h0cigQ/U+CPBs2i5haoKx1yotwUD1cUBebw+eMR30SCj+jCAA

CyrusNajmabadi commented 2 weeks ago

You've implemented everything in a complex way.

We've implemented it to allow the declaration site to state explicitly that it is not calling some specific function, but a function dependent on the instantiation of the generic. The declaration site now has this information baked in, and it works without the callsite compilation having to do anything. It also gives you exactly what you want, where the runtime can "determine the function that should be called in the specific class".

And, unlike with your code, the solution we have does have a relation here. Because the 'Copy' methods are implementations of the static-abstract method in the interface. So the methods state explicitly that they are the impls to call when doing static virtual dispatch through that method.

CyrusNajmabadi commented 2 weeks ago

@AlexeyZBest if you're running into a compiler error, please link to a sharplab.io repro that shows the issue. I'll have an idea about what you're running into at that point. Thanks! :)

AlexeyZBest commented 2 weeks ago

How? The code that was compiled says "Call Vertex.New_Copy". It was already determined. That cannot change (it would literally break all programs depending on this). Vertex_Pro.New_Copy is an entirely unrelated method. It has no relation with Vertex.New_Copy. None whatsoever.

C# needs to be refined to work correctly. Are you developing C# here?

HaloFour commented 2 weeks ago

C# is working correctly. It was designed to avoid the pitfalls that come with treating templates like macro expansions, as they are in C++. Behavior like what you described is intended to be handled in a much more explicit way, through virtual dispatch.

CyrusNajmabadi commented 2 weeks ago

C# needs to be refined to work correctly.

It has been. The language feature we added exactly solves this use case. And it does so in a way that does NOT break existing code. And which allows the callsite to exactly encode the desired behavior here, and which works properly within the existing compilation model of .net (which does not embed source of the declaration into the compiled result for use at the callsite).

We are not C++. First, we don't recompile callsites against the original source at the decl site (that wouldn't even work given that we're a cross language runtime). Second, the runtime doesn't 'guess' at alternative methods to call that have no relation to the original methods. Instead, dispatch happens to exactly what the calling site asked for. If another method is desired to be resolved at runtime, then the solution to that is virtual dispatch. That's why the solution here was designed as it was. It fits both the goals and pillars of both the language and runtime. And it solves your use case and many many many others. It does so uniformly and consistently.

Are you developing C# here?

Yes. As i said, i'm one of the language designers on the team :)

AlexeyZBest commented 2 weeks ago

C# is working correctly. It was designed to avoid the pitfalls that come with treating templates like macro expansions, as they are in C++. Behavior like what you described is intended to be handled in a much more explicit way, through virtual dispatch.

You do realize that the function New_Copy can be definitely determined at the time of instantiation, right?

CyrusNajmabadi commented 2 weeks ago

You do realize that the function New_Copy can be definitely determined at the time of instantiation, right?

Yes. And it is deterministically determined to be Vertex.New_Copy. Because that's precisely what the original code asked to be run. And there is no virtual dispatch of any sort going on.

There is no relation at all between Vertex.New_Copy and Vertex_Pro.New_Copy. They are completely unrelated methods. Since there is no virtual dispatch, and these are unrelated methods, the CLR properly calls Vertex.New_Copy as the code specifies.

Changing this would also be a breaking change, as has been mentioned several times.

If you want virtual dispatch to some method that is specific to the type the generic was instantiated with, that is already supported. We're not going to make a new way to do what is already possible, especially as that new way will break the meaning of all generic code today.

AlexeyZBest commented 2 weeks ago

@AlexeyZBest if you're running into a compiler error, please link to a sharplab.io repro that shows the issue. I'll have an idea about what you're running into at that point. Thanks! :)

[Screenshot: 2024-05-13 (4)]

Please :)

CyrusNajmabadi commented 2 weeks ago

So that seems to be correctly telling you that you need to upgrade your language version. You're on version 9, and that feature wasn't available with that version.

CyrusNajmabadi commented 2 weeks ago

Note: the above code is also good because it clearly states at the declaration point that the return type of Copy depends on the containing type. So if you call through a derived type, you get a derived type back from the Copy method in a completely typesafe fashion.

If this was just looking up by name, the new method might return some other type entirely, which could break all successive code in the method. For example:

    public void New_Copy(out int vertex_pro)
    {
        vertex_pro = 0;
    }

Uh oh! Now instead of having a Vertex at the call site, you have an int, which will certainly break most code that follows that was trying to use that value.
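
To make the type-safety point concrete, here is a small sketch built on the Copy example from earlier in the thread. The assignment to my_vertex and the top-level usage are additions for illustration, and it assumes C# 11 or later on .NET 7 or later, which static abstract interface members require.

```csharp
using System;

var graph = new Graph<Vertex_Pro>(new Vertex_Pro());

// No cast needed: the field is statically typed as Vertex_Pro.
Vertex_Pro copy = graph.my_vertex;
Console.WriteLine(copy.GetType().Name);

public interface IVertex<TVertex> where TVertex : IVertex<TVertex>
{
    // The return type is tied to the implementing type at the declaration.
    static abstract TVertex Copy(TVertex vertex);
}

public class Vertex : IVertex<Vertex>
{
    public static Vertex Copy(Vertex vertex) => new Vertex();
}

public class Vertex_Pro : Vertex, IVertex<Vertex_Pro>
{
    public static Vertex_Pro Copy(Vertex_Pro vertex) => new Vertex_Pro();
}

public class Graph<TVertex>
    where TVertex : IVertex<TVertex>
{
    public TVertex my_vertex;

    public Graph(TVertex vertex)
    {
        // Resolved per instantiation to the Copy implementation of TVertex.
        my_vertex = TVertex.Copy(vertex);
    }
}
```

A Copy method that returned int, as in the example above, could not implement IVertex<Vertex_Pro>, so the mismatch would be rejected when the vertex type is compiled rather than surfacing inside Graph.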

AlexeyZBest commented 2 weeks ago

So that seems to be correctly telling you that you need to upgrade your language version. You're on version 9, and that feature wasn't available with that version.

Because you've applied a poor solution. The beautiful solution is when everything happens behind the scenes and without any interfaces.

CyrusNajmabadi commented 2 weeks ago

Because you've applied a poor solution.

What does that have to do with an error telling you you're on an older language version? Any change we make to support whatever you want will have to come with a new language version. Which means you'll have to upgrade to use it. If you don't, you'll get a message saying you're on too old a version.

The beautiful solution is when everything happens behind the scenes

We disagree. Silently changing what you said you called, and randomly calling other methods (how would you even determine which to call?) is not a good thing to us.

We designed generics to not have the drawbacks of templates here. That was intentional. You may not like that. But we're not going to upend everything and roll back to that approach, and lose all the benefits we think generics bring. I'm sorry if that isn't satisfactory to you. If you want, you can fork c# and change it however you want. But these decisions were intentional and reasoned through, based on many criteria around language and runtime design that we think are very important.

We believe it's important for generics to statically define what they call at the declaration site, not be encoded in any sort of source-fashion, and only dispatch to different codepaths through explicitly set up virtual connections. You disagree. We seem to be at an impasse.

As i've shown above (like with https://github.com/dotnet/csharplang/issues/8115#issuecomment-2106709408) templates are not a good solution in our opinion. You may disagree based on your own criteria. Ultimately though, without enormously good reason (which you've given none of), we're not going to change this. The things you like are things we do not. And the things you do not like are the things we do. if you are not ok with this, then fork things and take the language where you want it to go.