Closed procodix closed 4 years ago
B.Create() creates an "A" which is completely wrong from an inheritance perspective and the opposite of what a non-static method would do.
How would you envision this working? When B.Create is called, it is statically dispatching into A.Create. So from the perspective of A there is no way to know what value should actually be instantiated.
Why not? The code clearly says B.Create()
So I would expect the compiler to see B as current static type, calling its method B.Create(), which it doesn't implement on its own but inherited from A. Meaning that every occurrence of "self" in the implementation gets assumed as a B type.
BTW, self as return type implies covariance :-)
Why not? The code clearly says B.Create()
Because that's what's in the IL. There is no 'Create' method in 'B', and A.Create has no clue that B even exists (it might be in a different assembly altogether). As A.Create is a static method, it is passed no information about how it was called.
So I would expect the compiler to see B as current static type
There's no concept of 'current static type'. If you look at the IL actually generated here for B.Create, you'll see that B is never even referenced.
calling its method B.Create()
To clarify that: "its" implies that B has a Create method. It does not. A has the Create method, and the language just lets you find it by saying "B.". But what it finds is really A.Create, which you'll see is what it emits an invocation of.
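For readers following along, here is a minimal sketch of the binding described above. The class name StaticDispatchDemo and its method are hypothetical names chosen for illustration; A and B mirror the classes discussed in the thread.

```csharp
using System;

public class A
{
    // Declared on A only; B does not get its own copy of this method.
    public static A Create() => new A();
}

public class B : A { }

// Hypothetical helper, only here to make the behaviour observable.
public static class StaticDispatchDemo
{
    public static string CreatedTypeName()
    {
        // Written as B.Create(), but the compiler binds the call to A.Create().
        A result = B.Create();
        return result.GetType().Name;
    }
}
```

Calling StaticDispatchDemo.CreatedTypeName() reports the runtime type of the object that B.Create() actually produced, which is A.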
Meaning, that every occurrence of "self" in the implementation gets assumed as a B type.
How do you envision this working?
So in the compiled assembly we only get an A.Create() method. Fine.
My simplest (and naive) approach would be the following procedure during compilation:
If I had wanted A.Create() I would have written A.Create(), but I wrote B.Create() and that is in the source - thus should be interpreted as outlined. B.Create() calling some A.Create() is meaningless.
To clarify that: "its" implies that B has a Create method. It does not. A has the Create method, and the language just lets you find it by saying "B.". But what it finds is really A.Create, which you'll see is what it emits an invocation of.
Just 2 cents to this "finding" routine. It's useless, because it tries to solve a problem that is none. If I need A.Create() I write A.Create().
As B does not have a Create() method, copy and paste it from A to B. Replace "self" placeholder with B and emit the thing as B.Create() IL.
When compiling B (or someone calling B.Create), the compiler may not have any access to the IL of 'A'. For example, "ref assemblies" are a thing that exists, where the body of A.Create is not included in A.dll. Instead, only the signatures are in there, and the compiler only uses the ref assembly to make sure that what B.dll is calling exists.
Just 2 cents to this "finding" routine. It's useless, because it tries to solve a problem that is none. If I need A.Create() I write A.Create().
We cannot relitigate the past. This has been the specified behavior for C# since 1.0.
If I had wanted A.Create() I would have written A.Create(), but I wrote B.Create() and that is in the source - thus should be interpreted as outlined.
Changing the interpretation of existing code can lead to breaking changes. Something that will not generally happen.
B.Create() calling some A.Create() is meaningless.
It's had meaning for nearly 20 years now :)
The feature request allows for clear, intuitive static method inheritance, something C# is missing. I think I elaborated on your question of how I would envision this to work. There is no point striking out the current mechanics of IL, because they are, at least here, well, insufficient. So let's work on my request's metadata point by point.
cannot relitigate the past.
That's what a feature request is good for. C# is a product of the past, it shouldn't be a prisoner of it.
This has been the specified behavior for C# since 1.0. [...] It's had meaning for nearly 20 years now
If the founding fathers had been perfect we wouldn't have version 8.0 now ;-) But I think you agree with my last point. The current behaviour that reroutes B.Create() to A.Create() obviously reveals "some" functionality, which however is completely redundant and therefore unnecessary. One could say more politely: it has little use ;-)
can lead to breaking changes.
That's completely preventable. A new keyword "self" completely separates the classic functionality from the modern one. Therefore no breaking change occurs whatsoever. If a method is implemented without "self" inside, everything stays as it is now - for the old folks. However, using "self" lets the suggested logic kick in.
Something that will not generally happen.
Fortunately nothing I requested ;-)
As A.Create is a static method, it is passed in no information about how it was called.
That's why I suggested that the compiler copies the method over to B and then calls B.Create(). Granted, that workflow is not the most efficient one, but it would get the job done. Rocket scientists will come up with a better idea when covariance gets implemented.
Haven't seen a reason why this shouldn't work, hm?
That's what a feature request is good for.
Feature requests that don't change the meaning of existing code: great. Feature requests that change the meaning of existing code: nearly insurmountable.
Just letting you know this :)
C# is a product of the past, it shouldn't be a prisoner of it.
You're welcome to want that. I'm just explaining that it will make the chance of your proposal happening orders of magnitude less likely. To give you an idea, C# has only ever had about 1 real breaking change in behavior at this level.
There is no point striking out the current mechanics of IL, because they are, at least here, well, insufficient.
Right. But that's why I'm asking you to explain how it would work. Because any proposal suggesting this sort of change will need to explain how it can actually work in the existing ecosystem.
So, for example, a proposal that says "the compiler needs to be able to access the IL to copy it over to the new dll" won't really work, since in the existing ecosystem there exist tons of DLLs that do not ship with IL to copy :)
Delphi's Object Pascal language had something like this - it didn't have static methods, it had class methods that could be called without an instance of the class. They could be virtual (and thus be overridden by subclasses).
You could even pass them around - I have vague memories of methods accepting the TComponent class that used the class reference to create instances (factory pattern).
If something like this feature were to progress, it would have to leave all the existing static declarations untouched; breaking tens of thousands of existing projects just won't fly.
But a new method type - perhaps (re)using the Delphi keyword class - might just work.
The crux will be the cost/benefit ratio - remembering that every language feature starts, not at zero, but at -100 points.
Could this work as generics do at the IL level, with a constrained type parameter passed in?
That would make this compile to something like what static T Create<T>(...) where T : typeof(caller), A would be,
so B.Create() is a call to A.Create<B>(...).
The use of the 'self' keyword would limit compile-time access via source code to this function; not sure how the interop would end up working...
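As a rough illustration of the lowering suggested above, written by hand in today's C#: the caller passes the type it wants as a constrained type argument. This is only a sketch; the proposed compiler rewrite of B.Create() into A.Create<B>() does not exist, and the new() constraint is an assumption needed to make new T() legal.

```csharp
public class A
{
    // The type to instantiate arrives as a type argument instead of being
    // inferred from how the call was written.
    public static T Create<T>() where T : A, new() => new T();
}

public class B : A { }
```

With this shape, A.Create<B>() returns a B and A.Create<A>() returns an A. A subclass without a public parameterless constructor is rejected at the call site by the new() constraint.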
@ufcpp can you comment about why #252 is related to this, some example of code maybe?
Could this work as generics do at the IL level, with a constrained type parameter passed in?
1) This whole thing can be circumvented if I used instance members. They inherit properly. But sometimes you don't want to instantiate an object, because you just need some information on the class - not the object. Chainable factory methods as in the example are a second reason.
2) Generics shouldn't be used for this for at least two reasons:
TL;DR: yes, but they make it way too complicated. The feature request is plain and simple.
Thanks, but let me repeat clearly: There is no breaking change and there is no language change requested here. We are talking about a new keyword with some compiler magic taking place. C# had plenty of changes like this ("await").
No need to wake a sleeping giant. We are NOT talking about redefining the + operator to multiply from now on.
You're welcome to want that. I'm just explaining that it will make the chance of your proposal happening orders of magnitude less likely. To give you an idea, C# has only ever had about 1 real breaking change in behavior at this level.
We are talking about a new keyword with some compiler magic taking place.
How would the compiler implement it?
The compiler has no knowledge at all of what A.Create() does. So it cannot decide what B.Create() does when applying any magic.
a static method should inherit from its single defined parent.
No, static methods are never inherited. They just have accessibility associated with the class, like nested types.
If you want static methods to have some dynamic/polymorphic semantics, that falls under type classes (#110).
Thanks, but let me repeat clearly: There is no breaking change
Then how does your feature work? If I have a static method A.Create(), how can it end up dynamically doing things differently depending on who called it? Literally by what mechanism would that work?
and there is no language change requested here.
If there's no language change requested... then what are you asking for? :)
We are talking about a new keyword with some compiler magic taking place.
That is the very definition of a language change :)
C# had plenty of changes like this ("await").
'await' does not change how the method executes depending on how the caller calls it. Furthermore, the callee doesn't affect the caller either with async/await.
Here's a good way to tell:
When you have an async Task method, it gets compiled into the dll as a method that just returns Task. There is nothing in the dll that indicates that either async or await was used. If someone calls this, they'll have no idea how it was implemented. The callee can change their impl at any point.
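To make that point concrete, here is a small sketch with two hypothetical methods (the Callee class and both method names are invented for illustration). One uses async/await and one does not, yet a caller sees the same return type and consumes both the same way.

```csharp
using System.Threading.Tasks;

// Hypothetical callee; the names are illustrative only.
public static class Callee
{
    // Implemented with async/await.
    public static async Task<int> WithAsync()
    {
        await Task.Yield();
        return 42;
    }

    // Implemented without async/await; callers see the same signature shape.
    public static Task<int> WithoutAsync() => Task.FromResult(42);
}
```

A caller can await either method, or swap one for the other, without changing its own code. That is the sense in which the implementation choice stays on the callee's side.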
--
That's not how your feature here works. You're requiring that the caller know how the callee was implemented in order to figure out what to do. Namely, the caller (B.Create()) would have to know about A.Create() so it could somehow "copy the IL" and make a suitable version of it that then instantiated a new B instead of a new A.
@procodix before you proceed, let's start simple:
Say you have this code:
public class A {
    public static self Create() {
        return new self();
    }
}
How would you propose this actually be encoded in IL? Feel free to actually just synthesize what you'd want the C# compiler to emit for something like this. If you don't want to write IL, just write the equivalent C# code this would transform into.
Note: I don't even care about the caller of B.Create right now (though you can add that information if you want). I literally only care how A.Create will actually be compiled.
I still don't understand the premise. Given self, what would you actually be able to do with it? The only thing you know at compile-time is that it's derived from A. You can't do new self() because it might not have a parameterless constructor (meaning that your original example is already impossible, regardless of any other considerations). You can't call any static methods which aren't already on A.
Maybe I've missed something, but I can't find any other examples of usage.
Could you give some examples of what this would be used for which actually make sense at a language level, regardless of backwards compatibility and implementation?
That is the very definition of a language change :)
Call it what you want, but this has happened more than once to the language. It's not Halley's Comet. You can't have it both ways ;-)
So, for example, a proposal that says "the compiler needs to be able to access the IL to copy it over to the new dll" won't really work, since in the existing ecosystem there exist tons of DLLs that do not ship with IL to copy :)
Unless someone builds a time machine, the self keyword and the connected logic are not present in any DLL. And they will never have to be. It's a compiler feature that works during compilation, not during execution. So foreign DLLs will behave completely deterministically.
Namely the caller (B.Create()) would have to know about A.Create()
During compilation the caller is known, because it's written down there. So what?
During compilation the caller is known, because it's written down there. So what?
The caller is known. The callee is not. All you know is that there is a static, parameterless Create method in A.
--
But, again, can you just answer: https://github.com/dotnet/csharplang/issues/2841#issuecomment-536901880
What would you actually compile this into?
Call it what you want, but this has happened more than once to the language
When has it happened before? You mentioned async/await, but they definitely did not do anything akin to what you're describing here. Like I explained in https://github.com/dotnet/csharplang/issues/2841#issuecomment-536901266, all that stuff is completely transparent to the caller.
public class A {
    public static self Create() {
        return new self();
    }
}
becomes
public class A {
    public static A Create() {
        return new A();
    }
}
so nothing changes.
However, when class B : A {...} gets declared, the following method gets injected:
public class B : A {
    public static B Create() { // thus covariance during inheritance is important!
        return new B();
    }
}
It works the same way as with instance methods.
The caller is known. The callee is not. All you know is that there is a static, parameter-less Create method in A.
Sorry, I meant the callee. The parser reads "B".Create(), so "B" is the callee. I don't understand your point. Somewhere your "find" procedure has to start searching for a method to call. Where does it start? When it reads "B.Create()", it knows that the method should be called on B. As it does not exist there, the search goes up the ancestors and reveals that parent A has a .Create() method, to which the call is rerouted. Skip the rerouting. Call .Create() on B.
public class A { public static self Create() { return new self(); } }
What if you then have:
class B : A
{
    public B(string something) { }
}
How would B.Create() work then?
Could you give some examples of what this would be used for which actually make sense at a language level, regardless of backwards compatibility and implementation?
There are plenty of use cases. Factory methods for example. A needs some methods to create an instance of itself. B, C and D inherit from A. I don't want to copy the factory methods over. By inheriting the static members, I can simply reuse them. That's the point of OOP.
Another usage is storing information about a class, not an object, inside the class. You could use attributes, but they are read-only. Take for example an ORM class. It has a method that returns its database table name:
namespace Test {
    class A {
        public static string Table() {
            return self.FullName.Replace('.', '_'); // returns "Test_A"
        }
        public void Save() {
            Database.SaveToTable(self.Table());
        }
    }
    class B : A {
    }
}
B.Table(); // returns "Test_B"
new B().Save(); // writes to the correct table
In your last example, where is FullName declared?
As I said earlier, factory methods won't work without some guarantee that all subclasses of A will have parameterless constructors. Given that they need to have parameterless constructors (and ensuring that has its own can of worms), it's not clear what your factory will do.
@canton7: How would B.Create() work then?
B.Create() calls A.Create()'s implementation, thus executing:
public static self Create() {
    return new self();
}
This calls the C# default constructor and returns an empty B().
It should be allowed to override B.Create() as other languages do it by coding:
class B : A {
    public static self Create() { // self is assumed to be the current type = B
        self instance = super.Create(); // calls the inherited but invisible B.Create(), which calls A.Create() and returns a B
        instance.SetSomeProperty(true);
        return instance; // returns a B, or a child if inherited later
    }
}
This calls the C# default constructor and returns an empty B().
Please read the bit I wrote before I said "How would B.Create() work then?". There, I declared B as having a non-parameterless constructor.
In your last example, where is FullName declared?
This comes from self being a Type instance.
In your last example, where is FullName declared?
This comes from self being a Type instance.
If self is a Type instance, you can't do new self().
There, I declared B as having a non-parameterless constructor.
B inherits from A, so A would require a non-parameterless constructor as well. Then A.Create() would have been required to call this non-parameterless constructor:
public class A {
    public A(string aParameter) { ... }
    public static A Create() {
        return new A("a string parameter");
    }
}
B inherits from A, so A would require a non-parameterless constructor as well.
That isn't how C# works. This is valid C#:
class A { }
class B : A
{
    public B(string s) { }
}
This is also valid C#:
class A
{
    public A(string s) { }
}
class B : A
{
    public B() : base("Foo") { }
}
And so is this:
class A { }
class B : A
{
    private B() { }
}
A derived class's constructor can have more parameters than its parent's constructor, or fewer parameters, or it can stop things constructing it at all (by making its constructor private).
If self is a Type instance, you can't do new self().
Yes, this is syntactic sugar. Most times self would be best represented as a Type in return or parameter statements, as well as when reading out the Type's name as demonstrated.
Placed after new it should obviously lead to IL code creating an instance of that type, similar to what Activator.CreateInstance(self.Fullname) would do.
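A short sketch of what reflection-based creation does today may help here. It uses the Activator.CreateInstance(Type) overload; the ActivatorDemo helper is a hypothetical name, and B is given a string-only constructor to match the case raised earlier in the thread.

```csharp
using System;

public class A { }

public class B : A
{
    // No parameterless constructor, as in the example raised earlier.
    public B(string s) { }
}

// Hypothetical helper that reports whether reflection-based creation works.
public static class ActivatorDemo
{
    public static bool CanCreate(Type type)
    {
        try
        {
            Activator.CreateInstance(type);
            return true;
        }
        catch (MissingMethodException)
        {
            // Thrown at run time when no public parameterless constructor exists.
            return false;
        }
    }
}
```

ActivatorDemo.CanCreate(typeof(A)) succeeds, while ActivatorDemo.CanCreate(typeof(B)) fails, and only when the code runs. Under this approach the missing constructor surfaces as a run-time failure, not a compile-time one.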
@canton7 finding the right constructor already follows some logic today. Classes have empty constructors or, by overloading, parameterized ones. The compiler can resolve this at compile time. When I call B.Create() and that leads to A.Create(), which contains instructions that cannot (or can no longer) be executed on B, this means a syntax error.
Here is an example of how this is communicated today:
class Uri2 : Uri {
    public Uri2() {}
}
leads to the error: 'Uri' does not contain a constructor that takes 0 arguments (CS1729)
@canton my Uri example would get resolved by adding:
class Uri2 : Uri {
    public Uri2() : base("http://localhost") {}
}
to satisfy the requirements. That's what A & B programmers would have to do as well.
Let's say you have:
class A
{
    public static self Create() => new self();
}
class B : A
{
    public B(string s) { }
}
So your proposal is, if you call B.Create(), the compiler would peer inside the implementation of A.Create() (which might not be available at compile-time), determine that it tries to (effectively) call new B(), and would raise an error on B.Create()?
Just wanted to emphasize, this is not theory. It works very well in PHP and is very powerful in the use cases demonstrated above. There is static::, which always represents the current class's name as a string. For example, this is pretty neat and reusable code for a trait - something like the new interfaces with default implementations which get injected into classes:
trait InstanceTrait {
    public static function Instance() {
        return new static; // creates instance of whatever class this is called from
    }
    public function ToString() {
        return 'Instance of : ' . static::class; // even works in instances to print the current class name
    }
}
PHP can do many things that C# cannot. C# can do many things that PHP cannot.
PHP is late-bound, C# is early-bound. C# gives many more compile-time guarantees than PHP does, but the cost is less flexibility at compile-time.
You're trying to take something which works in PHP because PHP is late-bound, and applying them to C#, which needs to give many more compile-time guarantees. There are problems there, and we're asking how your proposal addresses them.
Also, it really does sound like generics already meet your use-case:
class Factory
{
    public static T Create<T>() where T : A, new() => new T();
}
Factory.Create<A>();
Factory.Create<B>();
If you really want your nice syntax, then:
class A
{
    public static A Create() => Factory.Create<A>();
}
class B : A
{
    public static new B Create() => Factory.Create<B>();
}
You can also do typeof(T) to get the Type instance.
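For instance, the table-name use case from earlier in the thread could be approximated with typeof(T) along these lines. This is only a sketch: the TableNames class and its For method are hypothetical names, and the naming rule (dots replaced by underscores) is taken from the comments in the earlier example.

```csharp
namespace Test
{
    public class A { }

    public class B : A { }

    // Hypothetical helper: derives a table name from the type argument.
    public static class TableNames
    {
        public static string For<T>() where T : A
        {
            // FullName includes the namespace, for example "Test.B".
            string fullName = typeof(T).FullName ?? typeof(T).Name;
            return fullName.Replace('.', '_');
        }
    }
}
```

Test.TableNames.For<Test.B>() yields "Test_B" and Test.TableNames.For<Test.A>() yields "Test_A", with the type chosen explicitly by the caller.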
Given that the language already appears to support what you want, with only a couple of additional lines of boilerplate, I think you're going to have a really hard time convincing the LDM to support this particular proposal.
@canton7
So your proposal is, if you call B.Create(), the compiler would peer inside the implementation of A.Create() (which might not be available at compile-time), determine that it tries to (effectively) call new B(), and would raise an error on B.Create()?
Exactly. Since this is clearly a compile time error.
BTW, you are absolutely right about the different worlds PHP and C# live in. Mentioning it wasn't meant to ignore the conceptual differences; it's just a proof of concept that there is a viable path to getting to an executable method in all cases when static:: is used. The implementation detail in PHP was named "Late static binding", but in PHP everything is late, because it is interpreted; most code is even included at runtime. They may call it "Late static binding" because they carry the latest class name on which a static function was called around throughout the stack. But that leads to confusion when a static method jumps into a dynamic one of another class. Which class does static:: represent then? So the naming is more an explanation for the bad implementation.
C# however due to strong typing could realize it at compile time as pointed out earlier. If a method is missing, simply throw a compile error.
Swift does it as well with "self" and "Self", the latter being the equivalent of static::. See here: https://kirilltitov.com/en/blog/2017/capitalized-self-in-swift
However, all questions asked so far put the spotlight on the central instruction: what happens at B.Create()? And that was my initial point: it's written in the source code, so it should be called.
If it does not exist => error.
If it exists in parent class => copy it over, do the type mangling and call it.
So your proposal is, if you call B.Create(), the compiler would peer inside the implementation of A.Create() (which might not be available at compile-time), determine that it tries to (effectively) call new B(), and would raise an error on B.Create()?
Exactly. Since this is clearly a compile time error.
The problem here is that this is simply not possible. As @CyrusNajmabadi said earlier, A might be in a reference assembly, which means that its implementation simply isn't available to the compiler.
If it exists in parent class => copy it over, do the type mangling and call it.
Again, if A is in a reference assembly, there's nowhere to copy it from!
Also, simply copying A.Create's implementation into B would mean that if B doesn't have a parameterless constructor (or a constructor which matches A's constructor), you would get a compiler error at the point that you try to compile B. This is different to the behaviour you just said you wanted, which is that the compile-time error happens at the point that you call B.Create().
It's really hard to take a proposal seriously when the proposer keeps contradicting themselves on what they're proposing.
So the only problem arises, when the code is in two different assemblies?
There are plenty of other problems, but let's take one at a time.
Ok, I am not too familiar with the IL specifics. But we could try another approach:
What about the call to B.Create() would instead
That would prevent the type mangling at compile time / at IL level. It would require the compiler to emit additional Opcodes after the return jump and stack pop to cast the returned "self" object to the provided classname "B".
Something like this:
public class B : A {
    // automatically generated method:
    public static B Create() {
        return (B) A.Create();
    }
}
You cannot take an instance of A and then cast it to B. That's not possible.
For an intuitive idea why: B might have fields / virtual methods which A does not have. When you do new A(), you allocate enough space for A's fields, but not B's. If you were then able to cast that instance of A to B, there would be nowhere for B's fields to be stored.
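A small sketch of that failure, assuming a B with one extra field (the Extra field and the CastDemo helper are hypothetical names used only for illustration):

```csharp
using System;

public class A { }

public class B : A
{
    // A field that a plain A instance has no storage for.
    public int Extra;
}

// Hypothetical helper that reports whether the downcast succeeds.
public static class CastDemo
{
    public static bool CastSucceeds(A instance)
    {
        try
        {
            _ = (B)instance;
            return true;
        }
        catch (InvalidCastException)
        {
            // The runtime checks the object's actual type and refuses the cast.
            return false;
        }
    }
}
```

CastDemo.CastSucceeds(new A()) fails, while CastDemo.CastSucceeds(new B()) succeeds, because only the second object was actually allocated as a B.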
Couldn't find an open issue regarding this, but the feature is too important to not be reopened at least every month until implementation ;-) I am sure this has been discussed previously.
I know that when you derive a class containing static methods, calls to the child's static methods are rerouted to the parent class' implementation. But this behaviour as standard is just so wrong in many ways. It should at most be optional.
Factory methods are not inheritable:
B.Create() creates an "A" which is completely wrong from an inheritance perspective and the opposite of what a non-static method would do.
Instead there should be a "this" or "self" keyword referencing the static type at call time - not the declaring class.
The worst possible naive implementation in the compiler would be to copy the method over to all derived classes, replacing A with B, but there is a smarter way for sure.
Trivia: The PHP folks needed several years to understand the concept of Late static binding. Until then, only self:: existed, which really worked like an alias for __CLASS__, the declaring class. Only after lots of pressure did they come up with static::, which resolves to the currently called class and behaves correctly as described above.