ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.81k stars 2.47k forks source link

vtable abstraction of some kind. traits? oop? polymorphism? interfaces? #130

Closed babariviere closed 3 years ago

babariviere commented 8 years ago

That will be good if we have traits like in rust.

andrewrk commented 8 years ago

Indeed, perhaps we need an abstraction that provides a vtable of some kind. Traits seem pretty reasonable. I still need to think it all over before deciding on a particular abstraction.

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

kyle-github commented 7 years ago

Structural polymorphism? I.e. if I have:

` const A = struct { foo: u16; }

const B = struct { bar: A; baz: f32; } `

Can you pass a pointer to B where a pointer to A was needed?

kyle-github commented 7 years ago

It looks like if you allow fields to be reordered by the compiler, then this would not be possible. That will prevent a certain kind of polymorphism from working. This is fine as long as it is explicitly stated somewhere. See the points in issue #307.

andrewrk commented 7 years ago

Can you pass a pointer to B where a pointer to A was needed?

do_something_with_a(&b.bar)
kyle-github commented 7 years ago

OK, so that would not work if for some reason the compiler could reorder fields (it might work, but it would not be guaranteed).

Even though types A and B are different, you do not have to cast the B pointer?

C lets you cast and guarantees that the order of the fields in the struct definition is the order in memory. So, in C, this always works.

I tried to cover this use case in issue #488.

noonien commented 6 years ago

Please consider taking inspiration from Golang's interfaces. They greatly help with making code easy to write and understand.

For example, all types that can read/write buffers of data, and expose the functionality with methods with the correct name and signature are considered to implement io.Reader/io.Writer by default.

This makes working with, and abstracting streams(and other things) really easy.

phase commented 6 years ago

Go's interfaces aren't the best way to do things. Structural typing seems to go against the Zen of Zig.

alexnask commented 6 years ago

In my opinion, zig's metaprogramming features are strong enough that this can reasonably be written as a library feature.

I'm just going ahead and drop a link to a proof of concept implementation (written before pointer reform): https://gist.github.com/alexnask/1d39fbc01b42ce2b5b628828b6d1fb46

This implementation pretty much works like Rust traits, the interface struct either owns or points to the implementation object and the vtable pointer and the vtables are generated at compile time, one time for each new implementation type.

EDIT: The code has been updated to master zig.

binary132 commented 6 years ago

Quick thought: having a really tiny feature set is actually a very strong language feature. Maybe this kind of abstraction should be in a library that can be imported to get extra abstraction macros? Then people can still use the core language without having to understand many features. (This echoes @alexnask's thought above.)

@phase Zig already has duck typing in comptime, doesn't it?

ghost commented 6 years ago

@andrewrk If you do decide to go with type composition via Rust-style traits, one thing for implementing GUIs via composition is to look into something called "Entity-Component-System": https://raphlinus.github.io/personal/2018/05/08/ecs-ui.html

0joshuaolson1 commented 6 years ago

@RUSshy Ignoring the nonconstructive behavior, check out the discussion on comptime interfaces.

0joshuaolson1 commented 6 years ago

@RUSshy Zig isn't competing with Jai, which doesn't allow anything because it hasn't been released.

andrewrk commented 6 years ago

be_excellent_to_each_other-924x784

plz

Hejsil commented 5 years ago

Another reason we might want this is that the current "interface field" pattern is very unfriendly to the optimizer: https://godbolt.org/z/WeS3c-

andrewrk commented 5 years ago

Yeah - that's an important consideration. Although we may have to rewrite code that currently uses the "interface field" pattern to use whatever solution we come up with, I think we can rely on that in the future, abstracting in this way will be recognized by the optimizer. We can rely on it because that is one of the explicit goals of the solution.

Hejsil commented 5 years ago

One more thing. I did some testing on a different way of doing "interfaces" in userland. My tests show, that even a different userland approach is a lot friendlier to the optimizer: https://github.com/Hejsil/zig-interface

tgschultz commented 5 years ago

I've come to roughly the same conclusion as Hejsil. At least for my own use cases, comptime-dispatch is almost always what I want and runtime would be an edge case which can be supported easily enough with a wrapper. I had a different implementation, but compare the status quo: https://godbolt.org/z/mLSUwu with an alternative: https://godbolt.org/z/H8B86y.

As far as I can tell, there aren't actually any advantages to the status quo, especially in regards to Streams which are already comptime-dispatch by virtue of the ErrorSet requirements.

Hejsil commented 5 years ago

@tgschultz Also, the optimizer is better at optimizing struct { vtable: *const VTable, impl: *c_void } than interface fields: https://godbolt.org/resetlayout/WR4qHd

Edit: Ooh sorry, you show that in your links too :)

andrewrk commented 5 years ago

Huge thank you to @tgschultz for working around countless bugs and making the following proof-of-concepts, which demonstrate the userland alternatives to language changes for this key issue:

Being completely honest, I have not reviewed these pull requests yet. And they are not in mergeable states, due to merge conflicts. However, I am leaving this comment with links to them so that I will check them out when I go to solve this issue. That will be in a different release cycle, not 0.5.0. So I'm closing those pull requests but they are still pertinent to this issue.

cgrandfield commented 4 years ago

Started reading a bit about Zig today, seems like a really cool project.

Looks like comptime mostly provides monomorphism/generic support and that this ticket is for runtime polymorphism support where it seems the chosen direction seems to be some sort of comptime method to generate vtables.

One thing I wonder is if it might be possible to do runtime polymorphism without vtables by placing methods at fixed offsets.

Using the Rust Animal trait structure as an example https://doc.rust-lang.org/rust-by-example/trait.html if you had Horse, Cow, Sheep objects and implemented fn "name" at offset 0x000, fn "noise" at offset 0x100, fn "talk" at offset 0x200 from the beginning of the object's executable region you'd be able to invoke the name, noise, or talk methods on a Horse/Cow/Sheep without doing a vtable lookup - instead you could just call into the beginning of the respective objects executable region + the offset. A potential intermediate level of indirection between comptime monomorphism and vtable lookup.

Not totally sure such an approach could work or provide much advantage (you still need to get the start of the object's execution region), but it does seem like the sort of low level wizardry Zig is grappling with. Regardless, good luck with the language.

adontz commented 4 years ago

@andrewrk

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

Pleasant to develop GUI requires data binding and run-time composition, both are huge pain without some kind of rich reflection like in .Net and Java, or at least like in Qt/MFC. Also, Qt/MFC use a lot of ugly macros and still are much inferior to WPF/XAML. I don't think Zig is a good fit for such kind of task. Actually I'm pretty sure it's misfit by modern standards, which is not bad, just state of things. I did not see new MFC projects started for years and Qt shifted focus to QML/JavaScript.

MIME parsing is friendly to OOP task, various MIME headers can be subclasses of base header, various MIME parts can be subsclasses of base part. MIME will be required at some point of time for HTTP or email anyway. I wrote MIME parsers in C++ and C#. Maybe I could write it in Zig with some help.

Just please, don't do multiple inheritance :-)

Also, maybe aggregation can be easier?

const A = struct {
  fn doSomething(...) ... {};
}

const B = struct {
  a: A;
  fn doSomething = a.doSomething; // Note not declaring parameters or return value. Parameter forwarding code will be automatically generated.
}
Sobeston commented 4 years ago

Just please, don't do multiple inheritance :-)

Don't worry, there won't be singular either

adontz commented 4 years ago

Just please, don't do multiple inheritance :-) @Sobeston

Don't worry, there won't be singular either

Why whould you need a vtable if a method cannot be overriden?

ghost commented 4 years ago

Why whould you need a vtable if a method cannot be overriden?

vtables are not an exclusive feature of inheritance, they are just one way of doing dynamic dispatch. Just as useful for implementing traits/interfaces.

marnix commented 4 years ago

In one of @andrewrk's live streams I recently binge-watched, this issue was briefly discussed, and I got interested. Allow me to try and contribute, by adding some knowledge I picked up over the years, which might be relevant to this issue. Let me also try to untangle some concepts that are often packaged up together (like inheritance and dynamic dispatch).

(Warning, long comment ahead. I am a Zig noob, and a programming languages amateur. So grains of salt apply. But over the years I did read a fair amount of language design papers. See relevant references at the end of this comment.)

Let me first give the terms I'll be using; hopefully I'm using them correctly.

The semantics

The core idea / suggestion is the following. One can think of the methods of a virtual function, as each having a predicate or condition on the parameters (like self isa Circle). And conditions can be compared to each other, in a partial order, in the sense that self isa Circle implies self isa Shape for all self, or x >= 10 implies x > 0 for all x, or b: u8 implies b: i32, for all b; in every case the former predicate is more specific than the latter. So the suggestion is to do dynamic dispatch by (1) finding out which methods apply for a given parameters tuple; (2) choosing the one with the unique most-specific predicate; and (3) failing at compile-time in case zero or multiple methods could be chosen.

This approach is called predicate dispatch. Note that this is a symmetrical way of doing dynamic dispatch: it treats all virtual function parameters equally. And using only the first parameter with subtyping, so self isa X, in the predicate results in traditional single dispatch; and using isa on multiple parameters gives multiple dispatch. So this is a generalization of multimethods.

Here is an example, using new hypothetical syntax marking methods with if:

fn fizzbuzz(n: u8) if (true) anyerror!void { // method #1
    try stdout.print("{}", .{n});
}
fn fizzbuzz(n: u8) if (n % 3 == 0) anyerror!void { // method #2
    try stdout.print("Fizz", .{});
}
fn fizzbuzz(n: u8) if (n % 5 == 0) anyerror!void { // method #3
    try stdout.print("Buzz", .{});
}
fn fizzbuzz(n: u8) if (n % 3 == 0 && n % 5 == 0) anyerror!void { // method #4
    try stdout.print("FizzBuzz", .{});
}

Note how we have no traits or whatever, but still there is a notion of dynamic dispatch: Every call to fizzbuzz() chooses one of methods. And it chooses only between methods that match the predicate.

Method #4 takes precedence over #2 and #3 (it "overrides" them), since, e.g., for all n, n % 3 == 0 && n % 5 == 0 implies n % 3 == 0; and similarly methods #2 and #3 each override #1.

If the method #4 were left out, then there are parameter values (e.g., 30, actually all n for which n % 3 == 0 && n % 5 == 0) where is there no unique most-specific method to be chosen: both #2 and #3 apply and neither is overrides the other. This should be a compile error.

If the method #4 were to use the equivalent predicate n % 15 == 0, then the compiler probably won't be smart enough to deduce that, e.g., n % 15 == 0 implies n % 5 == 0, so also in this case it would report that there is no method covering the n % 3 == 0 && n % 5 == 0 intersection.

Finally, adding more hypothetical syntax, one could perhaps use super to call an overridden method, and (in this case) one would have to explicitly choose between multiple overridden methods.

fn fizzbuzz(n: u8) if (n % 3 == 0 && n % 5 == 0) anyerror!void {
    try super[n % 3 == 0](n); // restricts to methods #1 and #2, always selects #2, prints "Fizz"
    try super[n % 5 == 0](n); // restricts to methods #1 and #3, always selects #3, prints "Buzz"
}

The implementation

Now about how this could be implemented. One way is 'vtables', where every value knows the list of methods that apply, one pointer-to-method per virtual function. And when, say the 7th virtual function is called, we do a indirect function call through the 7th pointer in that list/vtable.

These indirect function calls are notoriously difficult to optimize, requiring an analysis at each call site to see what values are possible, and optimizing if e.g. all values at that point are guaranteed to have the same vtable / list of pointers.

This 'vtables' approach corresponds to taking the following table row-by-row.

virtual_function_1 ... virtual_function_n
value_1 method ... method
value_2 method ... method
... ... ... ...
value_n method ... method

The key insight is that one can also take this table column-by-column.

That means that for every virtual function, we generate some code that does a couple of ifs on these methods' predicates, calling each of the methods in the various branches.

As an example, the above FizzBuzz code would become 4 ordinary functions, with a dispatch function that looks something like this:

fn fizzbuzz(n: u8) anyerror!void {
    if (n % 5 == 0) {
        if (n % 3 == 0) {
            try fizzbuzz_4(n);
        } else {
            try fizzbuzz_3(n);
        }
    } else {
        if (n % 3 == 0) {
            try fizzbuzz_2(n);
        } else {
            try fizzbuzz_1(n);
        }
    }
}

The great benefit here, it seems to me, is that that generated code is a lot easier to optimize, since the methods can trivially be inlined, and then all kinds of optimization can be done. (And the required compile-time knowledge about which predicates always-imply which, will lead to additional optimization opportunities.)

So ultimately the result would be equivalent to something like this:

fn fizzbuzz(n: u8) anyerror!void {
    var b3 = (n % 3 == 0);
    var b5 = (n % 5 == 0);
    if (b3) {
        try stdout.print("Fizz", .{});
    }
    if (b5) {
        try stdout.print("Buzz", .{});
    } else if (!b3) {
        try stdout.print("{}", .{n});
    }
}

(I don't know whether this alternative dual-to-vtables approach has a name in the literature.)

Summary

In summary, I think there are two messages in this comment. First, dynamic dispatch is separate from inheritance and interfaces and stuff, so one could already have the former without yet having the latter. Second, there is an alternative implementation to vtables, which seems more optimizer friendly, and which naturally allows multiple dispatch and even predicate dispatch.

(Oh, and I have no idea about how to implement this in Zig, and how much could be done in a library, and how much would need to go into the language, and whether some parts might benefit the language because they offer other new possibilities.)

References

The direct inspiration for this comment is the work done on predicate dispatch from the following papers:

as prototyped in the Cecil language (http://projectsweb.cs.washington.edu/research/projects/cecil/).

anosovic commented 4 years ago

how does the solution @marnix summarized solve the problem?

the reason it is desirable to have a function signature's implementation consist of multiple parts (methods) which may or may not exist in separate source files / modules / classes is to allow different programmers to write different implementations for the same api. classic oop example: you want to do logic with an animal generically, and developer A wrote a dog animal and developer B wrote a cat animal. with dynamic dispatch, you can talk to the dog code and the cat code in the same language: both animals speak, sleep, eat, etc and the burden is not on the programmer but the language to run developer A's code when a dog is used and developer B's code when a cat is used.

easy to see how this generalizes to @andrewrk's GUI widgets problem

without dynamic dispatch, at some point developer A and developer B would have to work on the same code, or the user would have to manually wire up each individual function for each api they talk to, in order to accomplish this feature for users of the animal api. this problem is why this thread exists, right? with dynamic dispatch, no one has to sort it out manually. each developer just conforms to the api spec and all users are instantly able to use it.

I wonder if we can maintain zig's goals of being a predictable language and also have dynamic dispatch. maybe, but does the solution @marnix summarized maintain that property completely?

is my summary of the problem not right? let me know!

alexnask commented 4 years ago

I will just reiterate my 2018 comment on this issue which is that we can implement this in zig itself.
Here is a package that introduces interface types similar to Rust traits, with more control over the ownership and storage of the implementation object: https://github.com/alexnask/interface.zig (split off from my std.interface PR)

This should be a lot more optimizer friendly (I will compile a list of examples soon :tm: with reimplementation of std interfaces) than the current pattern used in the stdlib, e.g. for Allocator, and allows the implementation types to be completely separate from the interface type.

data-man commented 4 years ago

There is interface-experiment branch. ;) I'll hope that this experiment will be implemented.

anosovic commented 4 years ago

additionally:

this (https://compileandrun.com/struggles-with-rust/) article is cited by @andrewrk in "Why Zig and Not Rust"?

the problem is:

[. . .] You can't add trait implementations of traits you don't own to types you don't own, and there is no workaround but newtypes. – Vladimir Matveev

does https://github.com/alexnask/interface.zig support this? does interface-experiment support this?

alexnask commented 4 years ago

@anosovic
interface.zig will automatically make a vtable from the implementation type when initializing an interface object, so the type only needs to provide functions compatible with what the interface expects (to be exact, the arguments except for self need to coerce from one signature to the other as well as the return type).
You can also provide a manually written vtable if the functions don't match exactly.

In interface-experiment it seems like you provide the vtable (or at least a tuple of fn pointers) yourself when using @implement so it also detaches interfaces/traits from the implementation types.

marnix commented 4 years ago

A quick note on the status of my earlier comment. I just wanted to provide information, not a solution.

I wanted to make clear that vtables are not the only way to do dynamic dispatch, and that this alternative is more optimizer-friendly.

This doesn't solve anything related to patterns for GUI-development, unless accompanied by some independent pattern or feature that allows to express 'X implements Y' or 'P is-a Q' (or perhaps even 'ExampleReader is-a Reader.IFace').

It also (on purpose) doesn't say anything about the interaction of separate methods with separate compilation.

Finally, I have no idea what smart tricks can be done to do part of 'predicate dispatch' outside of the language. I also don't know how you all want to trade off optimizer-friendly vs library-implementable vs easy-to-use.


So if you already have a mechanism like interface.zig, that allows values of different types to be used through a common interface, if those types cooperate, then that might be Good Enough(tm) to avoid the additional language complexity of a mechanism like predicate dispatch.

But then again, perhaps adding (some feature supporting) predicate dispatch is a small price to pay and unlocks nice new patterns with great optimization opportunities, for those people who say, "all those pointer indirections are too expensive for me, so I'll avoid those nice abstractions".

andrewrk commented 3 years ago

One thing I want to do before closing this issue is create a GUI application with widgets and stuff. That's the classic use case for OOP so we can test out how well the vtable abstraction works.

Hello, myself from 4 years ago. I have not done the thing you suggested for me to do regarding creating a GUI application with widgets and stuff. However I am confident in the language's ability to handle the use case without an additional language feature.

We still have #764 open which could cause this to get revisited. The generic reader/writer interface has its pros and cons. However, at this point the plan is to not add an additional dynamic dispatch language feature.

And so I am closing this proposal, with reasonable confidence in it not being re-opened.

flavius commented 3 years ago

@andrewrk It's not quite clear, are traits/oop/polymorphism/interfaces not on the radar at all, not within scope, or not yet technically implementable, but on the radar for 1.0.0, for 2.0.0?

I think that a page summing it all up as suggested here would be useful: https://github.com/ziglang/www.ziglang.org/issues/30

Personally, I think that having these features, even if opt-in, would make zig more palatable to busines-type of applications / web development, an area where rust comes rather with resistance, primarily because of performance concerns (which these kind of applications don't have). Without making this a discussion about rust: they do have the idea of TypeId, but it's not really useful in practice, maybe zig can fill that "niche"?

vitiral commented 1 year ago

I'm just reading up on Zig and really liking the language. I was curious about interfaces and have thought about this for my own minimalist language, so I just wanted to drop in a quick comment that the simplest possible V-table is something like below (in C)

// An Interface just a collection of methods and some data.
typedef struct {
  void (*add)(void* this, int a); // pointer to add method
  ... other methods
} MInterface;

typedef struct {
  const MInterface* m, // a pointer to methods
  void* d;         // a pointer to data of an unknown type
} Interface;

typedef struct {int a;} MyData;

void MyData_add(MyData* this, int a) {
  this->a += a;
}

const MInterface mData = (MInterface) {
  // Note: the C compiler doesn't actually let you do this and requires a cast
  .add = MyData_add,
};

Interface MyData_asInterface(MyData* d) { return (Interface) {.m = &mData, .d = d }; }

// Example use
MyData myData = (MyData) { .a = 4 };
Interface myInterface = MyData_toInterface(&myData);
myInterface.m->add(myInterface.d, 42);  // You probably make a macro for this
assert(46 == myData.a);

Note: this does not allow for a single variable implementing multiple interfaces -- something couldn't implement Debug and Eq. That's what a traditional V-table accomplishes! However, interface "single inheritance" could be done. If a sub-set of the interface's methods were identical to another interface, then the pointer conversion would be trivial. For instance, you could make:

typedef struct {
  void (*close)(void* d);
} MResource;

typedef struct {
  void (*read)(void *d);
  ...
  void (*close)(void* d);
} MFile;

And an MFile could be converted to an MResource by just bumping the pointer.

Obviously this isn't how you would represent inheritance in the syntax -- likely you would include MResource inside of MFile to make the inheritance clear.

vitiral commented 1 year ago

And... it looks like that's basically how things are implemented, if I'm reading things right:

https://github.com/ziglang/zig/blob/master/lib/std/mem/Allocator.zig

CodePagesNet commented 2 months ago

Only allow v-tables that use one level of indirection at the most! v-tables should be as close to the performance as a single function pointer as possible. Please! Don't be yet another language that messes this up. https://engineering.purdue.edu/tgrogers/publication/zhang-asplos-2021/