nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.59k stars 1.47k forks source link

Segfault during compilation of an overloaded call #827

Closed mrzv closed 9 years ago

mrzv commented 10 years ago

The following code snippet makes the compiler segfault. If you can think of a simple workaround, please, let me know.

import tables

type
    TMyTable[T] = TTable[int,T]

proc `[]=`[T](m: TMyTable[T], k: int, v: T) =
    tables.`[]=`[int,T](m,k,v)

var m: TMyTable[int] = initTable[int,int]()
m[2] = 3
zah commented 10 years ago

Introducing a type alias doesn't create the need to re-define existing operators (only distinct alias would require so).

This will work fine:

type
  TMyTable[T] = TTable[int,T]

var m: TMyTable[int] = initTable[int,int]()
m[2] = 3
zah commented 10 years ago

I've fixed the segmentation fault presented here, but there is another issue with the overload selection.

proc `[]=`[T](m: var TMyTable[T], k: int, v: T) =
  tables.`[]=`(m, k, v)

This code snippet still causes the local definition of []= to be picked, causing infinite recursion at run-time.

mrzv commented 10 years ago

So what's the right way to implement this? The example is over-trivialized. The point is that I'd like to perform a transformation on the key before storing it in the table. So imagine

proc `[]=`[T](m: var TMyTable[T], k: int, v: T) =
  tables.`[]=`(m, 2*k, v)

or in my actual case k is a pair, and I want to order the pair before storing it in the table.

zah commented 10 years ago

As long as we are discussing "the right way", from your description it sounds that what you really want is to define a special hash function and equals operator for your custom pair type. These should take care of ordering the pair and you'll be able to use the regular definition of []= from tables.nim.

zah commented 10 years ago

If you really want to use a custom Table type and you don't use distinct, you'll be able to override []= only on the basis of scope distance. (it will work only in the current module and you won't be able to export it properly for other users).

If you use a distinct type or a whole new type, then you can do whatever you want, but at that point all of the existing procs defined for TTable won't be valid for your type (there is a planned support for more lightweight distinct types in this regard).

mrzv commented 10 years ago

First of all, I get the new hash idea, and I might go with it, since it sounds like the simplest solution. Thanks for suggesting it.

But, in general, it's clearly a necessary use case, where one augments an existing data structure. The implementation details of the data structure should stay hidden, but one should be able to use it nonetheless. Lightweight distinct types may be the way to go. Alternatively, I guess I don't have to derive from TTable; I could have it as a member, and provide all the necessary functions.

I think this case of overloaded generic implementation is sufficiently general (it comes up in C++ quite frequently) that it would be good to have a "standard" way of addressing it.

zah commented 10 years ago

The idiomatic C++ way is just the same as my Nimrod suggestion. You augment STL by teaching it about new key types, not by redefining the lookup algorithms themselves.

mrzv commented 10 years ago

Well, in this very simple case, I agree the right way is to provide a new hash function. But imagine I'm creating a new data structure. Let's say I have implemented a binary search tree, and now I want to implement a splay tree on top of it. The find operation for the latter would first use the find operation for the former, and then rotate the node up to the root. In this case I should probably create a new type deriving from the binary tree type, and then inside its find function maybe cast the object to a binary tree, and then call find that way. Maybe? I'm not sure the right way to do this in Nimrod.

Now imagine we start talking about C++'s CRTP, then you need even more precise way to articulate which (generic) procedure one wants to call.

By the way, in my example of a table, where I want the keys to be unordered pairs, I obviously need to define a new hash function, but I'm guessing I also need to define the new == operator (making it oblivious to the element order in the tuple). Somehow overloading []= and [] seems a little simpler.

mrzv commented 10 years ago

By the way, in this case I can't even inherit from TTable to overload []= because it's "final." Which in turn begs the question, why is it final?

Araq commented 10 years ago

Using inheritence for abstract data types simply doesn't work. Never has, never will:

http://okmij.org/ftp/Computation/Subtyping/

zah commented 10 years ago

I'll give you the generic school-of-thought in Nimrod.

Let's look at AVL trees and Splay trees. They are both Trees in the sense that you can traverse them (similar to File Directories). They are also Ordered Trees in the sense that you can search items in them. If these statements are true, then there must be a type called Tree describing the parent/child relationships and offering various traversal algorithms plus a type called BinarySearchTree offering the search algorithms. In C++, one will be lured to create a hierarchy like this:

Tree -> BinarySearchTree -> AVLTree
 |-> Directory    |-> SplayTree

But clearly this would be silly, because the BinarySearchTree doesn't need the support for multiple dynamically allocated children the same way the Directory needs them. At this point you'll either start introducing new template parameters for the base Tree type such as ChildrenContainer and so on ad infinitum or you'll give up on the idea of sharing code.

In Nimrod, we don't model such relationships with inheritance, but rather with type classes. We say that the traversal algorithms are written against the Tree type class and the search algorithms are written against the BinarySearchTree type class. Then one must provide a minimum set of definitions to make the AVLTree, the SplayTree and the Directory satisfy the requirements of the correct type classes.

While inheritance allows the author of the code to pick a handful of base types for which the "is-a" relationship will hold, with type classes each type can belong to an "infinite" number of categories and users of the type may create new "is-a" relationships not envisioned by the original author.

All of this is not so foreign to C++, as the upcoming support for Concepts attempts to provide the same mechanisms.

mrzv commented 10 years ago

It would be great to have C++ concepts, and what Nimrod calls user-defined type classes is clearly going towards that. But unless I'm mistaken, it's not implemented yet, is it? (I can't compile on the devel branch the example with generic out of the manual.)

I think I get all of your points. They are all valid, and I appreciate you articulating them. I think the only thing that I'm trying to say is that right now there is an inconsistency with addressing/specifying which procedure one wants to call. tables.[]=(...) should unambiguously specify what it is I'm trying to call. Moreover, I really don't see what the difference is with the following example that compiles and runs fine:

# a.nim
proc f*[T](x: T) =
    echo "In a"

# b.nim
proc f[T](x: T) =
    echo "In b"
    a.f(x)

f(5)

What is the difference between f[T] and []=?

zah commented 10 years ago

Well, the reason this issue is not already closed is that indeed there is an incorrect overload selection as indicated in my second comment.

mrzv commented 10 years ago

Ah, I think I missed the fact that you do view it as a bug. Ok, we are in agreement then, and I'll stop commenting.

Araq commented 9 years ago

Overload resolution for 'module.[]=' now works properly.