JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.44k stars 5.46k forks source link

handle mutually-circular type declarations #269

Open JeffBezanson opened 12 years ago

JeffBezanson commented 12 years ago

Currently this doesn't work:

type Foo
    a::Bar
end

type Bar
    b::Foo
end

A nice way to handle it is to automatically insert forward declarations for all types defined in a file after lowering. That way as long as the types are in the same file it will Just Work™.

StefanKarpinski commented 12 years ago

We can also define it as wrong to declare mutually circular types beyond the span of a single file, which solves that problem. In the repl, should we do the same thing for every input expression? So that you can do this:

julia> begin
         type Foo
           a::Bar
         end
         type Bar
           b::Foo
         end
       end

It would be pretty janky to only be able to define certain types by loading a file.

StefanKarpinski commented 10 years ago

Aka mutually recursive type declarations.

filmackay commented 10 years ago

Is there a workaround for this - seems like a fairly fundamental gap at this point?

(other than Any-typing one of the references)

StefanKarpinski commented 10 years ago

The simplest workaround is to not declare a type on the circular field. Not awesome, but it works.

filmackay commented 10 years ago

Thanks. When using the above method you need to be careful to avoid runaways:

type A
       b
end

type B
       a::A
       function B()
         self = new()
         self.a = A(self)
         self
       end
end

b = B()

Since it then tries to display the value b it gets into an infinite loop:

B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(A(B(
...

Presumably these circulars would need to be avoided somehow in the repr? Dangerous for REPL/IJulia playing.

Would have thought this 'reference to my parent' pattern would be fairly common - obviously not.

quinnj commented 10 years ago

Do we need to do some kind of cycle detection in the default show for new types? Once a cycle is detected, we could do a custom show_recursive. Having a function like this would then be handy for people implementing their own mutually recursive types.

toivoh commented 10 years ago

Cycle detection in show would definitely be nice to have. I did an attempt in #1139, but it did not reach completion. Still, I think that the discussion there is useful, even though I'm not sure whether my approach was right.

toivoh commented 10 years ago

Another trick to get mutually circular types is to make the first type parametrized, and then create a type alias for it that is parametrized on the second type afterwards. But it's not very beautiful either.

amitmurthy commented 10 years ago

Yet another attempt for show(::Any) to handle recursive types in https://github.com/JuliaLang/julia/pull/3831

cuckookernel commented 10 years ago

Guys. Isn't the original issue of mutually recursive type declarations not yet resolved? Seems kind of weird, it's been a long time... Are there plans to have this feature in a forthcoming release?

StefanKarpinski commented 10 years ago

It hasn't been a high priority. Nice to have, but not a show-stopper for anything.

dhimmel commented 10 years ago

It hasn't been a high priority. Nice to have, but not a show-stopper for anything.

I have to disagree that this is not a show-stopper. Some would say it's even a bone-breaker. —Just one user's experience which will hopefully help prioritize development.

StefanKarpinski commented 10 years ago

Out of curiosity, what can't you do without this feature? Aside from the obvious – declare mutually dependent types.

kmsquire commented 10 years ago

@dhimmels, could you give a concrete example of where lack of this feature has broken a bone (or otherwise incapacitated a prospective Julia programmer)? ;-)

(Actual examples are usually more motivating than vague superlatives. Cheers!)

dhimmel commented 10 years ago

@StefanKarpinski, my usage is the obvious – mutually recursive type definitions.

@kmsquire, I am working on algorithms for graphs with multiple defined types of nodes and edges. Currently we've implemented the graph algorithms in python and machine learning in R but are considering julia for performance improvements and consolidating our code to a single language.

In the following example, I define types to represent nodes and edges. MetaNode and MetaEdge refer to the type of node or edge and are previously defined. The edges::Dict{MetaEdge, Set{Edge}} field for the Node type allows efficient lookup of edges of a specified kind (metaedge). Since this is an operation that is performed repeatedly, I desire the performance boost that type declaration provides.

type Node
    metanode::MetaNode
    edges::Dict{MetaEdge, Set{Edge}}
end

type Edge
    source::Node
    target::Node
    metaedge::MetaEdge
end

Thanks!

JeffBezanson commented 10 years ago

A workaround that might help is to access the fields via accessor functions:

edges(n::Node) = n.edges::Dict{MetaEdge,Set{Edge}}

This lets you write the type after both declarations. This function will almost certainly be fully inlined everywhere.

dhimmel commented 10 years ago

@JeffBezanson, @StefanKarpinski, Thanks for the elegant workaround. Julia is an entirely new way to program/think for me, and it sure is exciting.

quinnj commented 10 years ago

Just a note that with #3831 merged, we now handle show for circular reference types.

crsl4 commented 10 years ago

Hi, I am having similar issue. I am new to Julia and excited to use it. I work on graphs and I would like for example:

type Node index::Int32 edges::Array{Edge,1} end

type Edge index::Int32 nodes::Array{Node,1} end

I've managed to declare both types with the suggestions you mention above without errors anymore, but I want instances of Node and Edge to be linked: node1=Node(1,[]) node2=Node(2,[]) edge1=Edge(1,[node1,node2]) node1.edges=[edge1] node2.edges=[edge1]

When I try, I get the "#= circular reference =#" message. Everything seems to work fine, so I wonder if it is ok to ignore such message, or if there is a better way to deal with this. Thanks! I apologize if my code/notation is not the best, I am just beginning to learn Julia.

tknopp commented 10 years ago

@crsl4: Have you "Any"-typed one of the arrays? The circular reference message is harmless and only appears at the REPL when you do not put ; at the end of the line.

crsl4 commented 10 years ago

@tknopp thanks for your response! Yes, I "Any"-typed one of the arrays to be able to declare the types. I'll use ";" at the end of the lines. Thanks!

bencherian commented 9 years ago

I'm assuming that if you just don't type the mutually dependent yet to be declared type/immutable, then the data stored has an extra layer or indirection. I'm trying to make implement the Quad Edge data structure in Julia. I would like something like this


type DirEdge{DT}
    next::(QuadEdge{DT}, Uint8)
    data::DT
    function DirEdge(next::(QuadEdge{DT}, Uint8)) 
    newedge = new(next, data)
    newedge.next = next
    return newedge
    end

end

immutable QuadEdge{DT}
    record::Vector{DirEdge{DT}}
    function QuadEdge() # Implements MakeEdge
        qe = new(Array(DirEdge{DT},4))
        qe.record[1] = DirEdge{DT}((qe,1))
        qe.record[2] = DirEdge{DT}((qe,4))
        qe.record[3] = DirEdge{DT}((qe,3))
        qe.record[4] = DirEdge{DT}((qe,2))
        # Needs to be completed
    end
end

I could obviously have QuadEdge be a subtype of _QuadEdge and use the abstract supertype for next in DirEdge, but I believe that would result in DirEdge internally storing a pointer to the QuadEdge in next instead of the quad edge directly, which would hinder performance when traversing the subdivision.

Anyone have thoughts on how to avoid this penalty? Or are there clever ways of implementing a quad-edge in Julia that avoid the extra type but still nicely encapsulate the next pointer and stored data?

JeffBezanson commented 9 years ago

As it is, everything needs to be pointers anyway because of the circularity. Using an abstract type will not increase the amount of indirection.

If the Vector in QuadEdge always has 4 elements, it will be more efficient to use a tuple or declare 4 DirEdge fields inside QuadEdge.

bencherian commented 9 years ago

@JeffBezanson I went through my logic again and realized you are right about the pointers. The abstract supertype solution will do for now. It still feels like the ugliest code I've had to write yet in Julia. (That's not really saying much yet though...we'll see what I think in a few months.)

Edit: Deleted some other stuff as off topic...

ivarne commented 9 years ago

Please ask for advice on the julia-users mailing list. This question is way off topic, and easy to answer for anyone who know how to overload getindex() and setndex!()

JeffBezanson commented 9 years ago

You can access object fields by index instead of name using e.g. obj.(2) or obj.(i).

tieying12 commented 9 years ago

Coming from Haskell and now learning Julia I try to write some functions/algorithms for propositional logic. For propositional logic formulas we can define a data type Form (for formulas) in Haskell e.g. in the following way: data Form = Var String | Not Form | And Form Form | Or Form Form A natural way to do this in Julia would be using mutually recursive types. However I am still struggling with a solution for this. I also tried JeffBezansons workaround without success. I would appreciate any help.

cwiese commented 9 years ago

Seriously, what a pain not being able to cross reference to types! Perhaps in this functional typed world it seems correct, but practically it is a pain. Other languages let you forward declare.

It is not recursive for a type to be declared mutually between two type unless the referred instance it actually reference to an instance that is referring back to the same instance of another type. Right?

Other than this (and modules) I love JULIA and it is the best platform for our planning application - thank you for your hard work.

timholy commented 9 years ago

Perhaps this has already been proposed above, but in practice can't you just do this?

type Node{T}
    edges::Set{T}
end

type Edge{T}
    node1::T
    node2::T
end

That way all of your types are concrete, so there's no problem with type-instability---you'll get awesome performance with this. Yet you don't need forward-definition. If you want to restrict T, you can do it with outer constructors (not as definitive as inner constructors, but for consenting adults it should be fine).

Then you just need to define custom show(io::IO, node::Node) and show(io::IO, edge::Edge) methods to handle the circularity of display.

eschnett commented 9 years ago

I've tried to write a Julia interface to C code that defines several structure types containing pointers to each other, defining a tree. This is essentially the same case as data Form above, except that this isn't some "fancy Haskell" thingy, but rather some very down-to-earth C code.

(I "solved" this by using Any. Works like a charm, the code is short and easy to read. Yes, there may be a bit of an issue with type safety and performance, but then, I don't expect others to create such data structures, and the code wasn't performance critical either...)

crunch3 commented 8 years ago

So, the consensus is that TimHoly's suggestion covers it? This thread doesn't clarify the issue for me. It seems like what people have been asking for is:

type Node
   # Node attributes
    edges::Set{Edge}
end
type Edge
   # Edge attributes
    node1::Node
    node2::Node
end

whether Node & Edge are parametric or not. circular type definitions come up in stocks/options too...

type Stock
   # Stock attributes, e.g. price, volatility
  calls::Set{Option}
  puts::Set{Option}
end
type Option
   # Option attributes
  stk::Stock         # Option value() depends on stock price & volatility (and stock is invariant)
end

circular definitions are part of real life; what's the recommendation for modeling them in Julia?

cwiese commented 8 years ago

Since there is no concept of Forward Declaration - and we use Types heavily, our code is has many instances where the referred Type is Any instead of the specific Type we are declaring. Not pretty and we are not sure if we have performance issues due to this - we just make sure we declare the type in the function that used it.

mauro3 commented 8 years ago

@cwiese: Using kernel functions http://docs.julialang.org/en/release-0.4/manual/performance-tips/#separate-kernel-functions should help (no need to actually annotate it in the function though).

Maybe forward definitions will come (soon?): https://github.com/JuliaLang/julia/pull/6884#issuecomment-175865242

TransGirlCodes commented 8 years ago

Following the other PR threads, is this likely to come in 0.5 or 0.6?

JeffBezanson commented 8 years ago

0.5 is closed, but it may happen in 0.6.

TransGirlCodes commented 8 years ago

Thanks @JeffBezanson So what would you suggest is the best solution for now for my own edges and nodes scenario? I've seen the suggestion of using abstract types for fields above, or leaving the field unannotated and using getters with type annotation e.g. edges(n::Node) = n.edges::Dict{MetaEdge,Set{Edge}}. What are the performance pros and cons in each case? I remember reading in the docs that using type assertions doesn't really help the compiler/type inference, but then I wonder, since the fields are basically references to other composite types, leaving the field unannotated or as an abstract type won't matter, but I'm not sure about any of that.

JeffBezanson commented 8 years ago

If you put a type assertion on something that we can only infer a less specific type for, then it will absolutely help the compiler.

maxott commented 7 years ago

I just came across this thread for the same reason @dhimmel mentioned. I want to use Julia for computation on graphs. Mine goes a bit further as my nodes can have different distinct types, and (directed) links from one node to another are often type constrained. Using Any is an obvious hack to keep the compiler happy, but it utterly defeats the advantage strong typing gives you.

As I doubt that graphs are an esoteric use case, I would very much love to see some clean mechanism in a future version to allow circular type dependencies. If forward declaration is hard to implement, maybe there is an easier way to implement way to constraint the type of a field at a later stage.

dhimmel commented 7 years ago

Mine goes a bit further as my nodes can have different distinct types, and (directed) links from one node to another are often type constrained.

@maxott, depending on your use case and flexibility, you may want to check out the Neo4j graph database. In Neo4j, each node belongs to 0-or-more types (labels) and each relationship belongs to a single type. The cypher query language (a sort of SQL for networks) is great for operating on typed graphs. You can see a network I created in Neo4j at https://neo4j.het.io.

maxott commented 7 years ago

@dhimmel, thanks for the pointer to Neo4J, but I actually want to do some heavy calculations over graphs and that's why I'm interested in Julia.

wgreenr commented 7 years ago

@JeffBezanson +1 to have it in 0.6, absence of circular type declarations was a surprise

StefanKarpinski commented 7 years ago

Absent someone making a PR to implement this in the next 24 hours, it's not going to be in 0.6.

mohamed82008 commented 7 years ago

I think allowing the redefinition of types can solve this problem. This should allows us to first declare empty types then overwrite their definitions with the cyclic ones now that the error has been avoided. If this will cause performance issues, then the programmer can specify when he/she wants the type to remain constant from that point onwards, then a normal type can be used instead, or make the default case the constant type to make it backward compatible with all the codes out there. I hope my suggestion helps, coming from a person without much experience in C!

This gap in Julia is a bone breaker for me as well, as it makes me have to use a centralized graph type to keep track of the connectivity among edges and nodes by their ids, as opposed to the more straight forward de-centralized approach. This graph type makes me get away with one-way association, but it's alot of additional work for someone coming from Python. So please consider fixing this on top of your priority list; you don't want to miss out on all those network algorithm coders out there!

I also want to thank you all for your more than awesone achievement. Julia has truly renewed my faith in down-to-earth open-source scientific computing. Keep up the great work!

StefanKarpinski commented 7 years ago

The most straightforward solution here is to require all mutually recursive type definitions to occur in a single top-level expression, so you could write this:

begin
    mutable struct Foo #= uses Bar =# end
    mutable struct Bar #= uses Foo =# end
end

At some point in the future we could consider allowing this kind of thing inside of module syntax too, but that's not strictly necessary, so we may as well opt for the minimal viable solution here.

JeffBezanson commented 7 years ago

I believe this can be non-breaking, since all the affected cases would be errors now. Moving to 1.1.

djsegal commented 6 years ago

Where does the code that would have to be changed for this live?

rapus95 commented 6 years ago

I guess it won't make it into 0.7 either? :/

mauro3 commented 6 years ago

No, it's marked for 1.x. Note the work-around https://github.com/JuliaLang/julia/issues/269#issuecomment-68421745

Helveg commented 6 years ago

Here is a complete solution with outer constructors as proposed in https://github.com/JuliaLang/julia/issues/269#issuecomment-68421745:

type Node{T}
    Edges::Set{T}
end

type Edge{T}
    Nodes::Set{T}
end

function Node(edges::Set{Edge} = Set{Edge}(Edge[]))
    Node{Edge}(edges)
end

function Edge(nodes::Set{Node} = Set{Node}(Node[]))
    Edge{Node}(nodes)
end

This way objects of type Node or Edge can be initialised with:

node = Node()
edge = Edge(Set([node]))
push!(node.Edges, edge)
Deuxis commented 5 years ago

Real world example: Wayland API, which currently exists only as C libraries as far as I know, defines an interface struct that contains (a pointer to) an array of available messages, and defines a message as a struct that contains a pointer to an array of interfaces corresponding to object arguments to the message. Quoting the official header file:

struct wl_message {
    /** Message name */
    const char *name;
    /** Message signature */
    const char *signature;
    /** Object argument interfaces */
    const struct wl_interface **types;
};
struct wl_interface {
    /** Interface name */
    const char *name;
    /** Interface version */
    int version;
    /** Number of methods (requests) */
    int method_count;
    /** Method (request) signatures */
    const struct wl_message *methods;
    /** Number of events */
    int event_count;
    /** Event signatures */
    const struct wl_message *events;
};

This of course uses forward declarations at the beginning of the file. A far more elegant solution, which I think should satisfy everyone, is to make Julia's composite types lazy - validating their member variables and their types at object construction, not at declaration.