ninegua closed this issue 9 years ago.
There are non-constant global variables. Storage locations for such variables are resolved early, but their values (including type) can change at run time.
Given a Symbol or SymbolNode in an AST, you need to look at the function's list of local and closed variables. If the symbol does not appear there, static parameters are examined next. These are stored in the `sparams` field of `LambdaStaticData`. If the symbol isn't there either, then it is a global variable available in the module referenced by the `module` field of the function's `LambdaStaticData` object.
Using that procedure, I believe all variable references can be resolved. In case I haven't understood, it would help to provide an example of an "unresolved" name.
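The lookup order described above can be sketched roughly as follows. This is only a hedged sketch: `locals_and_closed`, `static_params`, and the return values are hypothetical names for illustration, not actual Julia API.

```julia
# Rough sketch of the symbol-resolution order described above.
# `linfo` stands in for a LambdaStaticData-like object; all helper
# names here are hypothetical.
function resolve(sym::Symbol, linfo)
    if sym in locals_and_closed(linfo)
        return :local                      # local or closed variable
    elseif sym in static_params(linfo)     # stored in the sparams field
        return :static_param
    else
        return (linfo.module, sym)         # global in the AST's module
    end
end
```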
I'm not saying that names cannot be resolved following the procedure you described; I'm suggesting that all names (of Symbol type) be resolved to either SymbolNode or GetfieldNode when Julia first produces typed ASTs, so that no Symbol remains in them (except in Expr head positions).
Since type inference can already find out the exact location (which module) of a name, doing so would save the future effort of repeating this process, especially in situations where, inside an AST rewriting function, we don't know which module a piece of AST comes from. It is also quite troublesome to safely copy ASTs from one place to another.
The current implementation is not very consistent about resolving names. For example:
julia> f(x,y)=sin(x) * sin(y)
f (generic function with 2 methods)
julia> code_typed(f, (Array{Float32,1}, Array{Float32,1}))
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:x,:y], Any[Any[:_var0,:_var1,:_var8,:_var9,:_var10,:_var11,:_var12,:_var13,:_var14],Any[Any[:x,Array{Float32,1},0],Any[:y,Array{Float32,1},0],Any[:_var0,(Array{Float32,1},Array{Float32,1}),0],Any[:_var1,(Array{Float32,1},Array{Float32,1}),0],Any[:_var8,(Array{Float32,1},),0],Any[:_var9,(Int64,),18],Any[:_var10,Array{Float32,1},18],Any[:_var11,Array{Float32,1},18],Any[:_var12,Function,18],Any[:_var13,Array{Float32,1},18],Any[:_var14,Array{Float32,1},18]],Any[]], :(begin # none, line 1:
_var14 = sin(x::Array{Float32,1})::Array{Float32,1}
_var13 = sin(y::Array{Float32,1})::Array{Float32,1}
_var12 = (*)::Any
_var11 = _var14::Array{Float32,1}
_var10 = _var13::Array{Float32,1}
_var9 = (GetfieldNode(Base.Broadcast,:broadcast_shape,Any))(_var11::Array{Float32,1},_var10::Array{Float32,1})::(Int64,)
return broadcast!(_var12::F,(top(ccall))(:jl_new_array,$(Expr(:call1, :(top(apply_type)), :Array, Float32, 1)),$(Expr(:call1, :(top(tuple)), :Any, :Any)),Array{Float32,1},0,_var9::(Int64,),0)::Array{Float32,1},_var11::Array{Float32,1},_var10::Array{Float32,1})::Array{Float32,1}
end::Array{Float32,1}))))
We can see a few inconsistencies above:
Another example would be:
f(x)=copy(x)
g(x)=begin copy!(x)=x; copy(x) end
If we dump the typed AST of both, we can see that f contains an unresolved copy! call, while the AST of g contains a fully resolved copy! as a GetfieldNode. So it appears that somehow a name is only resolved to avoid a conflict? Why not always resolve them?
PS: I don't quite understand the decision of allowing globals to change type, though this is a different topic. In an interactive shell, newer definitions of a non-function variable should shadow previous definitions, which I believe is a different case from running code defined in a fully enclosed module. If a user really wants a variable's type to be anything, he/she should explicitly say it is of Any type.
The reason we do this is efficiency. The number of symbols over all ASTs is very large. If a function contains, say, 100 function calls, we would have to create 100 extra nodes that simply repeat the information that all of these names are found in the same module as the enclosing function. Instead we can just refer to an AST, and a single pointer to a module. It's much more efficient to remember what module the AST came from than to repeat the information in every global symbol reference.
Also there is a function `resolve_globals` in inference.jl that adds GetfieldNodes as needed to move an AST from one place to another. With suitable arguments (e.g. an empty destination module) it would qualify everything.
If GetfieldNode is immutable, then we can use techniques like hash-consing to reduce memory overhead. A similar situation arises with type annotation: is it really necessary to introduce SymbolNode when a Symbol's type can be looked up in the environment? There are always trade-offs, and more convincing arguments can be made by measuring real overheads.
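For instance, hash-consing could intern one canonical node per (module, name) pair, so that repeated references share a single immutable object. A minimal sketch (the `MRef` type is hypothetical, standing in for an immutable GetfieldNode; 0.4-era `immutable` syntax):

```julia
# Hash-consing sketch: intern immutable reference nodes so that every
# occurrence of the same (module, name) pair is the same object.
immutable MRef          # hypothetical stand-in for an immutable GetfieldNode
    mod::Symbol
    name::Symbol
end

const INTERNED = Dict{MRef,MRef}()

intern(r::MRef) = get!(INTERNED, r, r)   # returns the canonical copy
```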
SymbolNodes are used because a variable's type can be different at different points.
We keep lots of compiler data structures in memory during execution, and it's kind of a problem already. Better than hash-consing is to eliminate unnecessary objects entirely. Why is it so hard to keep track of what module an AST came from?
If you want I can write a small wrapper around `resolve_globals` that will add GetfieldNodes everywhere.
Like I said, sometimes it is very difficult to figure out where a piece of code comes from. Here is a short example using macros:
julia> macro expand(expr)
@eval function g()
$(esc(expr))
end
return quote g() end
end
julia> module A
import ..@expand
sin(x) = x + 1.0f
f() = @expand sin(100)
end
julia> A.f()
ERROR: error compiling f: unsupported or misplaced expression "escape" in function f
julia> code_typed(A.f, ())
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin # none, line 4: # none, line 5:
return $(Expr(:escape, :((GetfieldNode(Main,:sin,Any))(100))))
end::Any))))
If I understand it correctly, the purpose of using esc(x) is to allow the names in the AST (the value of x) to resolve at the call site where the macro is applied. It is apparently not working (it causes an error), and sin didn't resolve to the intended A.sin function.
I admit the example was abusing the macro facility a bit. But situations like this do come up often in our use cases because names in an AST are not always accompanied by their locations.
`esc` only serves a purpose when it appears in the expression returned from a macro. It's used by the macro expander, and not anything else. Macros are not supposed to call `eval` or have side effects; they're just supposed to return a transformed expression.
Once you have the `A.f` object, you can find out what module the function is in.
Actually after thinking about it more, I'd argue that resolved names should stay in the AST, and resolved types should stay out of the AST. Once you know the location of a variable (local or module), it is easy to look up its type. But the reverse is not as easy, because you have to walk the module hierarchy and perform a lookup in every module you visit.
Also, after inference, type castings become explicit, so it is wasteful to annotate every variable occurrence with a type.
> Also, after inference, type castings become explicit, so it is wasteful to annotate every variable occurrence with a type.
I'm pretty sure this is not true. What do you mean by "type castings"? Probably some SymbolNodes are redundant, but certainly not all. A julia local variable does not always have a fixed type; different occurrences can have different types.
In fact, we already avoid creating SymbolNodes in many cases, for example for constant globals, since their types can easily be looked up just as you say.
I don't believe you ever need to walk the module hierarchy yourself. A module and a symbol are sufficient, e.g. to call eval(module, sym)
. And if a bare symbol occurs in an AST, that means it will be looked up in the module in that AST's LambdaStaticData. Imports and so on are handled automatically. For example you can do eval(Main, :sin)
even though sin
is actually defined in Base
, not Main
.
Sorry, I wasn't very clear. I was talking about the cost of doing name resolution, not what Julia function one has to call to resolve names.
Like I said, if a local variable can be assigned different types within the same body, it should be declared either as a union type or as Any. Its uses can then be accompanied by explicit casts (for now, you can think of a SymbolNode where the type is different from the declared type of the symbol).
If you are concerned about space usage, adding one GetfieldNode per definition (in a module) and just returning the same object in every successful lookup sounds to me like a reasonable solution. I doubt it will blow up space usage compared to the current approach. If you are willing to give it a try, we can find someone who has free cycles to do this kind of experiment, and back it up with real data.
Removing the SymbolNodes for variables that are always the same type is a good idea. I'll plan to do that.
I still don't see why the GetfieldNodes are needed. Given an AST's module, putting a GetfieldNode with that module on every symbol inside it seems equally silly to me as putting a SymbolNode with the same type on every use of a variable.
At least in the language implementations that I'm familiar with, both MLton and GHC do this in their ASTs. Local definitions have short names, but otherwise names are always qualified with where they come from. In addition to making things more consistent (name resolution is done once and for all), I think it is also a nice trade-off between space (actually quite minimal if same-name objects are reused) and efficiency (not having to walk the module hierarchy for a lookup). Overall it reduces the complexity of working with ASTs.
I understand that Julia strives to bring JIT performance, and perhaps it has not required sophisticated transformations over typed ASTs.
Is the problem that you want the information at macro-expansion time?
Yes, we are abusing macros to quickly generate functions like the example above, and want to be safe in handling the AST pieces going into them.
Whether we should have all names resolved at macro expansion time is kind of debatable. But I do think it is a bit more hygienic, especially when used with esc.
I think you should try to find a different approach than using `eval` in a macro. This gets very tricky. For example the argument to a macro might contain un-expanded macro calls, and you can't be sure what those symbols will ultimately refer to without doing the expansions. Before the lowering pass, it is even difficult to tell what will be a local or global variable. Macro-expand time might simply be too early for the kind of analysis you want to do.
As an example, we might have
function f()
@expand sin(100)
@foo sin
end
where `@foo sin` can expand into `sin = x`, making `sin` a local variable. This makes it very hard for `@expand` to reason about what `sin` in its argument means.
Points taken. Name resolution should happen after lowering. Actually the perfect place is type inference, because it has to resolve all names anyway to figure out their definitions and types.
Referring to the specific example of using esc in macros that I gave above, a solution would be to keep track of the location (the call site of the macro application) in the :escape Expr, so that at the type inference stage it knows where to look up the escaping names.
IIUC, in your example `@expand` macro you want g() to contain `A.sin`. However that assumes that `sin` inside `f` refers to the global `sin`, but my example shows this is not necessarily so. I believe you will always have this problem if you try to move macro argument expressions out of their original context, into the separate `g()` function.
It indeed depends on what the intended semantics should be of using esc with a macro argument expression out of its original context. Here is my attempt at defining this behavior. When the name is used on a left-hand side, as you have pointed out, maybe it should not resolve at all and become a local variable, or maybe it should resolve and then become an error. When used in a non-LHS position, it resolves as if it were in the original context, and then two situations follow: (1) if the name resolves to a local definition in the original context, it can either become an error or cause the local definition to become an environment variable (as if captured by a closure); (2) if the name resolves to a module-level definition, it becomes a GetfieldNode.
Actually thinking about it again, in the LHS case, it should always resolve to either a local or global definition in the original context, the same as in the non-LHS case. It is just that when it resolves to a local, the original definition should become something that can be captured by closures. Implementing such semantics may not be very straightforward, and it is pretty much a corner case anyway.
I modified the example to not use `@eval`, but it is still not supported:
macro expand(expr)
quote
function g()
$(esc(expr))
end
g()
end
end
module A
import ..@expand
sin(x) = x + 1.0f
f() = @expand sin(100)
end
julia> A.f()
ERROR: MethodError: `*` has no method matching *(::Float64, ::Function)
Closest candidates are:
*(::Any, ::Any, ::Any)
*(::Any, ::Any, ::Any, ::Any...)
*(::Number, ::Bool)
...
in #3#g at /mnt/home/hliu54/PSE/tmp/julia/test.jl:4
in f at /mnt/home/hliu54/PSE/tmp/julia/test.jl:6
julia> code_typed(A.f, ())
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[symbol("#3#g")],Any[Any[symbol("#3#g"),Function,2]],Any[]], :(begin # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 13: # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 3:
$(Expr(:method, symbol("#3#g"), :((top(tuple))((top(tuple))()::Any,(top(tuple))()::Any)::Any), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 4:
return sin(100)::Any
end::Any))))), false)) # line 6:
return (#3#g::F)()::Any
end::Any))))
If I remove the esc, it becomes more interesting:
julia> code_typed(A.f, ())
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[symbol("#3#g")],Any[Any[symbol("#3#g"),Function,2]],Any[]], :(begin # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 13: # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 3:
$(Expr(:method, symbol("#3#g"), :((top(tuple))((top(tuple))()::Any,(top(tuple))()::Any)::Any), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 4:
return ((top(getfield))(Main,:sin)::Any)(100)::Any
end::Any))))), false)) # line 6:
return (#3#g::F)()::Any
end::Any))))
julia> A.f()
-0.5063656411097588
Apparently it didn't resolve sin to A.sin. Is it a bug? I believe macro expansions are allowed to contain local function definitions, but somehow the name resolution gets messed up when splicing ASTs into their body. UPDATE: my bad, this is not a bug (the one above it is), because without esc, names should resolve with respect to the macro's environment.
`1.0f` is `1.0 * f`. Maybe you mean `1.0f0` (Float32 1.0) instead?
My bad. So it is really only a problem when used out of context, like `@eval`.
Anyway, I hope the Julia developers can consider the option of having names resolved once and for all.
@ninegua Please be careful when talking about macros on github; writing `@eval` without the backticks will actually notify the github user "eval" that he/she is being mentioned here. Just always put backticks around macros.
I think having names fully resolved at macro-expand time is inherently incompatible with the kind of macros we have.
@quinnj I didn't know about this, thanks for the tip.
@JeffBezanson I wasn't saying at macro expansion time, I meant it to be during type inference, and all typed ASTs will then have either fully resolved names, or locally defined names.
Ok, how about the utility function I offered (wrapping `resolve_globals`) to convert an AST to that form?
Though I still don't really see the problem. I agree with your argument about type annotations, but I think the same argument applies to GetfieldNodes. Somebody else might find it very convenient to have the types marked everywhere, to save a lookup step. Given a symbol, you will have to do a lookup to find its type. By doing that, you also discover whether it is a local variable. If it's not a local variable, you know what module to look in. I just don't think this procedure is difficult or expensive.
To be clear, when you see `sin` in the AST, its location is specified by the `module` field of the LambdaStaticData the AST came from. Any time that doesn't hold, there will be a GetfieldNode already.
Like I said, when we need to put together pieces of ASTs (as in Expr, not in LambdaStaticData) from multiple sources into a single place, we either have to carry around a module name for each of them (and thus they no longer fit into Expr) or have to call `resolve_globals` before putting them together. If we go through multiple iterations of this, it is hard to keep track of which pieces are already resolved, so we will end up calling resolve_globals unnecessarily. Do you see the complication here?
On whether to use SymbolNode or Symbol for local variables, I actually prefer SymbolNode, if every use of a variable can point to the same (immutable) object. It trades some space to save the effort of tracking the typing environment. This is why I argued that after type inference and name resolution there should be no Symbol left in the AST except at Expr `:head` positions.
Ok, I understand not wanting to call `resolve_globals` repeatedly. With your design, it looks like we could delete that function, and maybe some other code too, which is nice. One reason it didn't seem like a problem to me is that you already need to do renaming when inlining, so an extra AST pass can't be avoided.
Also I should rename `GetfieldNode` to something like `ModuleRef`. It was originally supposed to be more general, but ended up being used only for qualifying global variables.
Let's try it and see what the cost is. I like the idea of "paying" for it by removing lots of redundant SymbolNodes. Hash-consing is not too appealing here, since there is a lot of complexity in maintaining those global data structures (especially when you add concurrency).
Does it make sense to call them something like `GlobalRef` and `LocalRef` instead? The old "name it what you call it" rule of thumb.
I really don't want to wrap every local variable occurrence. For instance we also have the `GenSym` type for more efficient compiler-added SSA variables, and they are always local.
However what we could do is rename `GenSym` to `LocalVar`, and use it for all locals, replacing all symbols with indexes early on. A few LocalVars would need to be wrapped with a `HasType` node to handle variables with changing types.
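A minimal sketch of that idea, using hypothetical types (0.4-era `immutable` syntax; neither type exists in Julia under these names):

```julia
# Sketch of the proposal: index all locals with one LocalVar type, and
# wrap only the occurrences whose type differs with a HasType node.
immutable LocalVar
    id::Int          # index into the function's local-variable table
end

immutable HasType
    var::LocalVar
    typ::Type        # type of this particular occurrence
end
```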
If there's going to be some effort put into this, is there any potential synergy in addressing #1334 simultaneously? At one point there was the suggestion that we should just include complete function/file info for each line number, rather than only for the first line.
I think it's separate from line numbers, but very much related to #9023 (variable renaming).
Thanks, Jeff! I'm trying out the latest master, and I'm glad to see the changes in typed ASTs. However, I notice the following expression is still not name-resolved:
julia> f(x)= x .* 0.5
f (generic function with 1 method)
julia> code_typed(f, (Array{Int,1},))
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Array{Int64,1},0]],Any[],Any[],Any[]], :(begin # none, line 1:
return x::Array{Int64,1} .* 0.5::Array{Float64,1}
end::Array{Float64,1}))))
Would appreciate it if `.*` could be resolved to the Base module. Actually a related question that I have is, how do you write something like `methods(Base..*)` with correct syntax?
EDIT:
After a closer look, the above `.*` is pretty printed as `.*`, but the real AST is `Main..*`. Why is it not `Base..*`?
I decided to print operators as just their names, since otherwise the printed ASTs are hard to read.
I admit it's awkward, but you can write `Base.(:(.*))`.
I'd like to resolve it to Base..*, but currently in some cases this forces bindings to be resolved too early. I'll keep working on it.
It looks like resolution for operators is different than for normal names. Here is a program that demonstrates the issue. I attempt to re-define `sin` and `-` (negation) in a separate module:
julia> module Test
for f in (:sin, :-)
g = QuoteNode(f)
@eval function ($f)(x::Float64)
getfield(Base, $g)(x)
end
end
end
Warning: Method definition -(Float64) in module Base at float.jl:201 overwritten in module Test at none:5.
Test
julia> -(10.0)
ERROR: StackOverflowError:
in - at none:5 (repeats 79996 times)
julia> Test.(:(-))(10.0)
ERROR: StackOverflowError:
in - at none:5 (repeats 79996 times)
julia> sin(10.0)
-0.5440211108893698
julia> Test.sin(10.0)
-0.5440211108893698
julia> code_typed(Test.sin, (Float64,))
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Float64,0]],Any[],Any[],Any[]], :(begin # none, line 5:
return ((Test.getfield)(Test.Base,:sin))(x::Float64)
end))))
julia> code_typed(Test.(:(-)), (Float64,))
1-element Array{Any,1}:
:($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Float64,0]],Any[],Any[],Any[]], :(begin # none, line 5:
return ((Test.getfield)(Test.Base,:-))(x::Float64)
end))))
As seen above, the definition of `-` in the Test module overrides the top-level definition (which then creates an infinite loop when called), but this is not the case for `sin` (which is what I'd expect).
An additional question is, why isn't `getfield` resolved to `top(getfield)`? There is obviously no function named getfield in either the Test module or Base.
This is because modules do `importall Base.Operators` by default. This behavior is slightly controversial, but it's there because people basically always want to extend operators rather than define them anew.
The AST contains `GlobalRef(Test, :getfield)` because at a certain point, all we know is that the variable reference occurs in the `Test` module. We don't yet always know where this variable will ultimately resolve to. At some point, evaluating that GlobalRef will give you the correct value; however, doing so also forces the binding to be resolved. There is a small possibility that the module was going to do e.g. a `using` statement that would have changed where the variable resolves to.
Thanks for the explanation. I'd appreciate it if there were a way to turn off `importall` for operators, so that we can define operators that reside locally in a module rather than overriding existing definitions in Base.
On the second issue, I believe here is what you meant:
julia> module Test
module B
export g
function g(x)
println(x, " world!")
end
end
module A
function f()
g("hello")
end
println(code_typed(A.f, ()))
using ..B
end
end
Any[:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin # none, line 10:
return (Test.A.g)("hello")
end))))]
Test
julia> Test.A.f()
hello world!
At printing time, the AST of `Test.A.f` contains a call to `Test.A.g`, which is yet to be defined. Once the code for module `Test.A` is completely loaded, we can safely resolve `Test.A.g` to `Test.B.g`.
So, am I correct to assume that once a module is completely loaded, evaluating all `GlobalRef`s in the ASTs of this module should give me the final resolution? Do we have a function that does this? I'm thinking something of type `GlobalRef -> Union(GlobalRef, TopNode)` would do.
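Such a function might look roughly like this. This is only a sketch: `defining_module` is a hypothetical helper that follows using/import chains to the module where the binding actually lives; no such function is in Base.

```julia
# Sketch: once a module is fully loaded, rewrite each GlobalRef to point
# at the module where the binding is actually defined. `defining_module`
# is hypothetical.
function finalize_ref(r::GlobalRef)
    m = defining_module(r.mod, r.name)
    return GlobalRef(m, r.name)   # or a TopNode for system definitions
end
```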
Paul, check out baremodules for cases where you need just the bare minimum: http://docs.julialang.org/en/latest/manual/modules/#default-top-level-definitions-and-bare-modules
-Jacob
@quinnj Thanks for the suggestion. Here is what I tried:
julia> baremodule Test
import Base, Base.warn, Base.call, Base.@eval
export sin, -
eval(x) = Core.eval(Test, x)
for f in (:sin, :-)
g = QuoteNode(f)
@eval function ($f)(x::Float64)
warn("in Test!")
getfield(Base, $g)(x)
end
end
end
Test
julia> importall Test
Warning: ignoring conflicting import of Test.- into Main
julia> sin(10.0)
WARNING: in Test!
-0.5440211108893698
julia> -(10.0)
-10.0
julia> Test.sin(10.0)
WARNING: in Test!
-0.5440211108893698
julia> Test.(:(-))(10.0)
WARNING: in Test!
-10.0
So I've managed to define a `-` operator in the Test module that doesn't override its default definition in Base, but I cannot import it and run it without specifically writing `Test.(:(-))`. Any suggestion for getting around this?
I'm trying `eval(GlobalRef(...))`; it's close to what I wanted. Let me rephrase what I want to achieve:
Both points can be illustrated by the following program:
julia> baremodule Test
import Base, Base.warn, Base.call, Base.@eval, Base.@noinline
export sin, -, abc
eval(x) = Core.eval(Test, x)
for f in (:sin, :-)
g = QuoteNode(f)
@eval @noinline function ($f)(x::Float64)
warn("in Test!")
getfield(Base, $g)(x)
end
end
abc = 10
end
Test
julia> module Prog
importall ..Test
f() = sin(0.5) + abc
g() = -(0.5)
end
Warning: ignoring conflicting import of Test.- into Prog
Prog
julia> code_typed(Prog.g, ())
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin # none, line 4:
return (Base.box)(Base.Float64,(Base.neg_float)(0.5))::Float64
end::Float64))))
julia> code_typed(Prog.f, ())
1-element Array{Any,1}:
:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin # none, line 3:
return (Prog.sin)(0.5) + Prog.abc
end))))
We can see that `sin` is resolved to `Prog.sin`, but `-` is somehow resolved to `Base.(:(-))` (and inlined). I guess the warning indicates that something is preventing this from happening. I guess the solution is to make `Prog` also a baremodule, but this would discourage people from using operators defined in modules other than Base (but with the same names as those in Base). Is there a possible fix for this?
What I had hoped for was to resolve both to module `Test`, but as explained by Jeff, I now understand it cannot be done until compile time. That being said, I can always find the real location (of a GlobalRef to a function) like below:
julia> eval(code_typed(Prog.f, ())[1].args[3].args[2].args[1].args[1]).env.defs.func.code.module
Test
julia> eval(code_typed(Prog.f, ())[1].args[3].args[2].args[1].args[1]).env.defs.func.code.name
:sin
For non-functions such as the use of `abc` in function `Prog.f`, I'd appreciate a way to find their exact locations.
The fact that Julia's name resolution is not "final" until compile time is also a bit surprising. I don't know if there is a solution to make everybody happy, but the following demonstrates what is going on:
julia> module A
export abc
abc = 10
end
A
julia>
julia> module B
using ..A
f() = abc
println(f())
abc = 20
end
10
Warning: imported binding for abc overwritten in module B
B
julia>
julia> module C
using ..A
f() = abc
abc = 20
println(f())
end
20
C
I find it intriguing that there is a warning for module B, but not for module C.
Actually here is what happens when I use `using` instead of `importall`:
julia> module Prog
using ..Test
f() = sin(0.5) + abc
end
Warning: using Test.- in module Prog conflicts with an existing identifier.
Prog
julia> Prog.f()
Warning: both Test and Base export "sin"; uses of it in module Prog must be qualified
ERROR: UndefVarError: sin not defined
in f at none:3
If I understand it correctly, the intended use of `import` and `importall` is to allow extending functions that are being imported. But what happens when the imported names conflict with existing names? This isn't explained in the documentation.
The same goes for `using`. The above gives a warning message and eventually an error when `Prog.f` is called. However, changing `using` to `importall` makes the same call succeed. I'd appreciate some clarification on this difference in behavior, since both cases deal with name conflicts between what is already defined in the current scope and what is defined in the module to be used or imported.
Just want to add a comment that I really do think name resolution should be final at type inference time (instead of at compile time in codegen); otherwise we risk the danger of a name resolving to something of a different type. The following shows this bug:
julia> module A
export abc
@noinline abc(x::Any) = "hello"
end
A
julia>
julia> module B
import ..A.abc
f() = begin x = abc(1) end
println(code_typed(f,()))
@noinline abc(x::Int) = x
end
Any[:($(Expr(:lambda, Any[], Any[Any[Any[:x,ASCIIString,18]],Any[],Any[ASCIIString],Any[]], :(begin # none, line 3: # none, line 3:
GenSym(0) = (B.abc)(1)::ASCIIString
x = GenSym(0)
return GenSym(0)
end::ASCIIString))))]
B
julia>
julia> B.f()
signal (11): Segmentation fault
One may argue that the above is not a valid program, because function `abc` is extended after I force type inference on function `f`, and the inferred String type conflicts with the actual method `B.abc` that is called when `f` is run. However, the following works fine:
julia> module A
export abc
@noinline abc(x::Any) = "hello"
end
A
julia>
julia> module B
import ..A.abc
f() = begin x = abc(1) end
f()
@noinline abc(x::Int) = x
end
B
julia>
julia> B.f()
"hello"
So instead of merely doing type inference, I forced compilation of `f`, and then it permanently resolves function `abc` to the method defined in module `A`.
I might be splitting hairs here, but the above illustrates a name resolution gap between inference and codegen. What caused the crash is that type inference assumes the call to `abc` resolves to module `A` when it infers the return type, but the generated code still has `GlobalRef(B, :abc)`, which later during codegen locates a method defined in module B. This is a mismatch, and I would argue it really should produce `GlobalRef(A, :abc)`, because that is all the knowledge the inferencer has at the time it works on function `f`. This would also match the behavior in the second example, so all would be fine.
I understand that forcing inference and compilation can sometimes result in surprising behaviors, but at the same time I hope I've made the case that name resolution should be final at type inference time, so that GlobalRefs are never resolved to different things when compiled later. By "final", I mean a name resolves to the exact place where the matching method is defined. This is where the inferencer finds the type, and where codegen finds the method.
Before I raise the problem, please help me clarify two of my understandings:
If both the above are true, then the following shall be their logical consequences:
I put "statically" in quotes because Julia is certainly dynamic, but on the other hand, in the current implementation, once a Julia function is compiled, it stays compiled, and a compiled function does not attempt to resolve names at runtime.
With the above understanding, may I request the following feature to be implemented?
This will save a lot of trouble in implementing optimizations that leverage the type information available in typed ASTs, because a typed AST will no longer contain unresolved names, and all resolved names will take only 3 forms: TopNode (for system definitions), GetfieldNode (for module-level definitions), and SymbolNode (for local definitions). In all cases, it would be straightforward to obtain the type of a name.
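The proposed invariant could be checked with a small AST walker. A sketch only, against the 0.4-era node types named above (TopNode, GetfieldNode, SymbolNode):

```julia
# Sketch: verify that a typed AST contains no bare Symbols outside of
# Expr head positions, i.e. every name is a TopNode, a GetfieldNode
# (module-level definition), or a SymbolNode (local definition).
function fully_resolved(ex::Expr)
    for a in ex.args
        isa(a, Symbol) && return false            # bare, unresolved name
        isa(a, Expr) && !fully_resolved(a) && return false
    end
    return true
end
```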