JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.69k stars 5.48k forks source link

disallow methods of local functions in different blocks #15602

Open wbhart opened 8 years ago

wbhart commented 8 years ago

In Julia 0.5.0-dev+3171 the following code creates a bogus duplicate function definition warning.

function myfun(n::Int)
     if n < 3
        doit() = 3
     else
        doit() = 5
     end 
end
JeffBezanson commented 8 years ago

This isn't allowed any more; I'll have to come up with a better error message. Now you need to do

function myfun(n::Int)
     if n < 3
        doit = ()->3
     else
        doit = ()->5
     end 
end
ChrisRackauckas commented 8 years ago

Is there a reason why this is no longer allowed?

JeffBezanson commented 8 years ago

Yes; this restriction makes it possible to hoist construction of f's method table to the top level. A clearer example is

function myfun()
    f() = 0
    if rand(Bool)
        f(x::Int) = 1
    end
    f(...)
end

Here, f would sometimes have one method and sometimes have two methods. That would require rebuilding f's method table on each call to myfun, which is too slow to be useful. The work-around is to use separate functions, as in my comment above, or

function myfun()
    f1() = 0
    f2() = f1()
    f2(x::Int) = 1
    f = rand(Bool) ? f2 : f1
    f(...)
end
ChrisRackauckas commented 8 years ago

Yeah, I found those workarounds. Thanks for explaining it. Now that I know there's a reason it's easier to remember.

I think that this should say something like Warning: functions cannot be conditionally defined. Last method definition is chosen for each dispatch. and then have a link to somewhere in documentation which elaborates more on it (I think there should be a whole page which elaborates on errors and warnings, especially for more obscure ones like this which are best explained via examples).

However, to make it a little more user friendly, my example

function g(a)
  if a
    f() = 2
  else
    f() = 3
  end
  return f
end
f = g(true)
f() # Returns 3

shouldn't error with g(false), but should just have the same answer as g(true) and give the same warning.

ViralBShah commented 8 years ago

Perhaps this should be documented in NEWS.md as @josefsachsconning pointed out.

josefsachsconning commented 8 years ago

Documentation pretty crucial, since this is a breaking change.

josefsachsconning commented 8 years ago

Would it not be better for this to generate an error instead of a warning?

StefanKarpinski commented 8 years ago

Would it be possible to automatically do the renaming that the workaround implements?

JeffBezanson commented 8 years ago

It's iffy; the general case is very complicated. For example

function f()
    if a
        f(x::A) = ...
    end
    if b
        f(x::B) = ...
    end
    if c
        f(x::C) = ...
    end
    ...
end

describes 2^n different functions.

JeffBezanson commented 8 years ago

Thinking about it some more, I could almost imagine treating definitions in different basic blocks like separate functions. In other words this:

function g(a)
  if a
    f() = 2
    f(x) = 4
  else
    f() = 3
  end
  return f
end

would return a function with two methods (returning 2 or 4) for g(true), and a function with one method (returning 3) for g(false). That would be backwards-compatible for cases like the OP. That very well might be the only viable option other than giving an error.

ChrisRackauckas commented 8 years ago

If you're treating definitions in different blocks like separate functions, wouldn't that break:

function g(a,b)
  if a
    f() = 2
  end
  if b
    f(x) = 4
  end
  f
end

as in, what would it return for g(true,true)? (Or am I misunderstanding it?)

JeffBezanson commented 8 years ago

That case is already broken. For now at least, our choices are to disallow it (since it doesn't do anything reasonable) or give it another meaning. Have you used that pattern? In my informal survey of packages during this change I don't think I found any uses of it, though there were several uses of the pattern in the OP.

For reference, my proposal would return a function with the one method f(x)=4 for g(true,true). It would be equivalent to

_f_1() = 2
_f_2(x) = 4
function g(a,b)
  if a
    f = _f_1
  end
  if b
    f = _f_2
  end
  f
end
StefanKarpinski commented 8 years ago

That seems extremely confusing. How about detecting if we're in a case that can be refactored like that while giving the expected behavior and error only if that's not the case? I.e. support the if/else case (which does occur) but raise an error in the if/if case (which seems not to).

ChrisRackauckas commented 8 years ago

@JeffBezanson I did have something like that before, i.e. adding dispatches conditionally. But I think one reason why you won't find any uses of it is because everything related to this issue has such non-intuitive behavior that any package is probably avoiding anything like this (for example, I changed some codes away from conditional definitions like this to all using anonymous function when I did the change from v0.4 to v0.5 a few months ago). I think that checking which packages are using the broken design pattern, seeing none are using it, and then declaring that it must not be wanted is kind of a self-fulfilling prophecy.

That said, I don't think it's a necessary design pattern. I've avoided it now for awhile and haven't needed it since. I like what @StefanKarpinski is suggesting: you know that the design pattern is going to have issues, so instead of letting odd behavior happen, throw an error. Otherwise it's one hell of a bug to debug. I'd rather be told to avoid it rather than try to make it work.

JeffBezanson commented 8 years ago

@ChrisRackauckas I did that survey while working on this change, before it was pushed to master, so it was 0.4 code only.

But yes, my default choice here would be to give an error (https://github.com/JuliaLang/julia/issues/15602#issuecomment-200410836), I just thought I'd throw out another alternative. Allowing the else case is an interesting possibility, but of course breaks under simple rewrites like

if x
   ...
end
if !x
   ...
end

I figured it might be easier to reason about what is in separate blocks, vs. which blocks the compiler can tell are mutually exclusive.

ChrisRackauckas commented 8 years ago

Got it. Thanks for the clarification.

cedeerwe commented 7 years ago

Just spent a couple of hours fixing a bug, which basically boils down to the problem described above:

function myfun(n::Int)
     if n < 3
        doit() = 3
     else
        doit() = 5
     end 
end

Currently, if I call myfun(1), the above process finishes without errors. On the other hand, calling myfun(5) results in UndefVarError: doit not defined. This is very confusing, as doit is clearly defined for n = 5. If this construction is not allowed, an error should be thrown in both cases. Alternatively, there should be at least a more verbose error message pointing out, that there is an overall problem as opposed to a problem in the else case only.

yuyichao commented 7 years ago

That seems to be a separate lowering bug.

bkamins commented 6 years ago

Another two similar examples (even more minimal):

function myfun()
     if true
        doit() = 3
     else
        doit() = 5
     end 
end

Now myfun()() produces 5. On the other hand:

function myfun()
     if false
        doit() = 3
     else
        doit() = 5
     end 
end

And myfun() errors.

jessebett commented 5 years ago

Just ran into this for the first time and had a very difficult time debugging it (with print line debugging):

const a=true
function blah()
    if a==true
        println("here")
        f_() = 1
        return f_
    elseif a==false
        println("there")
        f_() = 2
        return f_
    end
end
blah()()

This function returns:

here
2

Which makes it appear that it is only evaluating the first conditional block, and never the second, yet the function definition in the second is returned. Also, I had expected that since a is a const here it would be known during compile time and result in the correct conditional branch being compiled. Not the case.

ChrisRackauckas commented 5 years ago

The one thing that gets me about this is that they are now functionally different in v1.0. With anonymous functions, you can add dispatches in the global scope using a fun little hack:

julia> f = (u) -> u
#3 (generic function with 1 method)

julia> (::typeof(f))(u,v) = u*v

julia> f(1,2)
2

but you cannot do so in a function scope:

function foo()
    f = (u) -> u
    (::typeof(f))(u,v) = u*v
    f
end

which incentivizes the use of non-anonymous functions in local scopes like in DiffEqFlux:

https://github.com/JuliaDiffEq/DiffEqFlux.jl/blob/master/src/Flux/neural_de.jl#L4-L5

while I would like to stick to the nice and safe anonymous functions here (and indeed @jessebett uncovered this issue DiffEqFlux examples), I don't think I can. There probably is some underlying deep compiler reason for it, but I do remember that this was allowed in v0.6 so I'm not sure what happened to disallow it in v1.0.

JeffBezanson commented 5 years ago

Anonymous-ness does not make a difference here; the construct (::typeof(f))(x) = ... adds a globally-visible method and so isn't allowed in a local scope independent of whether f is an anonymous function. The way to think about it is that any anonymous function can be rewritten to a named function with a compiler-generated unique name. For example f = u -> u can be rewritten to

_f1337(u) = u
f = _f1337

That approach can be safely used to add multiple methods to an anonymous function. It's really just that there's no syntax for an anonymous function with multiple methods. One could write a macro for that though, e.g.

f = @multianon begin
    (u) -> u
    (u, v) -> 2u + v
end

With the DiffEq example, say you are tempted to write

function neural_ode(model,x,tspan,
                    args...;kwargs...)
  p = destructure(model)
  if condition
    dudt_(u::TrackedArray,p,t) = restructure(model,p)(u)
    dudt_(u::AbstractArray,p,t) = Flux.data(restructure(model,p)(u))
  else
    dudt_(u::TrackedArray,p,t) = something_else1
    dudt_(u::AbstractArray,p,t) = something_else2
  end
end

The correct way to write it is:

function neural_ode(model,x,tspan,
                    args...;kwargs...)
  p = destructure(model)
  dudt_1(u::TrackedArray,p,t) = restructure(model,p)(u)
  dudt_1(u::AbstractArray,p,t) = Flux.data(restructure(model,p)(u))
  dudt_2(u::TrackedArray,p,t) = something_else1
  dudt_2(u::AbstractArray,p,t) = something_else2
  if condition
    dudt_1
  else
    dudt_2
  end
end
antoine-levitt commented 5 years ago

Setting aside the question of how to support this better, can the obviously broken case be made to error? The example above by @bkamins is particularly bad; no warning at all is delivered, the if false version gives a strange error (doit not defined), and the if true version an incorrect result.

natschil commented 4 years ago

I agree with @antoine-levitt , the fact that code inside the else block of an if true construction can affect the outside scope is worthy of at least a warning. I just spent quite a lot of time trying trying to debug code due to this issue (as have other commenters here), this needs to be documented better.

StefanKarpinski commented 4 years ago

Agree. It would be great if someone took on figuring out how to make this warn in the problematic cases, or better still an error since it's unlikely to be doing the right thing in any code doing this.

daviehh commented 4 years ago

Recently ran into this as well... how about making the default option to warn such overwrites, i.e. change this to 1 or JL_OPTIONS_WARN_OVERWRITE_ON https://github.com/JuliaLang/julia/blob/52c55d7934f71c5b2d9f6e6fa98cb48817def57c/src/jloptions.c#L57

then add a way to programmatically suppress the warning/error message when the overwrite is intentional, such as Revise.jl? e.g. a macro like @overwrite. Currently, the warning messages are printed in c

https://github.com/JuliaLang/julia/blob/52c55d7934f71c5b2d9f6e6fa98cb48817def57c/src/gf.c#L1497-L1510

One possible way to implement is by adding a function to flip the warning setting,

JL_DLLEXPORT void jl_options_flip_warn_overwrite(void)
{
    jl_options.warn_overwrite ^= 1;
}

then in julia, the warning can be turned on/off via ccall(:jl_options_flip_warn_overwrite, Cvoid, ()). There should be more elegant solutions...

andyferris commented 3 years ago

I ran into this issue today. I would love to see this error out at lowering time.

c42f commented 3 years ago

It looks like this was improved in Julia 1.6 — #36609 gives a warning now.

I made this to an error in #39498 if we think that's compatible enough to merge into Julia 1.7.

StefanKarpinski commented 3 years ago

What is the issue preventing a more comprehensive error here? Do we just not know the precise conditions under which we should allow / disallow this kind of construct? What about disallowing method definitions for an inner function unless they are all in the same basic block?

iago-lito commented 2 years ago

As a new user, I've stumbled accross one case here that I find very dangerous, because Julia produced definite undesired behaviour with no warning (i.e. evaluating g(A, 5) to 25 instead of 10).

bkamins commented 1 year ago

Could someone please comment on the plans related to this issue? Thank you! (I am asking because the problem is often raised by new Julia users)