JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

The infix `in` has a dynamic scope. #5416

Closed: pygy closed this issue 10 years ago

pygy commented 10 years ago
in(a...) = false  # in (generic function with 1 method)
1 in [1,2,3] # false

Furthermore, overwriting `in` doesn't trigger any warning. Should I file another issue?


Aside: something tells me that this must have already been debated to death, but anyway... I think it's a bit of a waste to reserve short, meaningful names for functions that are not meant to be used directly by general users[0]. I think that `in`, `start`, `next`, and `done` would be more useful in user code.

Maybe you use `in` as a function name to mirror the operator in higher-order functions, as you do for other infix operators? That would make sense.

The functions related to the iterator protocol could be prefixed and/or suffixed with underscore(s). Or maybe `startiter`, `nextiter`, `doneiter`?

[0] i.e. not the language and library authors.

jiahao commented 10 years ago

It seems like the methods are not being loaded into memory correctly. If you begin a new Julia session with `methods(in)`, an error is thrown if you then attempt to define `in(a...) = false`. This also happens with other functions like `map`.

pygy commented 10 years ago

Indeed, it solves the (global) method definition problem, but not the scope issue.

methods(in)
in(out) = exit() # ERROR: must be imported explicitly

# but then

function ininin()
   in(inin...) = true
   in in in
end
ininin() # => true!

JeffBezanson commented 10 years ago

That isn't what dynamic scope means. The only special thing about `in` is that you can call it with infix syntax. In fact we may end up allowing all functions to be called this way.

StefanKarpinski commented 10 years ago

You can define your own local meaning of `start`, `done` and `next` without any issues. Do you have an example where there is a problem with this sort of thing?

JeffBezanson commented 10 years ago

The reason it works this way is that `x in y` has a normal lexical occurrence of `in`, so it is looked up the same as any other variable. For loops do not explicitly mention `start`, `done`, or `next`, so the loop is essentially expanded like a hygienic macro.
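
A minimal sketch of the distinction described above, assuming Julia 1.x, where the iteration protocol is `iterate` rather than the `start`/`done`/`next` functions discussed in this thread. The function names `always_member` and `loop_still_works` are hypothetical and exist only for illustration.

```julia
# Assumes Julia 1.x; the thread itself predates the `iterate` protocol.

function always_member()
    # Local function named `in`; it shadows Base.in inside this body only.
    in(args...) = true
    # The infix form is an ordinary call to whatever `in` means here.
    return 1 in [2, 3]
end

function loop_still_works()
    # A local variable that happens to share its name with Base.iterate.
    iterate = nothing
    total = 0
    # The loop never mentions `iterate` lexically, so the local is ignored.
    for x in 1:3
        total += x
    end
    return total
end

println(always_member())     # the local `in` is used
println(loop_still_works())  # the loop still iterates normally
println(1 in [2, 3])         # outside the function, Base.in is untouched
```

The first function shows the lexical lookup of the infix form; the second shows that the loop expansion does not depend on local names.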

jiahao commented 10 years ago

Perhaps this issue was closed prematurely. It seems like a bug to be able to define a method for a function that shadows methods in Base without throwing a warning or error.

JeffBezanson commented 10 years ago

It's just lexical scope. Saying `in = 2` is no different from introducing any other variable, and if we gave a warning every time a variable happened to match something in Base, then we really would be reserving all of those names, which most people feel is not desirable (including the original issue description here).
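
A small sketch of why such a warning would be noisy, assuming Julia 1.x. The function name `total` is hypothetical; the local variable `sum` is chosen only because it happens to match a name exported from Base.

```julia
function total(xs)
    sum = 0            # plain local variable; Base.sum is not modified
    for x in xs
        sum += x
    end
    return sum
end

println(total([1, 2, 3]))   # uses the local variable
println(sum([1, 2, 3]))     # Base.sum is still available outside `total`
```

Code like this is common, and warning on it would effectively reserve every exported name.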

nalimilan commented 10 years ago

As I understand it, the original issue description complains that non-user-facing functions shouldn't be reserved words. I'd prefer that overriding any method in Base print a warning (or even fail by default). It's so easy to screw up with this kind of silly thing, and there are so many other possible names to choose.

pygy commented 10 years ago

@StefanKarpinski: in the REPL

start # or start("hey!"), or methods(start)
start(a...) = "Boom"

As with `in`, it only gives an error if you've tried to read the value first. It is a bit worse with `in`, because evaluating the `a in b` syntax makes any later assignment to `in` problematic, whereas running a `for` loop doesn't "activate" the iterator symbols in the current scope.

@JeffBezanson: Regarding scope, the case of the infix operators is degenerate, because, after syntactic transformation, they end up calling their namesake, but it is more readily apparent with indexing operators:

getgetindex(getindex) = getindex[getindex]
getgetindex(println) # => printlnprintln

The indexing operator ends up calling whatever function is bound to `getindex` in the current scope. The same goes for `hcat` and `vcat` for array literals. Somehow, Dict literals are fine.

It makes the language brittle. It should not be possible to break it accidentally.

Where there's syntax, there should be hygiene. Have each and every syntactic construct call Base.x, and don't export these functions.

As a side effect, it will make bare modules nicer to work with.

Please re-open this.

> In fact we may end up allowing all functions to be called this way.

"A gentleman is someone who knows how to play the bagpipes, but refrains."

That's probably not the best idea you've had so far. Or at least, not with bare words. Keep the syntax simple. A familiar infix syntax makes a lot of sense for math operators, but I'm not sure there's a lot to gain in generalizing it. If only for the fact that you'll need to have the precedence rules in mind in more contexts.

What are the perceived advantages?

The infix `in` already introduces a few ambiguities (`[a on b]` vs `[c in d]`, `@foo im in io ip`, ...); generalizing this will make it much worse.

JeffBezanson commented 10 years ago

I would certainly hesitate to make all functions infix.

Not everybody sees this the same way. It's possible to define your own `getindex` function distinct from `Base.getindex`, and it doesn't seem quite right to reserve the special syntax only for the standard `getindex`.

Are you saying we should do this:

in(x, y) = false
in(1, 1)  # => false
1 in 1   # => true; calls Base version

That doesn't seem so great to me.

StefanKarpinski commented 10 years ago

> Where there's syntax, there should be hygiene. Have each and every syntactic construct call Base.x, and don't export these functions.

One of the basic premises of the language is that operators like `+` are just functions with syntax. Among other benefits, this allows extremely powerful things like lexically binding the meaning of `+` and `*`. Your proposal would completely sabotage that; you would not be allowed to define your own versions of operators. You would also not be able to change the meaning of indexing locally. Making `in(a,b)` mean something completely different from `a in b` seems nuts. I'd rather get rid of the `a in b` syntax than do that. Not that I want to get rid of the `a in b` syntax, I just think that making the prefix and infix syntaxes mean different things is really awful.
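
As an illustration of lexically binding `+` and `*`, here is a hedged sketch assuming Julia 1.x. The function name `tropical_dot` and the choice of a min-plus ("tropical") interpretation are assumptions made for the example, not something proposed in this thread.

```julia
function tropical_dot(xs, ys)
    # Local rebinding: inside this body `+` means min and `*` means ordinary addition.
    +(a, b) = min(a, b)
    *(a, b) = Base.:+(a, b)   # refer to the Base operator explicitly, since `+` is shadowed
    acc = Inf
    for (x, y) in zip(xs, ys)
        # Operator precedence is syntactic, so this is still acc + (x * y).
        acc = acc + x * y
    end
    return acc
end

println(tropical_dot([1, 2], [3, 1]))   # min(1 + 3, 2 + 1) under the local meanings
println(2 + 3 * 4)                      # outside the function, arithmetic is unchanged
```

The rebinding is confined to the function body, which is the property that would be lost if the infix syntax always called the Base version.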