JuliaLang / julia

The Julia Programming Language

https://julialang.org/

MIT License

make numbers non-iterable? #7903

Closed: StefanKarpinski closed this issue 7 years ago

StefanKarpinski commented 10 years ago

@StephenVavasis has pointed out some rather confusing behavior of the in operator, including:

julia> VERSION
v"0.3.0-rc2+12"

julia> x = IntSet([3,5])
IntSet([3, 5])

julia> in(3,x)
true

julia> in(x,3)
false

julia> in("abc",19)
false

julia> in(19,"abc")
false

Worse still is this:

julia> 97 in "abc"
true

This issue is to discuss what, if anything, we can do to reduce some of this confusion.
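For readers following along on a current release, here is a minimal sketch of the underlying behavior, assuming Julia 1.x rather than the 0.3 release candidate in the transcript. The names `seen` and `results` are made up for illustration. The `97 in "abc"` result above appears to come from `Char` comparing equal to integers in 0.3, which (as noted later in the thread) is no longer the case, so that particular result differs on 1.x.

```julia
# A number iterates exactly once, yielding itself.
seen = Int[]
for i in 3
    push!(seen, i)
end

# `in` iterates its second argument, so a number acts like a
# one-element collection containing itself.
results = (
    3 in 3,        # true: iterating 3 yields 3
    4 in 3,        # false
    97 in "abc",   # false on Julia 1.x: a Char is not equal to an Int
    'a' in "abc",  # true
)
```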

stevengj commented 8 years ago

The need to support scalars in broadcast makes me second-guess the whole "numbers should not be iterable" argument.

mbauman commented 8 years ago

I don't think broadcast actually uses iteration. I don't see any broadcast problems in the log Tony posted. It almost certainly uses size and indexing, though.

It's a really fine line to walk. It definitely feels right to remove collection-like behavior from numbers, but we also often want numbers to behave like Array{T, 0} — which is a collection.

martinholters commented 8 years ago

Would it make sense to have something like immutable ImmutableSingleton{T} <: AbstractArray{T,0}; x::T; end (with all the necessary methods implemented) around? Everywhere a number should behave like Array{T, 0} (except being immutable) it could then be wrapped with (IIUC) zero runtime overhead.

stevengj commented 8 years ago

@martinholters, then to do atan2.(x,0.3) you would need atan2.(x, ImmutableSingleton(0.3))? This seems crazy.

The whole point of having numbers act like Array{T,0} is to enable generic code. If you force people to explicitly convert to another type, you lose that benefit.

StefanKarpinski commented 8 years ago

I think the point is that the broadcast implementation would wrap the singletons for you. Not sure how helpful it would be, but that's how I interpreted @martinholters' suggestion.

martinholters commented 8 years ago

@StefanKarpinski I was just trying to phrase exactly that, but you were faster.

martinholters commented 8 years ago

The point would be that numbers would not be iterable by default, and functions that benefit from iterable numbers would have to take care of that themselves, instead of the opposite. I'm doubtful about the usefulness myself; I just wanted to point out the option.

mschauer commented 8 years ago

One part of this seems easier to settle: consider removing both the Array{T,0}-like and the collection-like properties of Char separately. Unlike the case @mbauman describes, that one is not a fine line to walk, because Char is not a Number type anymore anyway and, well, why should Chars behave like Array{T,0}s at all?

stevengj commented 7 years ago

(Note that broadcast no longer requires size etc. to work for numbers.)

stevengj commented 7 years ago

As I wrote on the mailing list, I suspect that a lot of the need for iterable/indexable numbers should be gone now with 0.5's dot-call syntax. In the cases where you would previously have written a generic vector/scalar function, you should now just write the scalar function f(x), and then apply it to arrays A with f.(A). This is not only easier, it is also faster because it can fuse with other elementwise operations and the result can be assigned in-place with .=.
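A minimal sketch of the dot-call pattern described in the comment above, assuming Julia 1.x. The function `f` and the arrays `A`, `B`, and `out` are hypothetical names chosen for illustration.

```julia
# Write only the scalar version of the function...
f(x) = 3x^2 + 1

A = [1.0, 2.0, 3.0]

# ...and apply it elementwise with dot-call syntax.
B = f.(A)

# Dotted operations fuse into a single loop, and `.=` writes the
# result into a preallocated array instead of allocating a new one.
out = similar(A)
out .= f.(A) .+ 2 .* A

# The same dotted call also accepts a plain scalar.
s = f.(2.0)
```

The scalar definition never needs to know whether it will be called on a number or an array, which is the generic-code use case that iterable numbers used to serve.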

stevengj commented 7 years ago

It's instructive to try to patch Base to make numbers non-iterable. I'm finding various cases where removing iterability requires much uglier code. For example:

In split, it calls r = search(string, splitter). If splitter is a string, this returns a range, but if splitter is a char, it returns an integer. Being able to call first(r) and last(r) in both cases makes the same code work for both.
In the code generation for multidimensional array indexing, it calls _nloops to generate nested loops over the indices in expressions like a[i, 3:4]. By being able to do for j in i, the same generated code can handle i::Int and i::AbstractVector{Int}. (On the other hand, this may be suboptimal, since it doesn't look at first glance like LLVM can eliminate the loop in the Int case.)
In the FFT code, you can pass any iterable of dimensions to be transformed. This allows you to pass a single dimension (integer) and have it be handled with the same code.
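The first two cases can be sketched roughly as follows on Julia 1.x. This is an illustration, not the actual Base code: `span` and `visit` are hypothetical helpers, and `findfirst` stands in for the older `search` function mentioned above.

```julia
# first/last work on both a range and a single integer, so the same
# helper handles a substring match (range) and a char match (Int).
span(r) = (first(r), last(r))

str_match  = findfirst("bc", "abcd")          # a range of indices
char_match = findfirst(isequal('b'), "abcd")  # a single index

# `for j in i` works for an Int, a range, or a vector of indices,
# because a number iterates once over itself.
function visit(i)
    out = Int[]
    for j in i
        push!(out, j)
    end
    return out
end
```

With these definitions, `span(str_match)` and `span(char_match)` both return a `(start, stop)` tuple, and `visit(2)`, `visit(3:4)`, and `visit([1, 5])` all run the same loop body.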

stevengj commented 7 years ago

On the other hand, making numbers non-indexable (removing size, getindex, etcetera), seems much less disruptive ... it looks like almost no changes are required in Base.

stevengj commented 7 years ago

The converse argument: if it is so useful to make numbers iterable, maybe everything should be iterable? i.e. just define fallback start etc. methods for Any.

nalimilan commented 7 years ago

> In split, it calls r = search(string, splitter). If splitter is a string, this returns a range, but if splitter is a char, it returns an integer. Being able to call first(r) and last(r) in both cases makes the same code work for both.

This would be fixed by https://github.com/JuliaLang/julia/issues/10593 (see this Julep): you'd call findseq or searchseq, which would return an index range for both string and char arguments. The previous behavior returning a single index when passing a char would be obtained via findeq/searcheq (which wouldn't work to find a substring). So one less reason not to do this!

mbauman commented 7 years ago

And note that we could use #19730 to wrap all numbers in a specialized AbstractArray{T,0} before they're used in non-scalar indexing within to_indices. That'd give them iteration, indexing, and shape without much hand-wringing… and that could actually remove a few methods. I'm not sure if there'd be a performance impact, however, and I'd like to keep that patch as conservative as possible for now. It's already pretty big.

StefanKarpinski commented 7 years ago

I increasingly think we're not going to do this. We could make a lint warning that pesters you if you write for i = x where x is not a range expression. That would catch the cases where someone meant to write for i = 1:n and accidentally wrote for i = n instead. We could even go so far as to make that a syntax error, but that seems too draconian.
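To make the pitfall concrete, here is a small sketch on Julia 1.x. The helper `count_iterations` is a hypothetical name used only to count how many times the loop body runs.

```julia
# Count how many times a `for` loop over `x` executes its body.
function count_iterations(x)
    k = 0
    for i = x
        k += 1
    end
    return k
end

n = 10
intended = count_iterations(1:n)  # loops over 1, 2, ..., n
mistaken = count_iterations(n)    # loops once, with i == n, and raises no error
```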

terasakisatoshi commented 4 years ago

> write for i = 1:n and accidentally wrote for i = n instead

I run into this often. Since it does not raise an error (e.g. a syntax error), it is hard to spot that for i = n should have been written as for i in 1:n 🐛

StefanKarpinski commented 4 years ago

Yes, that was one of the motivations cited in this issue when it was opened.

terasakisatoshi commented 4 years ago

I've just posted my question on the Julia Discourse (which is why I commented on this issue).

https://discourse.julialang.org/t/question-why-for-n-in-10-show-n-end-is-valid-rather-than-getting-error/33895

It was not clear to me why a number such as 10 is iterable. But after reading the discussion at https://github.com/JuliaLang/julia/pull/19700#issue-99274797, I see that it is a difficult choice to make.

non-Jedi commented 4 years ago

Was the conclusion that this definitely isn't happening even for a Julia 2.0? I've seen several complaints/confusions about it in various places over the past few months.

mbauman commented 4 years ago

It's gotta be rather convincing. We tried removing both iterability and indexability pre-1.0, but:

Removing iterability: I'm finding various cases where removing iterability requires much uglier code.

and

Removing indexability: Con: eliminating this functionality doesn't actually save us much code in Base, and it might make some kinds of generic functions more annoying to write, especially since numbers are still iterable. Is it worth it?

Some of these things have indeed changed, so it's certainly possible that the balance has shifted... but has it shifted enough? I'd bet not. It's quite a bit of churn.

timholy commented 4 years ago

I'm generally a supporter of iterability of numbers, but I have seen people get bit by it. To play devil's advocate, would it be so bad to change for d in dims to for d in iterable(dims) with

iterable(x) = x
iterable(x::Number) = (x,)

?
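A sketch of how such a helper might be used on Julia 1.x. `iterable` is the hypothetical function proposed above (it is not part of Base), and `dimsizes` is an illustrative caller invented here.

```julia
# Proposed helper: wrap a lone number in a 1-tuple, pass everything else through.
iterable(x) = x
iterable(x::Number) = (x,)

# Illustrative caller: accepts a single dimension or any collection of dimensions.
dimsizes(A, dims) = [size(A, d) for d in iterable(dims)]

A = zeros(2, 3, 4)
single = dimsizes(A, 2)       # one dimension given as an integer
several = dimsizes(A, (1, 3)) # several dimensions given as a tuple
```

The caller no longer relies on numbers being iterable, at the cost of an explicit `iterable(...)` call at each such loop.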

StefanKarpinski commented 4 years ago

... or just toss whatever wrapper we end up using for broadcasting on a number and iterate that

mbauman commented 4 years ago

Right, that's what makes this different — that iterable function is essentially a narrower form of broadcastable. We now have an entire architecture built up for this sort of thing.

stevengj commented 4 years ago

My experience in trying to implement even a small piece of this pre-1.0 (#19700) leads me to believe that changing this would lead to a huge amount of code churn over the whole ecosystem. i.e. it wouldn't be worth it without huge benefits, which I haven't seen anyone articulate beyond "slightly confusing to some newcomers".

profhbecker commented 1 year ago

One admittedly minor problem is that I cannot use Julia to teach discrete mathematics as this goes against what I teach my students:

julia> Set([3]) ⊆ 3
true
