JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Iterator interface specification #45686

Open LilithHafner opened 2 years ago

LilithHafner commented 2 years ago

From the latest manual, we have

| Important optional methods | Default definition |
| --- | --- |
| Base.IteratorSize(IterType) | Base.HasLength() |
| length(iter) | (undefined) |

I read "optional" to mean I can define an iterator without it, but it seems inconsistent for an iterator to report an IteratorSize of HasLength() and yet have no length method.

If we're not going to make the breaking change of redefining the default IteratorSize, this should probably be documented differently in the manual. At least one of IteratorSize or length is required, and this should be unambiguously indicated whilst still not insinuating that either one individually is required.
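To illustrate the mismatch, here is a minimal sketch. `Countdown` and `CountdownUnknown` are hypothetical types made up for this example, not anything from Base. The first defines only `iterate`, so it inherits the default `HasLength()` trait and `collect` fails when it asks for `length`; the second opts out by declaring `SizeUnknown()`.

```julia
# Hypothetical iterator that defines only the required method, `iterate`.
struct Countdown
    start::Int
end
Base.iterate(c::Countdown, state=c.start) = state < 1 ? nothing : (state, state - 1)

# The default trait claims a length even though `length` is undefined.
default_trait = Base.IteratorSize(Countdown)

# `collect` trusts the trait, calls `length`, and throws a MethodError.
collect_failed = try
    collect(Countdown(3))
    false
catch err
    err isa MethodError
end

# Same iterator (again hypothetical), but with the trait declared honestly.
struct CountdownUnknown
    start::Int
end
Base.iterate(c::CountdownUnknown, state=c.start) = state < 1 ? nothing : (state, state - 1)
Base.IteratorSize(::Type{CountdownUnknown}) = Base.SizeUnknown()

# With SizeUnknown, `collect` grows the result as it goes and needs no `length`.
collected = collect(CountdownUnknown(3))
```

Plain `for` loops over `Countdown` work either way; the failure only shows up in functions such as `collect` that consult the trait.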

elextr commented 2 years ago

Can't iterators be infinite? Those can't define length(). Also, even an iterator that simply increments a 64-bit number is effectively infinite as far as collect is concerned.
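As a small sketch of the infinite case, using only iterators that ship with Base: `Iterators.countfrom` is marked with the `IsInfinite()` trait, and wrapping it in `Iterators.take` gives a finite iterator that does have a length. The variable names are just for illustration.

```julia
naturals = Iterators.countfrom(1)   # 1, 2, 3, ... without end

# Base marks this iterator as infinite, so no `length` method is expected.
trait = Base.IteratorSize(naturals)

# Truncating with `take` yields a finite iterator that does have a length.
first_three = Iterators.take(naturals, 3)
n = length(first_three)
taken = collect(first_three)
```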

LilithHafner commented 2 years ago

Thanks! I agree that length should not be required, and edited the OP to clarify the issue is the combination of claiming to have a length without having one.

elextr commented 2 years ago

A little further down the docs from the piece extracted in the OP, it says that HasLength() requires the existence of length().

JeffreySarnoff commented 2 years ago

On occasion, I have been surprised to find a type that seemed to carry a notion of length to be, in fact, length-less. Why is an iterator that keeps on providing next iterates bereft of attributable length, while an iterator that would have provided next iterates for another three minutes (except the machine was rebooted for unrelated reasons) length evaluable? How many iterates are too many for the result to be considered a sequence for which length is an appropriate computational or assignable concept?

If there exist realizations r of a leaf type T for which rlength = length(r) works, I am of the opinion that there should not exist realizations s of that type T for which slength = length(s) does not work (cannot be evaluated, always errors, ..). If an iterable continues until someone pings a given server, then length(that_iterable) is not known before the iteration, nor is it known during the iteration. If an iterable is to provide metronome-like beats of true or false to the environment ( isodd(time_ns() >> 3) called every 50 milliseconds within its own thread, writing with access priority to a location that is readonly for other threads ..) then what is the length of a nonterminating sequence or the length of an as yet undetermined sequence?

I do not see why length(nonterminating_iterable) or length(as_yet_of_indeterminate_extent_iterable) should throw an exception. Julia is -- often and in wonderful-to-encounter ways -- conducive to good design and a partner in clean coding. To make some iterators of one type respond to length and others of that same type decamp rather than respond to length, to disrupt the data flow or hijack smooth operational computations is .. not my cup of tea.

I have created integer number types in Julia for Julia that repurpose the most negative value (which does not have an abs()), to be an integer typed value that serves a semantic purpose. With typeof(length(iterator)) of similar type and with code that provides the instantaneous interoperability (just reinterpret(Int32, x::Count32)) there is -- that ??

do you have a simpler way that keeps things that should not require special circumstances unspecial?

elextr commented 2 years ago

One key point to note, "iterator" is not a type, it is an informal interface that any type may implement.

That interface has required functionality and optional functionality. Some uses need only the required functionality, and can accept any type that implements the iterator interface. Other uses need optional functionality, and they should only be able to use iterators that provide it, something that should be checkable at compile time. That seems better than forcing all iterators to provide all functionality and throw where it doesn't make sense.
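One way to sketch the idea of code that asks for optional functionality only where it exists is to dispatch on the `IteratorSize` trait. `maybe_length` is a hypothetical helper written for this example; it is not part of Base. Because the trait is computed from the iterator's type, the method choice can usually be resolved by the compiler rather than at run time.

```julia
# Hypothetical helper: return the length when the iterator's trait promises one,
# and `nothing` otherwise. Dispatch on the trait value picks the method.
maybe_length(iter) = maybe_length(Base.IteratorSize(iter), iter)
maybe_length(::Union{Base.HasLength, Base.HasShape}, iter) = length(iter)
maybe_length(::Union{Base.IsInfinite, Base.SizeUnknown}, iter) = nothing

from_vector   = maybe_length([10, 20, 30])                   # trait is HasShape{1}
from_infinite = maybe_length(Iterators.countfrom(1))         # trait is IsInfinite
from_filter   = maybe_length(Iterators.filter(isodd, 1:10))  # trait is SizeUnknown
```

Note that this only works as intended when the trait is truthful: an iterator that keeps the default `HasLength()` without defining `length` would still throw here.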

Consider a type with an iterator interface that returns unique id strings, it has no logical length, UUIDs are infinite (although on current hardware it is likely to be limited by mundane things like finite memory, but it might be hard to calculate that limit). It can be zip()ed with another iterator of a more tractable finite length, thus conveniently and simply associating the elements from that iterator with a unique id. But (IIUC, I didn't read the code) neither iterator needs a length() function to do that.
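A rough sketch of that scenario follows. `IdSource` is a hypothetical stand-in that hands out counter-based strings rather than real UUIDs, and the item names are made up. The loop only ever calls `iterate`, so `length` is never needed on the id source.

```julia
# Hypothetical endless source of id strings: "id-1", "id-2", ...
struct IdSource end
Base.iterate(::IdSource, n=1) = ("id-$n", n + 1)
Base.IteratorSize(::Type{IdSource}) = Base.IsInfinite()
Base.eltype(::Type{IdSource}) = String

items = ["apple", "pear"]

# `zip` stops as soon as the shorter iterator is exhausted.
tagged = Tuple{String, String}[]
for (id, item) in zip(IdSource(), items)
    push!(tagged, (id, item))
end
```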

JeffreySarnoff commented 2 years ago

ok not a type -- yet I agree with much of your take, given they are not types

I do think of each semi-codified api as if it were a type, for me this simplifies some design substructure.

LilithHafner commented 2 years ago

@JeffreySarnoff This is a bit off-topic, but I think it is a valid critique. Perhaps https://github.com/JuliaLang/julia/discussions/43773 is relevant to the case when you want a length but need to efficiently handle the case where none is available. I too am somewhat dissatisfied with Julia's informal interfaces.

dpinol commented 1 year ago

Why doesn't Julia have an Iterator abstract type that provides iterate, as in most languages? Without one, we cannot annotate function arguments that may be either Vectors or Generators.

adienes commented 1 year ago

Because that would require Vector <: Iterator and Generator <: Iterator, and in the general case it is not feasible to have Iterator as a supertype of everything that could potentially be iterated, and at the present moment Julia does not have formal/comprehensive support for traits or multiple inheritance.
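In practice the usual workaround is to leave the argument unannotated and rely on the informal interface. The sketch below uses hypothetical names (`sum_squares`, `NotIterable`) and shows one function accepting both a Vector and a Generator, plus a coarse runtime check with `applicable`; it is an illustration, not a substitute for a real trait system.

```julia
# No type annotation on `xs`: anything that implements `iterate` is accepted.
function sum_squares(xs)
    total = 0
    for x in xs
        total += x^2
    end
    return total
end

v = [1, 2, 3]
gen = (x for x in 1:3)          # a Base.Generator, which is not an AbstractArray

from_vec = sum_squares(v)
from_gen = sum_squares(gen)

# A coarse runtime check that a value supports the iteration protocol.
struct NotIterable end          # hypothetical type with no `iterate` method
vec_ok = applicable(iterate, v)
gen_ok = applicable(iterate, gen)
other_ok = applicable(iterate, NotIterable())
```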

LilithHafner commented 1 year ago
