JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

range(start, stop, length) #38750

Closed antoine-levitt closed 3 years ago

antoine-levitt commented 3 years ago

This issue is to propose defining range(start, stop, length) = range(start, stop; length=length). I searched the issues and PR, expecting pages of heated debate, but I couldn't find any, so here goes.
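A minimal sketch of the proposed forwarding, assuming a hypothetical name `range3` so that nothing in Base is overwritten (the issue itself proposes adding the method to `range`):

```julia
# Hypothetical stand-in for the proposed method; the name `range3` is an assumption.
range3(start, stop, length) = range(start, stop; length=length)

grid = range3(0, 1, 5)      # same as range(0, 1; length=5)
println(collect(grid))      # five equally spaced points from 0 to 1
```

The positional call only saves typing the `length=` keyword; the resulting range is whatever the keyword form already returns.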

Pros:

Cons:

timholy commented 3 years ago

xref #38041 (which is an open pull request)

antoine-levitt commented 3 years ago

Related but orthogonal: the proposed three-arg version range(a, b, c) would not take any kwarg

mkitti commented 3 years ago

xref #37875 which is an open documentation only pull request

johnnychen94 commented 3 years ago

FWIW, I believe #38041 can safely close #37875 since it makes the usage much simpler and more intuitive.

antoine-levitt commented 3 years ago

Since nobody seems to think this is an extremely bad idea, I'll make a PR once https://github.com/JuliaLang/julia/pull/38041 is in

mkitti commented 3 years ago

What troubles me about this is the order of the arguments is not clear. I suggest a new name for this function that also communicates the order of the arguments. Keeping the current argument names I would prefer range_start_stop_length(start, stop, length) though this is a bit long.

Part of the reason for the length is that the arguments are hard to unambiguously abbreviate. start, stop, and step share the first two letters. Therefore, I propose a renaming of the arguments:

start -> begin stop -> end

This refers to existing indexing terms:

julia> a = 1:0.2:5
1.0:0.2:5.0

julia> a[begin]
1.0

julia> a[end]
5.0

julia> length(a)
21

julia> step(a)
0.2

We could then have a group of range functions that use the first letters of begin, end, length, and step to indicate the order of the arguments:

# Three args
range_bel( begin_arg, end_arg, length ) = range( begin_arg, end_arg, length = length )
range_bes( begin_arg, end_arg, step ) = range( begin_arg, end_arg, step = step ) 
range_bls( begin_arg, length, step ) = range( begin_arg, length = length, step = step )
range_els( end_arg, length, step ) = range( stop = end_arg, length = length, step = step ) # Needs 38041 

# Two args
range_be( begin_arg, end_arg ) = range( begin_arg, stop = end_arg )
range_bl( begin_arg, length ) = range( begin_arg, length = length )
# range_bs - not clear what length or end might be
range_el( end_arg, length ) = range( stop = end_arg, length = length, step = 1 ) # Needs 38041
range_es( end_arg, step ) = range(1, stop = end_arg, step = step)
range_ls( length, step ) = range(1, length = length, step = step)

johnnychen94 commented 3 years ago

We could then have a group of range functions that use the first letters of begin, end, length, and step to indicate the order of the arguments:

In this case, I'd personally use the constructor, i.e., StepRange(args...) and it is the clearest way while still simple.

I mean, it's totally fine to write some ad-hoc helpers for this, but I don't think they should live in Base.

mkitti commented 3 years ago

In this case, I'd personally use the constructor, i.e., StepRange(args...) and it is the clearest way while still simple.

That does not help this PR where the request is for range(start, stop, length). I would need to look up the order of these arguments every time and would likely confuse it with the order of StepRange(start, step, stop). In this case, would you also oppose the addition of range(start, stop, length)?
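To make the ordering concern concrete, here is a small sketch comparing the `StepRange` constructor with the keyword form of `range`. It only uses calls that exist in Base; the values are arbitrary.

```julia
# StepRange takes (start, step, stop): the step sits in the middle, as in start:step:stop.
a = StepRange(1, 2, 9)

# The keyword form of range names each argument, so the order cannot be confused.
b = range(1, 9; step=2)

println(a == b == 1:2:9)
```

A positional range(start, stop, length) would put a different quantity in the third slot than `StepRange` puts in its second, which is the source of the worry above.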

My position is that I would prefer range_bel(begin_a, end_a, length) over range(start, stop, length) if we were to add this.

mkitti commented 3 years ago

One could just do one of the following in lieu of this PR.

range(a, b, l)  = Base._range(a, nothing, b, l)
range_bel(b, e, l) = Base._range(b, nothing, e, l)

#38041 implements range_start_stop_length, so once that is merged one could just do the following.

import Base: range_start_stop_length

https://github.com/JuliaLang/julia/blob/8c327e964108d1bb702d368af69433bed1063572/base/range.jl#L505-L515

Overall, I find the original proposal confusing. 😕

antoine-levitt commented 3 years ago

Let's not overestimate the utility here of all different possible variants. Range is overwhelmingly used as range(a, b, length=N):

antoine@epsilon ~/.julia/packages $ grep -ir ' range(' * | wc -l
180
antoine@epsilon ~/.julia/packages $ grep -ir ' range(' * | grep length | wc -l
165

Of the remaining usages, most are kind of artificial:

antoine@epsilon ~/.julia/packages $ grep -ir ' range(' * | grep -v length
Colors/kc2v8/src/utilities.jl:    range(start::T, stop::T; kwargs...) where T<:Colorant = range(start; stop=stop, kwargs...)
Compat/qsiOu/test/runtests.jl:@test range(0, 10, step = 2) == 0:2:10
Compat/qsiOu/test/runtests.jl:@test_throws ArgumentError range(0, 10)
Compat/qsiOu/src/Compat.jl:        range(start; stop=stop, kwargs...)
FillArrays/tE9Xq/src/FillArrays.jl:cumsum(x::AbstractFill{<:Any,1}) = range(getindex_value(x); step=getindex_value(x),
Gtk/C22jV/gen/gbox3:    function range(range_::Gtk.GtkRange, min, max)
Gtk/C22jV/gen/gbox3:    function range(spin_button::Gtk.GtkSpinButton, min, max)
Gtk/C22jV/gen/gbox3:    function range(spin_button::Gtk.GtkSpinButton)
Gtk/C22jV/gen/gbox2:    function range(range_::Gtk.GtkRange,min,max)
Gtk/C22jV/gen/gbox2:    function range(spin_button::Gtk.GtkSpinButton,min,max)
Gtk/C22jV/gen/gbox2:    function range(spin_button::Gtk.GtkSpinButton)
NCDatasets/Zat6R/test/perf/benchmark-python-netCDF4.py:    for n in range(v.shape[0]):
NCDatasets/HhdCu/test/perf/benchmark-python-netCDF4.py:    for n in range(v.shape[0]):
PlotThemes/4DCOG/src/juno_smart.jl:        append!(grad, range(colvec[i], stop = colvec[i+1]))
Unitful/1t88N/src/range.jl:# the following is needed to give sane error messages when doing e.g. range(1°, 2V, 5)

So clearly range is mostly used to create equispaced grids. The reason is that other uses already have the a:b:c syntax.

That does not help this PR where the request is for range(start, stop, length). I would need to look up the order these arguments every time and would likely confuse the order of the arguments with that of StepRange(start, step, stop)

Do you really use StepRange directly? Regarding the possible confusion, I agree it's ambiguous but I explained in my original post why I don't think this is too much of an issue. In particular, uses involving length are usually quite different from those involving step, and there's much less chance of confusion than in, say, step and stop. I think it would be pretty clear with the proposal in the OP plus https://github.com/JuliaLang/julia/pull/38041: the prototype is range([start, stop, length]; start, stop, step, length). All valid combinations are accepted.

I like the idea of replacing start by begin and stop by end, but both begin and end are keywords.

jebej commented 3 years ago

maybe we can get linspace back :trollface:

JeffBezanson commented 3 years ago

One downside: python has range(start, stop, step) which is very confusing.

StefanKarpinski commented 3 years ago

I agree that it's weird to require one keyword argument for a function like this, but Python's range and numpy.arange functions take start, stop, step positional arguments, so diverging from that seems like a bad idea. At the same time, adding range(start, stop, step) seems kind of pointless since we already have dedicated syntax for that construction.

Triage notes that range(start, stop) doesn't work despite the docs claiming that the default step is one, but it would probably be fine to make that work.

mkitti commented 3 years ago

You have to give stop as a keyword:

julia> range(1, stop = 5)
1:5

mkitti commented 3 years ago

My recommendation is consider #38041 which creates a Base.range_start_stop_length which implements this function exactly. We could then consider whether it would be worth aliasing and exporting that function.

antoine-levitt commented 3 years ago

@mkitti the point of this is to make things shorter: there's no real point in having a range_start_stop_length which would not be shorter than the kwarg version.

@triage Thanks for discussing this! Oof I did not know about python's range(start, stop, step), that sucks. I would still argue the potential for confusion is very limited for the reasons given above, but I understand the reluctance to do it in this case. Then yes, possibly the least bad solution is to bring back linspace from the dead? (or call it linrange or something)

jebej commented 3 years ago

Remember that we already have LinRange. I think it makes sense to have linspace back to dispatch either to LinRange or the twice-precision StepRangeLength.

DNF2 commented 3 years ago

range(start, stop, step) seems a lot less useful than the proposed range(start, stop, length). Speaking as a Matlab user, I welcome this as a drop-in replacement for linspace. The order of arguments is so ingrained in Matlab users, at least, that it's basically hardwired.

For those who are confused about order, there would still be the keywords, right?

mkitti commented 3 years ago

The implementation for the request here can be currently done in a single line of code which eventually leads to the use of LinRange

Base.range(start, stop, length) = Base._range(start, nothing, stop, length)

Base.range_start_stop_length in #38041 makes this even friendlier due to the lack of leading underscore:

Base.range(start, stop, length) = Base.range_start_stop_length(start, stop, length)

@DNF2, if you want a drop-in replacement for MATLAB's linspace, couldn't you just use LinRange? It's also rather performant if you have MATLAB-level tolerance for floating point errors and are using floats everywhere.

julia> LinRange(0, 20, 61)
61-element LinRange{Float64}:
 0.0,0.333333,0.666667,1.0,1.33333,1.66667,2.0,2.33333,…,17.6667,18.0,18.3333,18.6667,19.0,19.3333,19.6667,20.0

julia> len = typemax(Int)
9223372036854775807

julia> @benchmark LinRange(0, 10, $len)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     0.001 ns (0.00% GC)
  median time:      0.001 ns (0.00% GC)
  mean time:        0.027 ns (0.00% GC)
  maximum time:     0.101 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

If you really want linspace, it's actually still there if you need it in a pinch...

julia> Base._linspace(0. , 20. , 21)
0.0:1.0:20.0

To summarize, the function proposed here is an alias for a function that currently lives in Base. The main thing in contention here is if we should call this function by a different name that is exported.

  1. The original post says LinRange is low-level, but I'm not understanding this argument after the further discussion, and I think expanding upon this would help flesh out the technical merits here. Can we clarify why LinRange does not work? Is there a specific use case where this does not work?
  2. The range call from Python makes this problematic. linspace has some nice attributes of MATLAB compatibility, but there may be a reason to be distinct from MATLAB. Are there any other aliases that could work? rangel, rangelen?

antoine-levitt commented 3 years ago

The original post says LinRange is low-level, but I'm not understanding this argument after the further discussion, and I think expanding upon this would help flesh out the technical merits here. Can we clarify why LinRange does not work? Is there a specific use case where this does not work?

This is detailed in the docs, LinRange is less careful about floating point errors than range, hence the "low-level". Also it has not-very-nice default display, which makes it look kind of like an internal thing (which should probably get fixed if this is the "recommended" way to build range with lengths). Performance of creation is irrelevant, it's just a four-int struct. Performance of indexing/SIMDization/etc could be more interesting (although pretty unlikely to matter in practice).
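The floating point difference mentioned here can be inspected with a short sketch. The endpoints and length below are arbitrary choices for illustration; only `range`, `LinRange`, and standard array functions are used.

```julia
# Same endpoints and length, built two ways.
r  = range(-0.1, 0.3; length=5)   # range tries to correct for floating point error
lr = LinRange(-0.1, 0.3, 5)       # plain linear interpolation between the endpoints

println(typeof(r))
println(collect(r))
println(collect(lr))

# Largest elementwise difference between the two constructions.
println(maximum(abs.(collect(r) .- collect(lr))))
```

Any difference is at the level of rounding error, which is why it rarely matters for plotting but can matter when exact values such as 0.0 are expected.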

Generally speaking, in julia for this kind of work you don't usually call a constructor directly, and therefore using LinRange here seems weird.

I don't actually particularly care which of StepRange or LinRange a 3-arg range would dispatch to, I just want a single standard, agreed-upon, non-kwarg, non-constructor API to create a range to plot things to reduce my cognitive overhead.

mkitti commented 3 years ago

Generally speaking, in julia for this kind of work you don't usually call a constructor directly, and therefore using LinRange here seems weird.

I don't actually particularly care which of StepRange or LinRange a 3-arg range would dispatch to, I just want a single standard, agreed-upon, non-kwarg, non-constructor API to create a range to plot things to reduce my cognitive overhead.

I don't really see calling the constructor as an issue, but I can see that having to push the shift key could be annoying. Would calling it linspace or linrange and defining them as follows work for everyone?

julia> linspace(start, stop, length) = Base._range(start, nothing, stop, length)
linspace (generic function with 1 method)

julia> linrange(start, stop, length) = Base._range(start, nothing, stop, length)
linrange (generic function with 1 method)

Would you want a one or two argument version or having a length argument of nothing be supported?

StefanKarpinski commented 3 years ago

Maybe defining range(start, stop, length) isn't so bad even though it diverges from Python:

  1. Giving the length is generally better than giving the step so it makes sense that this takes precedence.
  2. This is similar to the classic linspace function which was previously called linrange in Julia because it doesn't create a space, it creates a range. But the lin prefix is redundant—all ranges are linear. So just calling it range makes sense.
  3. If we make range(start, stop) default to step = 1 then at least that method will agree with Python (well, aside from ours including stop) which seems like the one that Python users mostly reach for.
  4. As originally argued here, since we already have start:step:stop as dedicated syntax, it seems a bit silly to make range(a, b, c) mean the same thing but with a slightly different ordering.

jebej commented 3 years ago

I think the name linspace isn't intended to mean creating a space, it's just a shortcut for "linearly-spaced vector/range": https://www.mathworks.com/help/matlab/ref/linspace.html

mkitti commented 3 years ago

Numpy also has linspace(start, stop, length): https://numpy.org/doc/stable/reference/generated/numpy.linspace.html

There length is called num with the documentation "Number of samples to generate. Default is 50. Must be non-negative."

https://github.com/numpy/numpy/blob/v1.19.0/numpy/core/function_base.py#L24

Likewise, numpy and matlab emulators have followed suit: https://www.tensorflow.org/api_docs/python/tf/linspace https://www.rdocumentation.org/packages/pracma/versions/1.9.9/topics/linspace

A potential issue for us is that in each of these cases length has a default value (MATLAB: 100, Numpy: 50)

mkitti commented 3 years ago

3. If we make range(start, stop) default to step = 1 then at least that method will agree with Python (well, aside from ours including stop) which seems like the one that Python users mostly reach for.

@JeffBezanson commented on this before as noted in the source code: https://github.com/JuliaLang/julia/pull/28708#issuecomment-420034562

Also range(start, stop) in Python does not include stop:

In [16]: [i for i in range(1,5)]
Out[16]: [1, 2, 3, 4]

In [17]: [i for i in range(5)]
Out[17]: [0, 1, 2, 3, 4]

As much as I like range, I think Python has imprinted its conventions on it.
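For comparison on the Julia side, here is a sketch of what Python's half-open convention would look like next to Julia's inclusive ranges. The helper `pyrange` is hypothetical and exists only for this illustration.

```julia
# Hypothetical helpers that mimic Python's half-open range; not part of Base.
pyrange(start, stop) = start:(stop - 1)
pyrange(stop) = pyrange(0, stop)

println(collect(pyrange(1, 5)))   # stop excluded, as in the Python session above
println(collect(pyrange(5)))      # starts at 0, stop excluded
println(collect(1:5))             # Julia's own range includes the stop
```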

StefanKarpinski commented 3 years ago

We're clearly not going to not include the stop point for range(start, stop) in Julia. So this function is already going to deviate somewhat from Python. The question is whether it's ok to deviate further and the existence of some deviation already does seem like it supports deviating further as makes sense. I'm kind of tired of worrying about what Python decided to do, so I'm giving this proposal my 👍🏻

mkitti commented 3 years ago

Also Range in the C++ library Boost is defined " A Range provides iterators for accessing a half-open range [first,one_past_last) of elements and provides information about the number of elements in the Range" https://www.boost.org/doc/libs/1_75_0/libs/range/doc/html/range/concepts/overview.html

mkitti commented 3 years ago

LinRange and linrange are unique to Julia. If you Google either, Julia is the first hit. Perhaps we should forge a new glorious Julian path where we do not care about precedent from MATLAB or Python, with none of the arbitrary lengths, 0-based indexing, and exclusive stops.

# 1-arg version, 1:stop recommended
linrange(stop) = 1:stop
linrange(::Nothing, stop) = 1:stop

# 2-arg version, start:stop recommended
linrange(start, stop) = start:stop
linrange(start, stop, ::Nothing) = start:stop

# 3-arg version, conventions as proposed here, length defaults to `stop - start + 1 == length(start:stop)`
linrange(start, stop, ::Nothing) = start:stop
linrange(start, stop, length) = LinRange(start, stop, length)

# primary syntax for specifying step is
start:step:stop

mkitti commented 3 years ago

We're clearly not going to not include the stop point for range(start, stop) in Julia. So this function is already going to deviate somewhat from Python. The question is whether it's ok to deviate further and the existence of some deviation already does seem like it supports deviating further as makes sense. I'm kind of tired of worrying about what Python decided to do, so I'm giving this proposal my 👍🏻

I've likely already lost this debate, and that's perfectly fine. I'm partly here because @timholy wanted to "encourage discussion" on this topic, and I was a bit frustrated with the documentation of range earlier (#37875). I would prefer something as unambiguous and with as little arbitrariness in argument order or in argument defaults as possible.

The current discussion seems biased towards those with prior MATLAB experience. We should seek to include users with some prior Python experience in this discussion as this will likely create the most cognitive dissonance for them as they switch between languages.

range( length, start = 1 , stop = length + start - 1 )

If you'll indulge me a little further on this proposal:

Currently, the easiest way to define a range by start and length is start .+ (0:length-1) or via keyword: range(start; length=length). Since we are breaking clearly with Python on range, perhaps the two positional argument version should not be range(start, stop) but range(start, length) or even range(length, start). The logic is that range(start, stop) is already well served by start:stop. This parallels the reasoning that range(start, stop, step) is not needed because it is well served by start:step:stop. While range(start, length) or range(length, start) would seem to contradict current documentation, a two positional argument version is not implemented. In my proposed two argument function, we assume step = 1. Since we are making it easier to define a range by start, stop, and length we should consider making it easier to define a range by just start and length.
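The existing ways of building a range from a start and a length can be checked against each other with a short sketch. The values are arbitrary, and `len` is used as the variable name to avoid shadowing Base's `length`.

```julia
start, len = 3, 5

a = start .+ (0:len-1)            # broadcasting form
b = range(start; length=len)      # keyword form
c = start:(start + len - 1)       # explicit stop

println(a == b == c)
```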

Continuing to follow the logic, the one positional argument version should then be range(length) which is actually the same as range(stop). We assume start = 1 and step = 1. Certainly, range(start) makes no sense unless we allow ranges with infinite length, which I think is a rare case.

The above sequence range(start, stop, length) -> range(start, length) -> range(length) seems hard to define. Maybe then the three positional argument definition should actually be range(length, start=1, stop=length+start-1) which is easy to document. Then we would have range(length, start, stop) -> range(length, start) -> range(length).

If we're going to break Python's range convention, we should do so for maximum benefit for Julia. Just as we lack an easy way to specify a range based on start, stop, and length, we also lack an easy way to define a range based on start and length.

To summarize, I propose:

"range(length, start = 1, stop = length + start -1)"
range(length, start = 1, stop = nothing) = Base._range(start, nothing, 
                                                       isnothing(stop) || length + start - 1 == stop ? nothing : stop,
                                                       length)
# If `stop` is `nothing` or equal to `length + start - 1`, pass `nothing` as `stop` so we can fallback to `UnitRange`

Thanks for the discussion.

antoine-levitt commented 3 years ago

I'm kind of tired of worrying about what Python decided to do, so I'm giving this proposal my 👍🏻

Yay! OK so I'm sticking to the original plan: once https://github.com/JuliaLang/julia/pull/38041 is in I'll make a PR.

we also lack an easy way to define a range based on start and length

Yes, but do we actually need it? For sure it's nicer than start:start+length-1, which I do occasionally use, eg to address subblocks of matrices, but I wouldn't naturally reach for range to do that. I think we should focus on having nice syntax for the three most common cases: (start, stop) (with implicit step=1), (start, stop, step) and (start, stop, length). The first two are covered by : already, so there's just the last one missing. All other cases are nicely handled by the kwarg range.
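As an illustration of the sub-block use case mentioned above, here is a minimal sketch with an arbitrary 6x6 matrix and block position.

```julia
A = reshape(1:36, 6, 6)           # 6x6 matrix with A[i, j] == i + 6 * (j - 1)

start, len = 3, 2
rows = start:start+len-1          # rows 3 and 4
block = A[rows, rows]             # 2x2 sub-block

println(block)
println(rows == range(start; length=len))
```

Both spellings give the same index range; the colon form needs the off-by-one arithmetic, the keyword form does not.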

A further argument for preferring (start, stop, length) to (start, length) is that while the latter might be used in relatively involved array indexing (the kind you have to stop and think about off-by-ones), the former is the kind you would find as the very first line of code of a plotting tutorial.

Re range(length, start, stop) The potential for confusion there is just too great I think. Plus, it's breaking.

mkitti commented 3 years ago

Default Arguments

Thank you, your honors, and may it please the court.

A further argument for preferring (start, stop, length) to (start, length) ...

To be clear, I'm not arguing for one or two argument range instead of a three argument range. I'm asking what would be the best one or two argument range to co-exist with a three argument range. The most intuitive manner to explain and document a one, two, and three argument range is by default arguments. In Julia, the default arguments are the terminal arguments so this potentially affects the order of the three argument version.

While we may defer adding the one or two argument versions of range, adding a three argument version of range would impact the consistency of adding them at a later time. I suggest we consider what those might be and what would be consistent.

One Argument range(length) === range(stop)

we should focus on having nice syntax for the three most common cases

The most common range syntax in Python is the single argument version. Just check the Python tutorials. It is not clear to me how putting length first or last in a three-argument range makes the syntax more or less nice. A one-argument range is nice syntax in Python, and we should consider if it could also be nice syntax in Julia.

In [1]: for x in range(5):
   ...:     print(x)
   ...:
0
1
2
3
4

We may also want this in Julia, but one-based:

julia> for x in range(5)
           println(x)
       end
1
2
3
4
5

Having a single-argument range would make it easier for someone to come over from one of today's most popular programming languages.

Since Julia does not have a single-argument range, this is not breaking. It is worth considering whether we want a single argument version now. It would be great if one argument range were consistent with the two- and three-argument order that we could easily communicate in documentation.

Two argument range

For the two argument range, the main question is what is the best default argument for the three argument version. Is it start, stop, or length?

The possibilities without considering order would be:

  1. range(start, stop): This is natural and intuitive, but we already have start:stop which is convenient and shorter so no one would ever use this. It's also not intuitive for stop as the second argument to become the single argument for the one argument version. length = stop - start + 1 is a common calculation.
  2. range(length, start): We do not have a compact syntax for this, and it would be useful. stop = start + length - 1
  3. range(stop, length): While similar in utility to above, it's awkward to define a range from the end. However, start = 1 is a natural default though it is one that everyone should know.

While range(length, start) is not the first two-argument range method I would consider, I do think having this adds the greatest utility of the different two argument versions. The length should come first so single-argument range(length) is a natural consequence.
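The three relations quoted in the list above (each pair of quantities determines the third when step = 1) can be verified with a small sketch using arbitrary values.

```julia
start, stop = 3, 7

# Option 1: (start, stop) given, length derived.
len = stop - start + 1
println(length(start:stop) == len)

# Option 2: (length, start) given, stop derived.
println(last(range(start; length=len)) == start + len - 1)

# Option 3: (stop, length) given, start derived.
println(stop - len + 1 == start)
```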

Three argument range

The terminal positional arguments have to be the ones with default values for the sake of documentation and consistency.

Re range(length, start, stop) The potential for confusion there is just too great I think. Plus, it's breaking.

It is no more breaking than range(start, stop, length) and length coming at the beginning is now more intuitive to me if I think about the single argument version. To demonstrate the non-breaking nature of range(length, start, stop), I will show in the next section that it co-exists with the current keyword dependent range.

Implementation

First, let's establish that no range with only positional arguments currently works.

julia> range(6)
ERROR: ArgumentError: At least one of `length` or `stop` must be specified
...

julia> range(6, 0)
ERROR: ArgumentError: At least one of `length` or `step` must be specified
...

julia> range(101, 0, 1)
ERROR: MethodError: no method matching range(::Int64, ::Int64, ::Int64)
...

Since no position-only argument version of range currently exists, I do not see how adding them would break anything. The keyword-dependent range should continue to work as shown below.

Now let's succinctly define the one, two, and three argument range which we can succinctly document as range(length, start = 1, stop = length + start -1)

# This line is the whole PR to make one, two, and three positional argument `range` happen.
"range(length, start = 1, stop = length + start -1)"
julia> range(length, start = 1, stop = nothing) = Base._range(start, nothing,
                                                              isnothing(stop) || length + start - 1 == stop ? nothing : stop,
                                                              length)
range (generic function with 3 methods)

julia> range(6) # UnitRange{Int64}, common Python-like use case
1:6

julia> range(6, 0) # UnitRange{Int64}
0:5

julia> range(6, 0, 5) # UnitRange{Int64} !!!
0:5

julia> range(101, 0, 1) # StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
0.0:0.01:1.0

julia> range(0 ; length = 5) # UnitRange{Int64}
0:4

julia> range(1 ; stop = 10) # UnitRange{Int64}
1:10

julia> range(0; length = 5, step = 0.1)
0.0:0.1:0.4

julia> range(0, 1; length = 101)
0.0:0.01:1.0

julia> range(0, 5, length = 6) # StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}
0.0:1.0:5.0

julia> # Compare the above to range(6, 0, 5). The above should probably also be UnitRange{Int64}

julia> range(0, 1; step = 0.01 ) 
0.0:0.01:1.0

# The following is consistent with current behavior. We have not proposed that this should work.
# It still does not work after adding the one to three positional argument version of `range`
julia> range(101, 0, 1; step = 0.01) 
ERROR: MethodError: no method matching range(::Int64, ::Int64, ::Int64; step=0.01)
Closest candidates are:
  range(::Any, ::Any, ::Any) at REPL[8]:1 got unsupported keyword argument "step"
  range(::Any, ::Any; length, step) at REPL[8]:1
  range(::Any; length, stop, step) at REPL[8]:1
Stacktrace:
 [1] top-level scope at REPL[17]:1

To summarize, we have a range with one to three positional arguments that is completely compatible with the existing keyword dependent range. The one to three positional argument range can be defined succinctly due to careful choice of the argument order in range(length, start, stop). This also makes it easy to document the one, two, or three argument version as follows, which is far superior to the paragraphs you currently need to read to use the current keyword-dependent range.

range(length, start = 1, stop = length + start -1)
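The same length-first signature can also be sketched without the internal `Base._range`, using only the public keyword form of `range`. The name `lrange` is hypothetical, chosen so that `range` itself is not redefined; this is an illustration of the idea, not the implementation proposed above.

```julia
# Hypothetical length-first helper; `lrange` is an assumed name, not a Base function.
function lrange(length::Integer, start=1, stop=nothing)
    if stop === nothing || stop == start + length - 1
        # Unit step: only start and length are needed.
        return range(start; length=length)
    end
    # Otherwise the step is derived from start, stop, and length.
    return range(start, stop; length=length)
end

println(lrange(6))            # start defaults to 1
println(lrange(6, 0))         # six values starting at 0
println(lrange(6, 0, 5))      # stop consistent with a unit step
println(lrange(101, 0, 1))    # 101 points from 0 to 1
```

Because the defaults are trailing positional arguments, the one-, two-, and three-argument forms all come from a single definition.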

Summary

Because of ease of implementation, documentation, and intuition, length makes sense as the first argument. As shown above we can implement and document a one to three argument range easily and compatibly. length as the first argument can be intuited because length is the only argument that really makes sense as both the first argument of many arguments and as a single argument. stop makes sense as a single argument, but not as the first argument. Neither start nor step is sufficient by itself without setting arbitrary defaults for length or stop.

While we may choose to defer one and two argument range for a later time, choosing length as the first argument of three allows for a consistent path forward. One argument range provides familiarity to Python users and softens the difference between one and zero based indexing. Two argument range(length, start) adds new utility rather than being redundant like range(start, stop).

length as the first argument of three can be the beginning of a consistent and intuitive positional argument system for range. Please, consider length as the first argument.

mkitti commented 3 years ago

OK so I'm sticking to the original plan: once #38041 is in I'll make a PR.

It is not clear to me #38041 will be merged. It has received less commentary despite its merit. To provide an independent option to merge, we should submit an independent pull request that is not dependent on #38041 given the interest here.

antoine-levitt commented 3 years ago

The most common range syntax in Python is the single argument version. Just check the Python tutorials. It is not clear to me how putting length first or last in a three-argument range makes the syntax more or less nice. A one-argument range is nice syntax in Python, and we should consider if it could also be nice syntax in Julia.

This is only because python lacks explicit ranges with :, which we have. The idiomatic julia way of manipulating integer ranges defined by start, stop and possibly step, is :. Anybody coming from python is probably relieved that they don't have to use range so much. Julia's range is only for "exotic" cases, and should continue being so.

OK, I agree with your argument that range(length, start, stop) packs more utility than range(start, [stop, length]) because range(length, start) is more useful than range(start, stop). I mistakenly thought that range(start, stop) was currently allowed, but I see that's not the case. However range(start, stop, length=len) is currently valid, and having both that and range(length, start, stop) is much too confusing to me. With start, stop, length, we lose the kwarg-free length, start, but we keep consistency with the current version.

I was under the impression that https://github.com/JuliaLang/julia/pull/38041 was less controversial than this change and would certainly be merged faster. The implementation after that PR is nicer so I'd rather wait, there's no hurry here. If it looks like https://github.com/JuliaLang/julia/pull/38041 is not going to be merged I'll do a PR against master.

antoine-levitt commented 3 years ago

So to summarize my current proposal is to start off from the semantics from https://github.com/JuliaLang/julia/pull/38041: three arguments must be provided, or two non-step. Then we simply add length as positional argument. This documents as range([start, stop, length]; start, stop, step=1, length). This means that range(start) is forbidden, but range(start, stop) and range(start, stop, length) are OK.
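The positional part of this summary can be sketched with a hypothetical wrapper, `prange`, that forwards to the keyword form of `range`. The name is an assumption made for illustration; the proposal is to add these methods to `range` itself.

```julia
# Hypothetical stand-ins for the positional methods described above; `prange` is an assumed name.
prange(start, stop) = range(start; stop=stop)                    # step defaults to 1
prange(start, stop, length) = range(start, stop; length=length)  # step derived from length

println(prange(1, 5))
println(collect(prange(0, 1, 3)))

# No one-argument method is defined, mirroring "range(start) is forbidden".
println(applicable(prange, 1))
```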

mkitti commented 3 years ago

I would hope for better documentation than that. I need to know which combination of positional arguments and keywords actually work. I have no idea by looking at range([start, stop, length]; start, stop, step=1, length)

My best shot is something like ...

range(start, stop, [length]) - Positional interface for `range`, at least `start` and `stop` must be provided and `length` is optional
range( ; [start, stop, step, length]) - Provide any combination of three keywords
range( ; start, stop) - Provide both start and stop as keywords
range( ; start, length) - Provide both start and length as keywords
range( ; stop, length) - Provide both stop and length as keywords
range(start; stop, [step, length]) - Provide start as a positional argument along with stop and optionally step or length as keywords
range(start, stop; step, length) - Provide start and stop as positional arguments and either step or length as a keyword argument

antoine-levitt commented 3 years ago

I think range([start, stop, length]; start, stop, step=1, length) + the rule that at least three args (two if non-step) are needed is clear enough, no?

mkitti commented 3 years ago

Well, if you specify start, stop, and length, step is definitely not 1 anymore. Also what combination of parameters gets you a UnitRange?

Anyways, for the related PR to this issue, I would just focus on the positional arguments. Let #38041 deal with the keyword arguments. I would document the all-positional range as a distinct method from the keyword-based method.

antoine-levitt commented 3 years ago

Also what combination of parameters gets you a UnitRange?

I would say that's an implementation detail as far as doc is concerned. Re docs, let's wait until https://github.com/JuliaLang/julia/pull/38041 is merged and we can discuss on the upcoming PR.

mkitti commented 3 years ago

UnitRange is mentioned in the current documentation.

help?> range
search: range LinRange UnitRange StepRange StepRangeLen trailing_zeros AbstractRange trailing_ones OrdinalRange

  range(start[, stop]; length, stop, step=1)

  Given a starting value, construct a range either by length or from start to stop, optionally with a given step
  (defaults to 1, a UnitRange). One of length or stop is required. If length, stop, and step are all specified, they
  must agree.

From that you might expect a UnitRange by default or if step = 1. But that is not what you will get:

julia> range(1, 3.; step=1)
1.0:1.0:3.0

julia> typeof(range(1, 3.; step=1))
StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}

I wrote up documentation on how to create a UnitRange in the extended help of range in #37875

https://github.com/JuliaLang/julia/blob/20b84f401c15653c0ae303118a7c7dd47cd1cf1c/base/range.jl#L158-L167

mkitti commented 3 years ago

@MasonProtter just had a great idea in the Zulip chat. Why don't we give start and stop together via some connected syntax? For example:

julia> Base.range(unit_range::UnitRange, length=nothing) = Base._range(unit_range.start, nothing, unit_range.stop, length)

julia> Base.range(pair::Pair, length=nothing) = Base._range(pair.first, nothing, pair.second, length)

julia> range(1:2)
1:2

julia> range(1:2, 5)
1.0:0.25:2.0

julia> range(3 => 5.6, 261)
3.0:0.01:5.6

We could even switch the order around and have both forms.

julia> Base.range(length, unit_range::UnitRange) = Base._range(unit_range.start, nothing, unit_range.stop, length)

julia> Base.range(length, pair::Pair) = Base._range(pair.first, nothing, pair.second, length)

julia> range(5, 1:2)
1.0:0.25:2.0

julia> range(1:2, 5)
1.0:0.25:2.0

julia> range(100, 0 => 2π)
0.0:0.06346651825433926:6.283185307179586

mkitti commented 3 years ago

Noting that Colors.jl has a three-position range with length = 100 http://juliagraphics.github.io/Colors.jl/stable/colormapsandcolorscales/#Base.range

antoine-levitt commented 3 years ago

As discussed in https://github.com/JuliaLang/julia/pull/39071, using linrange here would have a nice consistency with logrange, and not using range would allow for other features like the selection of endpoint inclusion.

mkitti commented 3 years ago

It would be good to revisit the conversations in #25896 and #28708 to see where many of these matters were previously discussed, including the existence of linrange(start, stop, length) and logrange and their deprecation. If something has changed since then, it would be nice to point out what is different.

JeffBezanson commented 3 years ago

Triage is in favor of this.

StefanKarpinski commented 3 years ago

Triage is inclined to do this and also supports range(stop) to mean range(start = 1, stop = stop).

mkitti commented 3 years ago

Triage is inclined to do this and also supports range(stop) to mean range(start = 1, stop = stop).

Does this mean no two-argument version, then? No version with two positional arguments is currently defined.

Per above, I would be interested in range(start, length) or range(length, start) over range(start, stop) because colon handles the latter case well: start:stop.

antoine-levitt commented 3 years ago

I'm kinda disappointed we're not going with linrange instead, but triage hath spoken. I'll make a PR tomorrow.

It's impossible to make the 2-arg version be anything other than range(start, stop) without being inconsistent with the existing range(start, stop; kwargs). I'll make a PR with range(stop), range(start, stop) and range(start, stop, length) once the refactor PR is merged.

StefanKarpinski commented 3 years ago

Adding linrange in addition with similar but different behavior just seems a bit waffling. The range function already produces a linear range, so leaving it with an incomplete API and adding linrange instead so that we don't have to pick one behavior over the other seems like the design by committee solution that leaves people forced to memorize two different, incompatible behaviors instead of just learning one. We should pick a behavior for range and go with it.

I'm torn on which set of positional signatures makes more sense. One option is the "stop-oriented" design:

The other option is the "length-oriented" design:

Note that these only differ in the behavior of the middle two-argument signature. Are there any other positional schemes that have been proposed that I'm missing? The other one would be for step to be the third positional argument, but we've already rejected that since start:step:stop exists.
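One reading of the distinction, as a sketch: assume the two schemes differ only in whether the second positional argument is an endpoint or an element count. The names `stop_oriented` and `length_oriented` are hypothetical and are not defined in Base.

```julia
# Hypothetical names for illustration only; neither is defined in Base.
# "Stop-oriented": the second positional argument is the endpoint.
stop_oriented(start, stop) = start:stop

# "Length-oriented": the second positional argument is the element count.
length_oriented(start, len) = range(start; length=len)

stop_oriented(2, 5)     # 2:5, four elements ending at 5
length_oriented(2, 5)   # 2:6, five elements starting at 2
```

The same call `f(2, 5)` therefore yields different ranges under the two schemes, which is why only one of them can be chosen for a two-argument `range`.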

antoine-levitt commented 3 years ago

Yeah, I just liked the consistency linrange/logrange and the possibility to add more kwargs to linrange. It's a bit orthogonal anyway in the sense that we can still do range(start, stop, length) and add linrange later if we want (although the point of this would be diminished, of course).

The most consistent version would be to follow the principles "three arguments, or two non-step, in which case step is 1" plus the order (start, stop, length) to the end. This would mean forbidding one positional argument (since it doesn't match the rule) and having range(start, stop) (which we can't really change for compatibility with the current range(start, stop; kwargs) anyway). Forbidding range(stop) would also prevent Python people from tripping themselves with range(stop) (and, let's face it, it's not particularly useful).

mbauman commented 3 years ago

Forbidding range(stop) would also prevent Python people from tripping themselves with range(stop)

I see a Julian range(3) == 1:3 as being the appropriate conversion from the Python range(3) == [0,1,2] (ok, that's Python 2 but you get the point). Perhaps take these one at a time: first the three-arg (start, stop, length), which is highly desired and agreed upon, then separately the 1-arg (stop,). Note, too, that it becomes more useful if it returns a Base.OneTo, which is commonly used in high-performance axis munging.
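For context on the Base.OneTo remark, a small sketch of where that type already appears: `axes` of an ordinary `Vector` returns a `Base.OneTo`, which compares equal to the corresponding `UnitRange` but encodes in its type that it starts at 1. The variable names are illustrative.

```julia
v = [10, 20, 30]

ax = axes(v, 1)        # Base.OneTo(3)
ax isa Base.OneTo      # true
ax == 1:3              # true: same elements as the UnitRange
first(ax)              # 1, guaranteed by the type
```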