Open librasteve opened 9 months ago
Seems that use of Nil as an endpoint for Range is already problematic:
Welcome to Rakudo™ v2023.10. Implementing the Raku® Programming Language v6.d. Built on MoarVM version 2023.10.
Nil..0 #Nil..0
but
0..Nil
Use of Nil in numeric context
in block <unit> at <unknown file> line 1
in any <main> at /Users/stephenroe/.rakubrew/versions/moar-2023.10/bin/../share/perl6/runtime/perl6.moarvm line 1
in any <entry> at /Users/stephenroe/.rakubrew/versions/moar-2023.10/bin/../share/perl6/runtime/perl6.moarvm line 1
So somewhere in the code there's an "implicit assumption" that Nil < 0
So, a cleaner approach would be to define a (new) symbol for the null Range such as:
𝟘
𝟘 MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO Unicode: U+1D7D8, UTF-8: F0 9D 9F 98
with texas variant like
Range()
Then "all" that needs to be done in core is:
https://docs.raku.org/language/operators#Identity would also need to be updated
I'm not following your reasoning. To me 0..0
is equivalent to a List containing a single 0. As such I expect it to be True as a non-empty List coerces to True.
We can express an undefined Range already. It is (Range)
.
I do agree that ranges with undefined values (Nil, Type-Objects) should result in an undefined Range, so that any :D
-type smily would triggers as expected. My Sprachgefühl (sorry, no english word available) tells me, Nil..Nil
should turn into (Range)
because Nil
indicates that absense of a value but definite Range
-objects are values. Thus, here is something deeper amiss. The construction of such an impossible Range object could be a compile time error, because the value is impossible. For the case that the object has to be contructed a runtime, things get even more problematic. Nil.Int
warns and turns the absense of a value into 0
. If we want consistency, we should preserve Nil
and warn. So, Nil..Nil
should iterate to Nil
, have .so & .defined ~~ False
and .elems == 0
.
Ranges with type objects like Int..10
, Any..Any
should at least return (Range)
and thus be undefined and not definite. I'm undecided if they should warn
but leaning to, yes, they should.
I can't thinks of a case where the construction of an impossible Range would not be a bug. So fail
with a distinct X::Range:Impossible would also be an option. That way we can produce a proper error message that contains both endpoints given to the constructor of Range
to help debugging.
The whole thing is tricky to decide because Raku is ment to soldier on, even if odd values pop up in any container that is not guarded by :D
.
What about:
Ranges with Nil
as an endpoint become failures. (Rationale: Nil
denotes absence of a value or a benign failure. Using one as a range endpoint promotes to a non-benign failure.)
There's an "EmptyRange" symbol that denotes a range with no elements (as against a range with one element, 0
).
The syntax I suggest for an "EmptyRange" is EmptyRange
. (Rationale: If we're to have a built in Unicode symbol for an empty range then 𝟘
as the positional complement to the associative / set related ∅
seems OK, but is on the other hand disappointing in that it undermines the value to users of using the full range of 𝟘
, 𝟙
etc for their own purposes. Imo the alphabetic equivalent would best not be Range()
because that most naturally remains what it is today, a coercion type object equivalent to Range(Any)
. There's currently an error if Range(Empty)
is written as a function call, but it's accepted as constraint on a parameter in a signature, and doesn't feel right for other reasons. I don't think it's right to let Range.new
be given no arguments; that would mean we miss the opportunity to catch that happening by mistake.)
The semantics I suggest for "EmptyRange" is that: it's a Range
instance, so it's :D
; it's also .defined
; it boolifies as False
. Other semantics as @librasteve suggests. (Rationale. None, other than it makes sense to me. :))
I find that the best way to characterize an interval (a Range) is that it is a set of values such that the set is possibly of infinite cardinality, eg when you have 2 real numbers or 2 character strings as endpoints, in contrast with a typical set defined in terms of an enumeration of individual values and therefore is of finite cardinality. This is particularly true in the case of a set of intervals which can be non-continuous. And so, when considering what is reasonable behavior, we should try and define semantics for a Range which are logically consistent with semantics for a set.
Continuing on, for the generic case of a Range of Any...
I feel that while it may not make sense conceptually to sort different types, eg it doesn't make sense to sort numbers and strings, if Raku at least provides a deterministic default generic sort that can compare two Any and give a defined result, then a Range can always function with any 2 end points no matter what they are. Same as generic equality should always be defined between any 2 things, returning a defined result, although arguably that is easier to argue for and implement than generic sorting. And if it doesn't make conceptual sense, the solution is to just not combine both types in the same Range, but if someone does it, it should "work".
Some very good & thoughful comments - let me try and summarize my reaction (corrections to the are welcome if anything is misconstrued):
As a consumer of raku, I think the following makes most sense:
(0..0).so #True
<=== this
this is what we have today in 6.e.PREVIEW - phew!
rationale
(0..0).Set #Set(0)
(0..0).Set.so #True
(0..0).List #(0)
(0..0).List.so #True
(Range)
is good (thanks @gfldex ! - I couldn't see the wood for the trees)
(Nil...Any) or (Any...Nil) or (Nil..Nil) returns Nil
ie. Soldier on by propagation of Nil outwards (compile time or runtime)
(Any:U...Any:D) or (Any:D...Any:U) or (Any:U..Any:U) returns (Range)
ie. Propagate undefinedness outwards
Implementors (see below) can use (Range)
or its derivatives as the return value when no defined result exists.
I am not sure about warn.
Specifically on my Math::Interval module, this has the following effect:
> my $i1= (0..0).Interval #0..0
> my $i2= (1..1).Interval #1..1
> my $i3 = $i1 ∩ $i2 #Set() <== this is now wrong
> my $i3 = $i1 ∩ $i2 #(Interval) <== I will change to this
> (Interval) ~~ Range #True (since class Interval is Range)
I feel that while it may not make sense conceptually to sort different types, eg it doesn't make sense to sort numbers and strings
To reassure this: 1..'foo'
wouldn't even work currently, as the type of the left side governs the understood type of the Range
.
As I see it, ...
triple-dot Sequences are currently broken: #407
Whatever is decided for Ranges, please table it until ...
triple-dot Sequences are fixed. Otherwise we won't have a reliable indexing mechanism in Raku.
Thank you.
@jubilatious1 ... I am relaxed about what the core dev team decides has priority, but I am surprised that comma separated triple-dot Sequences are the go to option for array indices - anyway that issue is the place to discuss more fully, I guess...
Currently, v.6.e.PREVIEW, we have the following:
My (naive) assertions are that:
0..0
should be reserved as the (only) 'zero Range' so thatand that (if ever defined) a Range div Range operation could then fail with 0..0 as the denominator
and
Nil..Nil
should be reserved as the (only) 'null Range' so thatSince 6.e.PREVIEW is making a breaking change to the behaviour of Nil..Nil, I would suggest we try to squeeze in the latter as a fix prior to closing 6.e production...?
See this SO question for some context: https://stackoverflow.com/questions/77608044/is-there-an-elegant-way-in-raku-to-see-if-two-ranges-overlap