Open luigip opened 3 years ago
this is the Scala 2 bug tracker, but the same behavior does occur in Scala 2
@scala/collections ?
Just to add: toSeq
doesn't actually fix it; you have to use toList
etc
scala> ('a' to 'z').toSeq.reverse
val res2: scala.collection.immutable.NumericRange[Char] = empty NumericRange z to a by
scala> ('a' to 'z').toList.reverse
val res3: List[Char] = List(z, y, x, w, v, u, t, s, r, q, p, o, n, m, l, k, j, i, h, g, f, e, d, c, b, a)
note that 2.12.15 doesn't have this problem
I am torn, because on the one hand, Char
s are not numbers (I don't care what they're backed by), and on the other hand, it's quite convenient to be able to have a range of them.
anyway, I believe the problem is that Char
s are, to the extent that they're numbers at all, unsigned. thus,
scala> ('a' to 'z').step.toInt
val res4: Int = 1
scala> ('a' to 'z').reverse.step.toInt
val res5: Int = 65535
Interestingly, this won't compile with a negative number, saying it needs to be a Char, but does with a positive number... :shrug:
see above about being unsigned
@NthPortal char is a numeric value type https://scala-lang.org/files/archive/spec/2.13/12-the-scala-standard-library.html#numeric-value-types
just replying to whether it's a number. Signedness is a nice catch.
this is actually a fundamental problem with NumericRange
; it doesn't work for unsigned numeric types. one could theoretically write their own u32
type, and it would not be possible to create a reverse range for that type (which is very much numeric) either. I believe it would require a fundamental change such as having a boolean to track if the range is reversed.
we might be able to fix the problem for Char to Char
by adding a custom (and private) subtype of NumericRange
specifically for Char
, and have .reverse
do some magic. I don't know if that would be feasible while maintaining bincompat, but my instinct says it is.
char is a numeric value type
Char
is numeric in that it represents a UTF-16 code point (which is a number), but numeric operations don't actually make any sense at all for Char
. 'a' + 'z'
is nonsense. 'a' * 'z'
is nonsense. 'a' to 'z' by '<START OF HEADING>'
is nonsense, but is what 'a' to 'z'
actually means (but you'd never know if I swapped '<START OF HEADING>'
for the wrong character). 'a' to 'z'
happens to get you the result you want, but adding code points is a nonsensical operation. fundamentally, I think supporting 'a' to 'z'
is wrong; it works for that alphabetical range, but anything outside the English alphabet (or even mixing uppercase and lowercase) is meaningless and arbitrary. For example, what does '0' to 'Q'
mean?
if you want a range of codepoints, (Int to Int).reverse.view.map(_.toChar)
will be clearer, and is also incidentally a workaround to the problem (e.g. ('a'.toInt to 'z'.toInt).reverse.view.map(_.toChar)
)
note that 2.12.15 doesn't have this problem
to clarify, in 2.12 ('a' to 'z').reverse
returns a Vector
. I've not yet gone digging to find out why.
Edit: in 2.13, NumericRange#reverse
returns a NumericRange
, so we cannot have it do what 2.12 does by returning a Vector
regardless.
I kind of want to override .reverse
on NumericRange[Char]
to throw an exception with the message "Sadness. You have chosen to treat a character as a number, and now look what has happened."
Just from a user's POV, I'm not aware - and I see no suggestion in the Scaladoc - that a Range of Chars being treated as a NumericRange
behind the scenes... conceptually I want it to be treated as a hypothetical AlphabeticRange
. So perhaps the exception should be a bit more apologetic - "Dang, we redesigned the collections and thought we could treat Chars like any other number, but sadly, it seems we can't. BOOM. Sorry. Please revert to Scala 2.12."
Range, as I see it, would ideally be a type of lightweight collection that will work with any type that offers an implementation - wouldn't it be nice if we could have a Range of Foo
s that works as you might expect it to, e.g. Foo(10) to Foo(20) by 2
, or even Earth to Neptune
assuming these are part of an ordered enum of planets. It would have a type parameter as all other collection types do, and the current concept would just apply to the special case of Range[Int]
. But then, this is somewhat at odds with how Range is described, as "The Range class represents integer values in range [start;end) with non-zero step value step." - so perhaps we'd need to invent a new class with a different name.
Range of Double also did not succeed. I don't think this conceptual issue due to collections, but just about the step
.
I agree that range of chars is natural, in the sense that toLower
is 'a' + c - 'A'
. But as mentioned, code points are ints.
I remember Odersky saying NumericRange
is an over-abstraction, but as noted, anything countable could have a range. Maybe Scala 3 enables syntax for GroundControl to MajorTom
etc. Someone recently asked somewhere for ..
syntax. Oh, here is the enum range: https://github.com/lampepfl/dotty-feature-requests/issues/145
(I'm repeating after @som-snytt here, but I think it bears repeating:)
Char
being a numeric type (a slightly odd one, but nonetheless solidly a numeric type) is a C and Java tradition carried on in Scala. It is arguably an unfortunate tradition, and it's definitely surprising when first encountered, but for a Range
of Char
s to be a NumericRange
is the product of decades of tradition.
references:
char
an "integral type", though unsigned: https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.2this leads to working-as-specified behaviors such as:
scala> 'a' + 1
val res1: Int = 98
scala> ('a' + 1).toChar
val res2: Char = b
scala> 'a' + 'b'
val res3: Int = 195
scala> 'a' * 2
val res4: Int = 194
scala> 'a' * 'b'
val res5: Int = 9506
and any argument for proposed changes to the behavior Range
s of Chars
s needs to be considered in this context
reproduction steps
using Scala 3,
problem
Seems like all other Ranges give you a collection back of the same size when you
reverse
them, but a Range of Chars returns an empty Range. I understand you can get around this by turning it into a Seq (with memory usage implications) or usingreverseIterator
, but the above behaviour was surprising and caused a bug in my code.Similarly it would be nice if you could make a reversed range with
('z' to 'a' by -1)
or something to that effect. (Interestingly, this won't compile with a negative number, saying it needs to be a Char, but does with a positive number... :shrug:)