Open isomarcte opened 1 year ago
I would say it is by design: Range
defines boundaries but not specify Order
upon creation. Then every single method has to provide the Order
implicitly for itself. In particular, both the contains
methods take the implicit Order
parameter. Therefore, if we reverse Range
then the Order
used for its methods has also be reversed:
scala> val rng1 = Range(1, 10)
rng1: cats.collections.Range[Int] = Range(1,10)
scala> rng1.contains(5)
res0: Boolean = true
scala> val rng2 = rng1.reverse
rng2: cats.collections.Range[Int] = Range(10,1)
scala> rng2.contains(5) // won't work since it uses a regular Order for Int
res1: Boolean = false
scala> rng2.contains(5)(Order.reverse(Order[Int]))
res2: Boolean = true
That said, I personally find such an approach controversial quite a bit.
Range
defines boundaries but not specifyOrder
upon creation. Then every single method has to provide theOrder
implicitly for itself.
Yes, this is one of the things I think we should consider changing for 1.0. See my comment in https://github.com/typelevel/cats-collections/issues/277#issuecomment-1292455578.
Indeed, this design was rejected for the new HashMap and HashSet, on grounds of be a "Haskell cargo-cult". So at the very least, this library now has inconsistent designs.
this is one of the things I think we should consider changing for 1.0
If we could change that, then perhaps it would be a cool change. Would we consider making Range
to be not "all-inclusive" only? I mean, if we could create a Range
instance having one or both boundaries that would be exclusive, then it would make this type way more useful.
See also some discussion in https://github.com/typelevel/cats-collections/issues/315#issuecomment-770266231.
Just to clarify, are we saying that we intentionally want the usage of non-coherent instances to be normative? Like, we demand either Order[Int]
or Order.reverse(Order[Int])
on Range
creation as a design feature to change the ordering of the Range?
Using HashSet
as an example, is this an intended usage?
scala> val sillyButValidHash: Hash[Int] = new Hash[Int] { override def hash(a: Int): Int = 1; override def eqv(x: Int, y: Int): Boolean = Eq[Int].eqv(x, y)}
sillyButValidHash: cats.Hash[Int] = $anon$1@1f21f982
scala> HashSet(1) === HashSet(1)(sillyButValidHash)
res0: Boolean = true
scala> HashSet(1).add(2) === HashSet(1)(sillyButValidHash).add(2)
res1: Boolean = false
scala> HashSet(1).add(2).iterator.toList === HashSet(1)(sillyButValidHash).add(2).iterator.toList
res2: Boolean = true
I personally would never want this to be intended behavior, as if I have more than one instance of HashSet[A]
I completely lose the ability to reason about how they relate to each other, unless I've defined them in the same local scope.
I was under the impression that binding the instance at the method or at the class was mostly for convenience (and preventing damage to internal state of an already allocated structure) and not as a mechanism for using alternative instances. I would expect to use newtypes (ideally opaque types) to index alternative implementations.
The problem is if you take the instance at the call site then you can do this:
HashSet(1).add(2)(sillyButValidHash)
Which is also bad. Oscar just summarized this nicely in https://github.com/typelevel/cats-collections/issues/277#issuecomment-1315664123.
I still believe that in a language like scala where typeclasses are first class values and we don't have coherence, it doesn't make since to consume required typeclasses at callsites if using the wrong one invalidates the datastructure (which it does for hashmaps and trees with Hash and Order respectively).
I would expect to use newtypes (ideally opaque types) to index alternative implementations.
👍 yes, this is the only safe way.
:+1: You scared me for a minute.
So coming back to Range
. If we define ranges as intervals going in a single direction, then I think we can make the interface consistent and relatively easy to use. This would require the use of a newtype for reversing direction. I'm happy to POC that if there are no objections. It would also make the case for moving Inverse to cats-collections more compelling.
My intended strategy would probably be to introduce a new class Interval
and deprecate Range
.
Thoughts?
Sure, maybe Interval
is a better name for the concept anyway.
@isomarcte I think there are two strategies possible:
Range
being bi-directional but latching its Order
upon a new instance creation. Therefore, when we call .reverse
we can also reverse the Order
.
Range
becomes more complex and has to maintain more complex structure rather than just boundaries.Also as I mentioned above, I personally would find useful to have Range
that could have non-inclusive boundaries, e.g. [0..10)
which is similar to Scala's 0 until 10
.
All above is my personal perception of course.
Thanks for that distinction, at least for me I think that helped clarify things.
I think (1) is an "interval": it is a set of things, in some natural order. Whereas (2) is a "range" since you are not only specifying the set, but in which order to iterate it. I am not sure if that means we have to reverse the Order
instance: the direction is a property of the "range", and not of its element. I think :)
I am not sure if that means we have to reverse the Order instance: the direction is a property of the "range", and not of its element.
In theory yes, it should not be the property of Range. But the way it is currently implemented is that Order
is used to correlate the Range's boundaries with other items of the same type (including boundaries of another Range). Therefore, currently Order
magically becomes a property of the Range itself. And yeah, I agree it should not be so...
@satorg RE 1.CONS
Assuming the inclusion of Inverse
from the aforementioned PR, do you think something along these lines would work?
val a: Interval[Int] = Interval(1, 10)
val b: Interval[Inverse[Int]] = a.reverse
def reverse: Interval[Inverse[A]] =
Interval(Inverse(end), Inverse(start))
def contains(value: Interval[A]): Boolean = ???
def containsInverse(value: Interval[Inverse[A]]): Boolean =
contains(value.reverse)
// Just a rough sketch. I might be able to do something interesting so that
// (a: Interval[Inverse[A]]).reverse => Interval[A], and not
// Interval[Inverse[Inverse[A]]]
:t reverse
Interval[A] => Interval[Inverse[A]]
:t contains
Interval[A] => Interval[A] => Boolean
:t containsInverse
Interval[A] => Interval[Inverse[A]] => Boolean
And I'm completely fine making it exclusive on the upper bound. I think that would even be required, since if we require that start <= end
, then the only want to define the empty interval is to have the upper bound be exclusive.
I can also just make a POC PR as well, if that would be easier for discussion.
And I'm completely fine making it exclusive on the upper bound. I think that would even be required, since if we require that start <= end, then the only want to define the empty interval is to have the upper bound be exclusive.
Yes, I think it makes sense as an option... But on the other hand.... Imagine that we need to make Interval
for the entire Short
type range. In that case Interval(Short.MinVal, Short.MaxVal)
wouldn't work since the last value will be always excluded.
Another (but similar) example:
sealed trait MyNum
object MyNum {
object One extends MyNum
object Two extends MyNum
object Three extends MyNum
object Four extends MyNum
}
Apparently we can define some Order
for such an enum as well as Discrete
(or perhaps even BoundedEnumerable
from cats) for that. But if Interval
can only have an exclusive upper bound, then there's no way to define such an Interval
that would contain all the members of an enum like one above.
As a bottomline: there's a controverity: if Interval
only allows both lower and upper bounds to be inclusive, then there's no way to define an empty Interval
. Otherwise, if it always defined as [inclusive, exclusive), then we cannot describe a full range of some limited set of values.
This is why I think it should allow both options.
On the second thought... Perhaps, the idea is too drastic, but... is there really a necessity for having classes like Range
or Interval
in principle, taking into account that there are cool typeclasses like UnboundedEnumerable
, LowerBoundedEnumerable
, BoundedEnumerable
, etc already defined in cats?
@satorg I would say yes. An Interval
type, not only providing some utility itself, is fundamental to a class of data structures called Interval Trees, of which we have a variant in cats-collection already, e.g. Diet
.
Having a base type with the correct semantics to implement these is, I think, quite useful.
I am of course heavily biased here. I started down this path because a nice Interval
type and an Interval Tree which supports key value pairs would be amazing for my work in idna4s. :wink:
I basically agree. But what I am thinking about (and yeah, perhaps it is too crazy) is to make Diet
more generic and don't stick to a particular interval/range type. Therefore if we made implementation of Diet
based on typeclasses rather than a particular range implementation, then it should be possible to plug any range implementation into it, including scala.collection.immutable.Range
, something like
class Diet[A, R] {}
where A
– element type and R
– range type
I've noticed some inconsistencies with how several methods on
Range
and that useRange
are defined when dealing with ranges which go different directions.Range.contains
contains
forRange
is not symmetric aroundreverse
.Perhaps most troubling,
This is made more confusing when looking at how
Range
is used withDiet
.There are a few ways to remedy this that come to mind, though I think the most simple would be to change
Range
to be defined as always going in one direction. If aRange
is desired that is going in the inverse order, then an newtype (perhaps Inverse) could be used.