crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.33k stars 1.61k forks source link

[RFC] Make some cases of Array#==(Tuple) and Tuple#==(Array) return true #529

Open jhass opened 9 years ago

jhass commented 9 years ago

This could allow for some nice and efficient comparisons, for example bytes[0..2] == {0x00, 0x02, 0x03} instead of bytes[0..2] == [0x00, 0x02, 0x03], which would save one array allocation.

JacobUb commented 9 years ago

I thought it would work that way already. It seems common sense. At the very least, I think Array#=== should work like that. At the same time I wonder if it could be possible, during compilation, to turn array literals into tuples for efficiency if they're treated like they could be latter in the code.

asterite commented 9 years ago

I think we can add these comparisons. Ideally comparing Array, StaticArray, Slice and Tuple between them should work: the comparison would be by length and contents. But I'd like to know waj's opinion, let's wait until he comes back :-)

PragTob commented 9 years ago

any update on this one? :) @waj

jhass commented 9 years ago

@waj please change the draft label to accepted or close the issue when you had time to form your opinion on this.

Nakilon commented 9 years ago

Guys is there a roadmap about those classes Slice, StaticArray, etc.? Such as this that helped in 2008 to understand Ruby classes inheritance, but maybe related more to the implementation to teach how can I use low level Crystal features for program performance.

sevk commented 8 years ago

+1

0.0 == 0     => true
[0] == {0}   => true
'1' == "1"   => true
straight-shoota commented 6 years ago

There hasn't been any feedback from @waj for two and a half years. So maybe this needs to go on without his opinion.

There has not been any argument why this shouldn't be implemented.

It makes sense to be able to compare collections of different types directly. And I'd add Deque to the list, it is a similar data structure.

HertzDevil commented 3 years ago

Overriding #== sounds like a bad idea. An Array should never be a Tuple. A Char should never be a String. Somewhere here was already a lengthy discussion of whether 0 and 0.0 should refer to the same entry in a Hash(Int32 | Float64, V). Ruby is able to circumvent this because it has Object#eql? and Object#== separately; unless we have the same facilities, we shouldn't define these comparisons with #==.

It seems to me we could define Indexable#same_elements?(other : Indexable) : Bool instead, or even extend it to Enumerable (see #1452 also):

[0, 2, 3].same_elements?({0, 2, 3}) # => true

The Indexable case would be equivalent to:

def same_elements?(other : Indexable)
  size == other.size && (0...size).all? do |i|
    unsafe_fetch(i) == other.unsafe_fetch(i)
  end
end
oprypin commented 3 years ago

Maybe even [0, 2, 3] === {0, 2, 3}??

asterite commented 3 years ago

So, we have === to compare Char and UInt8, I think having === to compare such types could be good, but I don't know.

HertzDevil commented 3 years ago

Not possible at this point, because Tuple already defines #===(Tuple) to compare elements using case equality.

j8r commented 3 years ago

An overload #===(Array) can be possible @HertzDevil .

HertzDevil commented 3 years ago

Actually:

module Indexable(T)
  def equals?(other : Indexable)
    equals?(other) { |x, y| x == y }
  end
end
straight-shoota commented 3 years ago

Java's List interface (which is roughly equivalent to Indexable) expects equals? and hashCode to be generically implemented by implementing classes (Javadoc). So I wouldn't be opposed to follow that example and define generic implementations for #hash and #<=> in Indexable and include Comparable(Indexable) (cf #10065).

straight-shoota commented 3 years ago

It turns out #hash is already defined on Indexable. Seems nobody mentioned that so far. So there's actually a discrepancy:

[0, 2, 3] == Deque{0, 2, 3}           # => false
[0, 2, 3].hash == Deque{0, 2, 3}.hash # => true

That shouldn't be.

I'm now convinced that implementing a generic Indexable#==(Indexable) is a good idea, so there is #10111

Btw. for Tuple both are false because Tuple overrides hash (discussed in #10065).

oprypin commented 3 years ago

I don't see how hash helps the argument or what exactly "shouldn't be" 😕

straight-shoota commented 3 years ago

hash is related to equality in the way that if two values are equal, they should produce the same hash. This consequence is already realized in Indexable#hash. So stepping to equality is nearby. With the related discussion on comparability (#10065) it receives an even more pronounced argument.

asterite commented 3 years ago

But hash equality doesn't imply equality.

I can't see how a deque is the same thing (equal) as an array. Or any other collection for that matter.

We can probably add a same_elements? method. But it's not equality.

straight-shoota commented 3 years ago

Yeah, it doesn't. But when we already use a generic hash implementation for Indexable, it's not hard to imagine that it could also define equality.

Let's not talk about equality but comparability: Int32, UInt8 and BigInt all represent the same concept - a number - expressed in different data types with different specific properties. But the values are nevertheless comparable. When I have an Array of numbers and a Deque of numbers we should be able to compare them, too. It's a similar story: Both model the same concept - a list of values as defined by Indexable - even though the concrete implemenation of the data type differs. Based on the common Indexable interface, generic comparability can be defined. This is the idea of #10065.

Comparability is expressed in the Comparable interface which also includes a #== method, thus all values that are comparable also define equality as an implication.

The equality operator == is probably at fault, because it is overloaded with different interpretations. It can mean "be exactly the same thing" including internal data type respresentation or it can mean "represent the same value". The definition of Comparable#== definitely follows the latter. So do other implementations of #== which define equality regardless of data type. Following from that, == in Crystal does not imply type equality.

Maybe we should have separate methods for both concepts. But I actually don't see much use cases for the stricter definition, which could be expressed as a == b && a.class == b.class. Where do you actually need that specifity? Defining comparability and equality for all indexables doesn't break a single spec in this repo. Sure, it essentially widens the definition range, but still nothing depends on equality being tied to the data type.

asterite commented 3 years ago

I think what you say makes sense. Plus it's probably harmless to allow such comparisons, and it can actually be useful.

oprypin commented 3 years ago

I have made up my mind, I am definitely against making Indexable Comparable(Indexable).

The most I'd allow is Comparable(self) with the possible addition of <=>(Indexable) as @HertzDevil suggests.


One of your arguments in the PR:

Regarding downsides to this, I don't think there are any.

If you keep moving in that direction, you can end up with /foo/ == "foo" because why not. This is a strongly typed language, don't forget that.


One particular situation is, imagine a type that happens to be indexable but also has other fields.

If you add that field to your class, and you want it to be equal only to other instances that have the same field, you're out of options, other indexables will forcibly compare equal to yours. You also have your hands tied in terms of the hash implementation.


In fact, it doesn't need to include other fields.

Check out this VideoFrame class. Should a video frame ever be equal to an array?

Should a Matrix ever be equal to a deque?

asterite commented 3 years ago

Good points.

I actually don't know what's the use case behind all of this.

straight-shoota commented 3 years ago

The most I'd allow is Comparable(self) with the possible addition of <=>(Indexable) as @HertzDevil suggests.

Comparable(self) could be an option, but it's very limited and highly restricts usability because indexables can have different union types. So IMO that's not a viable option.

<=>(Indexable) without Comparable(Indexable) makes no sense whatsoever. Either Indexable is comparable or not. If you can define a generic <=> method, you can also include Comparable which only enhances that method by adding useful helper methods.

If you keep moving in that direction, you can end up with /foo/ == "foo" because why not.

No that's not the direction I'm pointing. /foo/ == "foo" is like "1" == 1 or "a" == 'a' == 97 comparing different concepts. But {1, 2, 3} and [1, 2, 3] are similar concepts. They're both lists of things (=Indexable) and that shouldn't hinder to compare them when it's simplified just a matter of stack vs. heap allocation.

This is a strongly typed language, don't forget that.

== is already defined for any combination of types and just returns false by default. So this really isn't strict typing.

One particular situation is, imagine a type that happens to be indexable but also has other fields.

Yes, that would be a limitation. But is there any practical relevance to that? Or in other words: A compound type that is decisively more than a container of items probably shouldn't include Indexable anyways.

The examples you mentioned seem to fit perfectly fine with Indexable and its comparability. I don't see any reason why you shouldn't be able to compare them with other data types. As mentioned above Java's equivalent List type works like this and it doesn't seem to be a problem.