hylang / hy

A dialect of Lisp that's embedded in Python
http://hylang.org
Other
5.14k stars 372 forks source link

Hy models have wrong equality #1363

Closed gilch closed 3 years ago

gilch commented 7 years ago

Hy models are equal to other types.

For atoms, maybe this is okay?

=> (= 1 '1)
from hy import HyInteger
(1 == HyInteger(1))
True

It might even be convenient for symbols to be equal to strings, because Python uses strings pervasively when referring to its identifiers at runtime (e.g. getattr), though symbol mangling might confuse things in some cases.

But for compound types, it gets weird.

=> (= [1 2 3] '[1 2 3] '#{1 2 3} '{1 2 3} '(1 2 3)) ; ORLY?
from hy import HyDict, HyExpression, HyInteger, HyList, HySet
([1, 2, 3] == HyList(((([] + [HyInteger(1)]) + [HyInteger(2)]) + [HyInteger(3)])) == HySet(((([] + [HyInteger(1)]) + [HyInteger(2)]) + [HyInteger(3)])) == HyDict(((([] + [HyInteger(1)]) + [HyInteger(2)]) + [HyInteger(3)])) == HyExpression(((([] + [HyInteger(1)]) + [HyInteger(2)]) + [HyInteger(3)])))
True
Kodiologist commented 6 years ago

The fact that (= 'foo "foo") has created a lot of strange bugs.

vodik commented 6 years ago

I'm honestly not a fan of how (= 'foo "foo"). Sure getattr is a neat trick but its also technically represents a slight deviation away from the Python object model (though if you're stuffing arbitrary data in there, I think you got bigger issues to worry about :wink:)

I've been playing with a branch that makes (= 'foo "foo") false and similar strictness on HyKeywords. I'll put it up once I'm done fixing tests.

allison-casey commented 3 years ago

Having have started to implement this change, I agree with vodik. There's a lot of obtuse implicit conversions going on in the code base which this rightfully gets rid of. There is one issue though when it comes to quoting. Currently `(a ~(+ 2 3)) compiles to ((hy.models.Symbol "a") 5) not ((hy.models.Symbol "a") (hy.models.Integer 5)) which makes (!= `(a ~(+ 2 3)) '(a 5)) even though the values look identical from the outside. this only applies to atoms which get compiled down to their literal values during unquoting. So maybe the solution is just to not do that?

Kodiologist commented 3 years ago

That a macro can return an int or str instead of a model object, or that you can return such an object when unquoting, is a feature. I call it auto-promotion and I've worked on it in e.g. 3204a9e8a3e877a87d90c5f7a7bcca5871cc0be7. I want to keep it. I'm okay with (!= `(a ~(+ 2 3)) '(a 5)). We may want to provide a user interface to promotion so that the programmer can canonicalize model trees before testing for equality.

allison-casey commented 3 years ago

what would you suggest a user do to test tree equality if we're going to keep auto promotion?

Kodiologist commented 3 years ago

Use the interface I just mentioned; if there was some function hy.model, for example, you would write (= (hy.model `(a ~(+ 2 3)) '(a 5)).

allison-casey commented 3 years ago

so hy.model would recursively promote literals to their respective models then? Just want to make sure i'm understanding this right

Kodiologist commented 3 years ago

Yes. More precisely, it would recursively promote objects to their respective models.

allison-casey commented 3 years ago

handily enough, compiler.wrap_value already does this thanks to models.recwrap. Do we want to keep the wrap_value name or expose it as something else like model?

Kodiologist commented 3 years ago

Maybe a better name could be thought of, but I like model.

allison-casey commented 3 years ago

maybe recursive-quote or recquote? since that seems to be more accurate to what its doing.

scauligi commented 3 years ago

Hold on, on latest master I have

=> (= `(a ~(+ 2 3)) '(a 5))
hy.models.Expression([] + [hy.models.Symbol('a')] + [2 + 3]) == hy.models.Expression([] + [hy.models.Symbol('a')] + [hy.models.Integer(5)])
True
=> (= 5 '5)
5 == hy.models.Integer(5)
True

Is there some other edge case that makes this a problem?

Kodiologist commented 3 years ago

That's the current behavior of the code, yeah. We haven't fixed it yet; we're just talking about fixing it.

allison-casey commented 3 years ago

what do we actually get by hy autoboxing/unboxing values in this way? Gonna have to write docs for the behavior at some point

Kodiologist commented 3 years ago
  1. Position information for error messages (start_line, end_line, start_column, end_column).
  2. Managing differences between syntax and the object represented by the syntax (e.g., a set literal can have the same element more than once).
allison-casey commented 3 years ago

any thoughts on names? The pr to close this is done otherwise. maybe canonicalize?

Kodiologist commented 3 years ago

My vote's for model. I would advise against something with "quote" in the name because it doesn't prevent evaluation, which is what quoting is about in Lisp.

Kodiologist commented 3 years ago

Note also the pleasing symmetry between hy.model and hy.models.Whatever.

allison-casey commented 3 years ago

my only issue with hy.model is that its reads first as a noun and not a verb first unlike recquote or canonicalize. And as such, doesn't telegraph that its doing a value transformation as well. If it were something like modelize or modelify` or something i'd go with it

Kodiologist commented 3 years ago

Those sound fine. Or as-model.

allison-casey commented 3 years ago

as-model is good. i'll do the rename and open the pr today

allison-casey commented 3 years ago

looking over this again, is the only complaint of this equality the fact that symbols are equal to strings and compound types are equal to other compund types? If that's the case why not just make symbols only equal to symbols, leaving the other literals with the same equality, and compound types only equal to themselves?

Gilch's compound example (= [1 2 3] '[1 2 3]) ; => False so it seems like the only issue is that this is still True (= '{1 2 3} '[1 2 3]) ; => True

Kodiologist commented 3 years ago

It seems more consistent to me to make models only equal to other models of the same kind for all types, not just keywords, symbols, and Sequences.

Gilch's compound example (= [1 2 3] '[1 2 3]) ; => False

That's only because Sequence changed from being list-backed to tuple-backed. Now, (= (, 1 2 3) '[1 2 3]).

allison-casey commented 3 years ago

i should have been more specific. is there any reason that having equality hypothetically work like this

(and (= 1 '1)
     (= 1j '1j)
     (= 1.5 '1.5)
     (= b"1" 'b"1")
     (= :key ':key)
     (= [1 2] '[1 2])
     (= {1 2} '{1 2})
     (= #{1 2} '#{1 2})

     (!= 'asym "asym")
     (!= (, 1 2) '[1 2])
     (!= '[1 2] '{1 2})
     (!= '{1 2} '#{1 2}))

is a bad thing? That models are only equal to models of the same kind or their equivalent python literal (if it exists, Expression and Symbol have no equivalent literal) and nothing else. i.e. hy.models.List is only equal to other hy.models.List's or list's etc

Kodiologist commented 3 years ago

One problem is that '{1 2 1 3} definitely shouldn't equal '{1 3}, but you're proposing that they'd both be equal to {1 3}, so then we break transitivity.

allison-casey commented 3 years ago

how would they both be equal? the elements are different

Kodiologist commented 3 years ago

Because '{1 2 1 3} would evaluate to {1 3}. I assume that by "equivalent python literal", you mean the object the model represents, i.e. evaluates to, right?

allison-casey commented 3 years ago

ahhh got it. Could we have Dict model equality evaluating against the elements in the unevaluated model not the evaluated dict. so dict equality would work on the sequence of items and not dict equality (= (list (chain #* (.items '{1 2 1 3}))) (list (chain #*(.items {1 3})))) ; => False

Kodiologist commented 3 years ago

Then (= {1 3} '{1 3}) but (!= {1 3} '{1 2 1 3}) even though the object represented is the same, which is confusing, not to mention the complications arising from Python dictionaries preserving order but the order not being used in equality testing.

Another source of trouble is that (= 1 1.0) but we don't want (= '1 '1.0), which would again lead to nontransitivity.

Syntax is not the same thing as the object represented by the syntax. A programmer who wants to test syntax according to what it represents can always hy.eval before testing.

allison-casey commented 3 years ago

dictionaries do kind of throw this whole thing out of wack and i don't see a way to handle that properly so i'll go ahead and finalize the PR to make them only equal to models of the same kind