lunarmodules / luassert

Assertion library for Lua
MIT License
202 stars 76 forks source link

`same` does not deepcompare keys #193

Closed RiskoZoSlovenska closed 1 year ago

RiskoZoSlovenska commented 1 year ago

The following fails, but I would not expect it to and I haven't seen it listed as a limitation anywhere:

local assert = require("luassert")
assert.are.same({ [{}] = 1 }, { [{}] = 1 }) -- lua: rep.lua:2: Expected objects to be the same.

I'm not sure a feasible fix exists given the fact that there's no easy way to tell which key to compare with which, but it would be nice to at least have a note about this limitation in the README.

alerque commented 1 year ago

You're using a table as a key, and you create two new tables to do so. That causes Lua to have two different table IDs, so no those two things are not the same.

The only way they would be the same is if you used the same table as a key in both cases so it had the same ID:

local assert = require("luassert")
local t = {}
assert.are.same({ [t] = 1 }, { [t] = 1 })

If you want to deep compare two tables you can do that too, but using two different tables that have different ids will by definition function as a different key, so luassert is doing the right thing. Lua doesn't identify your key as "an empty table", it identifies it as "this table by id". Empty or not, the key ends up being the table reference.

RiskoZoSlovenska commented 1 year ago

I'm aware that different tables == different keys, but what I'm trying to say is that I would expect same to compare its two inputs fully semantically, including table keys. In other words, if { [1] = {} } and { [1] = {} } can be considered to be the "same" table, then one could argue that { [{}] = 1 } and { [{}] = 1 } should be considered "same" as well.

I know that any implementation which would try to compare tables solely semantically (perhaps by deterministically stringifying each table and then comparing?) would most likely be wholly impractical, but I still think that the current behaviour warrants a note about table keys.

alerque commented 1 year ago

In other words, if { [1] = {} } and { [1] = {} } can be considered to be the "same" table, then one could argue that { [{}] = 1 } and { [{}] = 1 } should be considered "same" as well.

This doesn't make any sense. You still seem to have in your head that the array key is functionally an empty table. It is not. Having an empty table as a value will functionally be the same as any other empty table, you could serialize it and deserialize it and have it be the same thing. That isn't true for using a table as a table key. When you use a table as a table key you're functionally setting the key to the hex identifier of the table. It is just syntactic sugar for this:

local dump = require("pl.pretty").dump

local a, b = {}, {}
local id_a, id_b = string.format("%p", a), string.format("%p", b)

-- These are *almost* the same, but for the actual tables Lua retains
-- the meta data that it is a "table with id %p", while my pseudo implementation
-- is just the hex string.
local t1 = { [a] = true, [b] = true }
local t2 = { [id_a] = true, [id_b] = true }

dump(t1)
dump(t2)

You'll get something like this:

$ lua foo.lua
{
  ["table: 0x556e72ac4ab0"] = true,
  ["table: 0x556e72ac4af0"] = true
}
{
  ["0x556e72ac4ab0"] = true,
  ["0x556e72ac4af0"] = true
}

Note that the table keys are different and serializing and deserializing them shows the actual substance of the table, the key itself, is fundamentally different.

I don't think assert should be expected to return something functionally different from what using your code in Lua is doing under the hood.

RiskoZoSlovenska commented 1 year ago

In other words, if { [1] = {} } and { [1] = {} } can be considered to be the "same" table, then one could argue that { [{}] = 1 } and { [{}] = 1 } should be considered "same" as well.

This doesn't make any sense.

I disagree. { [{}] = 1 } is a "table that has an empty table as a key and the number one as a value". Any other { [{}] = 1 } also has the same semantic identity, and the point I'm trying to make is that I'd expect same to compare tables by what they represent, not by what the hashes of their contents are.

{} is an empty table, no matter whether it's acting as a key or as a value - Lua sees it as a pointer address in both cases (which is why you need functions like deepcompare to compare tables in a meaningful way). By deep-comparing tables at all, assert already is doing something different than what Lua is doing under the hood. I see no reason other than "it's difficult to implement, and thus a known limitation" for assert to deep-compare values but not keys.

When I mentioned serialization as a possible (albeit impractical) implementation, I meant one where you serialize tables into a full representation of their content with no shortcuts: { [{}] = 1 } would simply serialize into "{[{}]=1}" and you'd be able to compare tables that way. But nevermind that; it's not really relevant to what I'm trying to say.

alerque commented 1 year ago

I meant one where you serialize tables into a full representation of their content with no shortcuts: { [{}] = 1 } would simply serialize into "{[{}]=1}" and you'd be able to compare tables that way.

Your "no shortcuts" scheme is itself a shortcut. How would you serialize this: { [{}] = 1, [{}] = 2 }? According to Lua this is a valid table with two unique keys. Your "full representation" would clobber one of the entries out of existence.

RiskoZoSlovenska commented 1 year ago

How would you serialize this: { [{}] = 1, [{}] = 2 }?

As "{[{}]=1,[{}]=2}". But that's beside the point.

alerque commented 1 year ago

As "{[{}]=1,[{}]=2}". But that's beside the point.

No it isn't beside the point, it is my point. According to your expectations for same() you would have an invalid table because two keys are the same. In other words per your schema you would have a table with only one key with a value that got clobbered. Yet here your would-be serializer is doing something different than your sameness check. You can't have it both ways—and as far as I can make out we're following more along the lines of what Lua actually does.

If you still feel otherwise feel free to write a PoC assert that does what you want and drop a PR along with some tests.

RiskoZoSlovenska commented 1 year ago

Here you go: https://github.com/lunarmodules/luassert/pull/194

The whole serialization thing is beside the point; I only mentioned it as one possible implementation. There's different ways to solve this — see the PR for another. Apologies if my previous messages have been misleading about that.

Yet here your would-be serializer is doing something different than your sameness check.

Just for the record, they are the same; the "sameness check" would just have been "serialize both values and then compare the serialized strings". I had thought this would be more performant than checking every combination of pairs (see the PR), but after seeing the code for the unique assertion, I realized that I probably shouldn't be worrying about performance too much.

Tieske commented 1 year ago

@RiskoZoSlovenska thx for the PR. I'm a bit torn here, I can see both points.

As for the PR, it should not be updating the existing assertion, if anything, it should add a new assertion. wdyt @alerque ?

RiskoZoSlovenska commented 1 year ago

Perhaps the behaviour could be behind a boolean flag param, similar to how it is with unique?