Order dependence for equality

bakkot commented 5 years ago

From the readme:

if they have the same values stored and inserted in the same order, they will be considered strictly equal

This is surprising to me. Contrast python dicts or Purescript records.

It's also just a weird path-dependence in values: I would not expect it to matter if I construct something by doing (const { a: 0 }) with .b = 1 or (const { b: 1 }) with .a = 0

rricard commented 5 years ago

I got some other negative feedback on the order. Basically it's there to be "mindful" of the engines so we have something that remotely looks like the hidden class implementation optimization. But I think that last consideration could be brought up later... I'm gonna make an another pass on the proposal text at some point soon I'll ping that issue when I do!

tolmasky commented 5 years ago

Another interesting consideration is that I expect object with .key = object.key to be a no-op. However, if key ordering matters, then wouldn't object with .key push key's "order" to last, and thus (object with .key = object.key) !== object in many occasions?

rricard commented 5 years ago

If order matters then yes, and I agree it'll generate a lot of confusion. I was trying to be mindful of the engine before the user, it'll get updated!

Jamesernator commented 5 years ago

I think it would be better if instead for const objects enumeration order was in alphabetic order (or at least a consistent ordering). e.g. Object.keys(const { x: 10, y: 20 }) would be the same as Object.keys(const { y: 20, x: 10 }).

ljharb commented 5 years ago

Sets, Maps, and objects (roughly) all follow insertion order for enumeration - that seems the precedent to follow.

Insertion order isn’t likely to matter when comparing two Sets in the Set method proposal - so I’d assume that’s the precedent to follow for comparison too.

Jamesernator commented 5 years ago

If we did have Object.keys returning different sequences for those two objects but === still holding do we think this is a bad thing?

There's a related (but inverse) example in the language already where NaN !== NaN despite the fact NaN will act identically in all cases.

For this case it'd be the opposite way, two objects could be === but not behave identically in all situations. In which case would we need Object.is(const { x: 10, y: 20 }, const { x: 20, y: 10 }) to return false?

Jamesernator commented 5 years ago

To clarify we currently have these rules that hold true:

=== implies you can use either the LHS or RHS and the behaviour will be identical
- But !== doesn't imply the reverse because of NaN
Object.is implies two values are the same in all circumstances including NaN and you can pick either and get the same behaviour in all cases

But making const { x: 10, y: 20 } === const { x: 20, y: 10 } would break the first implication. This might be okay but my feeling is we'd never want to break the Object.is implication as well though.

dead-claudia commented 5 years ago

Order dependence can be removed if you're willing to allow some extra complexity that ranges from O(n) best case, O(n log n) to O(n^2) worst case (depending on implementation), and average case being somewhere in the middle. It's worth noting that Clojure, which uses immutable hash maps for nearly everything, also uses an order-independent equality algorithm for its maps and sets and does not have any significant performance issues from doing that. (I don't think it even offers a native order-dependent equality algorithm for its hash map data type.) They do not guarantee order, but I'm not sure how much of an effect that has on their algorithm.

bakkot commented 5 years ago

@ljharb, you can't both have insertion order for iteration and make insertion order not matter for equality comparison. I'm with @Jamesernator - alphabetic order, possibly with nonnegative integer keys first, is the way to go.

(The set methods proposal isn't adding an "equals" method, so I'm not sure what you're referring to there, but in any case that would be a very different thing from two things being ===. The horrible edge case that is -0 aside, === implies that the left and right are the same value, which means they should not be distinguishable under any circumstances.)

dead-claudia commented 5 years ago

@bakkot You can't avoid being able to distinguish them as long as you can iterate them - even if the iteration order is undefined, you'll still be able to tell many === records apart by iteration order. And === doesn't always imply "same value":

NaN !== NaN, one of my biggest criticisms of IEEE-754.
An embedder might choose to expose a pointer to a JS object handle to JS code as a number, and if this handle happens to be a string, two pointers to a === string might still be !==. (It's a similar story with bigints.)

As for the stated semantics, it's intended to express non-coercing equality, not necessarily complete indistinguishability. Checking that is the goal for Object.is, not ===.

Edit: Filed #20 to ask about this.

bakkot commented 5 years ago

@isiahmeadows

You can't avoid being able to distinguish them as long as you can iterate them - even if the iteration order is undefined

Only if the iteration order is undefined, which it should not be. If it is defined purely in terms of which keys are present, then iterating would not allow you to distinguish them.

NaN !== NaN, one of my biggest criticisms of IEEE-754.

I'm familiar with equality semantics in JS, but "=== implies same value" does not mean "!== implies not same value".

An embedder might choose to expose a pointer to a JS object handle to JS code as a number

Technically this is not forbidden by the spec, but the spec should not be built on the assumption that engines are going to do so.

Checking that is the goal for Object.is, not ===.

The fact that === uses different semantics from Object.is is one of the worst parts of the language. We are not going to add any new values for which the two are not equivalent.

rricard commented 5 years ago

I think the way this is going to end up is the following:

Order will not matter for comparison
Enumeration is still in insert order

This might change or e reviewed later on as we get closer to actual implementation (if that proposal ever goes that far)

tolmasky commented 5 years ago

I feel that many of the listed goals for value types will not actually be upheld if that is the case. A lot of the stated goals I feel fundamentally rely on the idea that:

Given a pure function f, and value types A and B, if A === B, then f(A) === f(B)

However, if A and B can ===, but return different results for Object.keys() (another way of saying that their iteration order is different), then no guarantees can bee made about f anymore. As such, many algorithms that want to rely on being able to "early return" when data is cheaply compared to be equal, will not actually be able to rely on === to do so. As a trivial example, simply imagine a React view that creates a list from the keys and values of an object. If this iteration is not guaranteed, then you can't simply say "I don't need to re-render if old-model === new-model".

rricard commented 5 years ago

That's a good point: I removed all ordering considerations for now from the proposal and we can keep discussing in that issue

rricard commented 5 years ago

We decided to have a stable key ordering: @rickbutton is working on it now!

rickbutton commented 5 years ago

speaking of! #25

littledan commented 5 years ago

What is the sort order used for property keys? Cc @syg

ljharb commented 5 years ago

I believe that should be outlined here

littledan commented 5 years ago

@ljharb The explainer says that record fields are sorted, and it sounds like this is not done by the order they are present in the literal.

syg commented 5 years ago

I'm still worried that with a sorted order, === won't be easy to make cheap and fast with existing object representations in engines, and people in this space seem to really want cheap and fast deep equality between objects. The short-term worry is that the slowness of === hurts adoption.

I can envision how it can be made fast with different object representations, but the current hidden class / shape "linear history of property insertion" representation is pretty pervasive. Speculative optimizations are often checked for validity with a single shape pointer, for example.

It is attractive to have these const values things to like extensible record types from other languages, but that's a big departure from objects.

tolmasky commented 5 years ago

Without sort order we'll have a worse problem that prevents adoption: the comparison may be fast but it isn't useful (or at least, isn't any more useful than current objects). Per my example above, if you can't rely on f(a) === f(b) if a === b, then the utility of these objects decreases significantly. You can no longer rely on early return optimizations when a === b, so it becomes unclear what the benefit of these objects really is outside of "engine optimization" (as opposed to algorithm optimization that the users can employ). That is to say, maybe using immutable objects gets you "under the hood" speedups, but they can't really be safely reasoned about from the outside as being any safer than normal objects. In fact, a frozen mutable object is arguably "safer" in this regard than these immutable objects, because if frozen(a) === frozen(b), then (as far as I know) you can safely early exit, while you wouldn't with these immutable objects.

syg commented 5 years ago

I hear you. I also certainly don't advocate for a world where Object.keys order can differ for two === objects. Let's see if there's any engine pushback. If not, then great!

bakkot commented 5 years ago

I'm hopeful that engines would be able to implement this with just a tweak to how hidden class transitions work: the new hidden class can remain a function of the current hidden class and the new key, it would just be a different function.

=== would still be slower than it is for objects, because you have to check which values the keys have in addition to which keys are present, but I don't think the fact that keys are in sorted order should impose any additional slowness to === as long as you can make the hidden class transitions take order into account.

But this is definitely a question for implementors.

syg commented 5 years ago

@bakkot Agreed that the value checking isn't likely to matter here.

It also might just be a tweak that works out cleanly, that'd be great. To be concrete, it'd be nice if an initial implementation in current engines didn't incur a bunch of perf cliffs. For example, I'm worried of JIT code being invalidated due to "insertions" when creating new immutable objects, because their hidden class has a different transition than just "append".

tolmasky commented 5 years ago

My apologies and I do not mean to be argumentative, but I think our position vis a vis implementors should be that this is a deal breaker -- in other words, I think this feature isn't really worth it without sorted keys. My reasoning is that without sorted keys, the "best practices" around using immutable objects becomes incredibly sophisticated and subtle. In fact, I believe that "immutable objects" becomes one big expert-level gotcha. As a framework writer, I know that I have to essentially ignore their existence with user-supplied objects for the reasons listed above, but as a library user, even for something as simple as a memoizer, I have to hope that the library author is very aware of these issues. It's already the case that many memoizers fail with existing gotchas (for example, many erroneously believe they already have memoized results for "hasOwnProperty" or other keys that exist on Object's prototype), this would add yet another layer of submarine bugs.

The simplest proof-of-concept case of course is just memoize(Object.keys). Although this memoized function does not have a tremendous amount of utility, it does highlight the difficulty of implementing such a memoizer and the contradictory nature of the situation: if insertion order matters, then the memoizer should return the insertion-order listing of the keys -- but it will very likely fail to do so for indistinguishable "different" objects.

As I mentioned earlier, without sorted key I'd actually recommend frozen objects to users instead, where the expectations are more straight-forward -- it is not a "content addressable" object, it is merely a non-changeable object, where === works exactly the same as in all other contexts (that is to say, two "structurally identical", but different, objects are always !==). I don't mean this as a cop-out, but rather saying that we have an existing feature that is easier to understand (and more consistent in my opinion) for what this proposal would provide without sorted keys.

dead-claudia commented 5 years ago

I do have a few questions about sort order:

How should integer index and symbol properties fit in relative to strings? I feel integer index properties should come first, symbols should come last. (Symbol order can come into play with Object.getOwnProperties.)
How should integer index properties be sorted relative to each other? The obvious choice is just lowest to greatest.
How should strings be sorted relative to each other? The obvious choice is just lexicographic order.
How should symbols be sorted relative to each other? You could do insertion order, but const {[sym1]: "foo", [sym2]: "bar"} === const {[sym2]: "bar", [sym1]: "foo"} is something you would expect to evaluate to true, and so you would require either a total ordering of symbols for immutable objects or a global hash map tracking key order for each set of keys.

The last one would encounter a major slowdown, but about the best way I can think of for that, without creating substantial perf loss, is to keep an ordered hash map of symbol keys and GC references, and:

When a new symbol is not previously used, append it to the set.
When a symbol is removed from the last immutable object or if all remaining immutable object references are collected, remove it from the set.
The key index of a symbol represents the sort order.

This would have to be specified as an implementation-dependent hook, tied into weak references as it has similar security and observability concerns. One could check, using a local const testSym = Symbol(), that if a given key sym is used in any immutable objects by checking the output of Object.keys(const {[testSym]: "foo", [sym]: "bar"}). It returns [sym, testSym] if such an object exists with a reference to that, [testSym, sym] if not. However, without the GC hook, it amounts to effectively a required memory leak, something that itself IIRC was a past issue with tagged template literals, specifically with the template arrays themselves, and it would be especially annoying on long-running server-side processes.

This overhead is also going to be an issue on low-resource IoT environments (e.g. Moddable, Tessel) and similar environments with stringent resource restrictions.

tolmasky commented 5 years ago

Regarding Symbol ordering, perhaps this went without saying, but I think you can avoid the complex cases a large part of the time by stating that you first sort on Symbol.description, and only if those are equal, do you then fall back to the fancy techniques described above. This means that all the common cases of merely wanting to include fairly normal Symbols (which often already go through a great deal of trouble to self-uniqueify for the purposes of backwards compatibility in transpiled cases) should be no less performant than string keys. I believe it would be safe to make this a spec requirement (that is to say, maybe we leave the case of Symbol("x") !== Symbol("x") implementation dependent to find an ordering, but Symbol("a") always < Symbol("b")).

From an implementor's perspective, it might be possible to create further simplifying cases. For example, if source/position of a Symbol were kept at creation time, this would also significantly reduce the cases of otherwise identical Symbols. Symbol("a") [from a.js:100] < Symbol("a") [from b.js:4]. You could of course imagine an even more complicated stack-based approach to handle "symbol factories", etc. leading to only truly pathological cases like [1,2,3].map(()=>Symbol()) requiring distinguishing (even then, you could maybe slap a timestamp on them?).

That being said, I think it might be alright to keep per-object (partial) insertion order for Symbols with identical descriptions in objects. So:

sym1 < sym2 in object if sym1.description < sym2.description or insertion_order(sym1, object) < insertion_order(sym2, object)

That is to say, I think it might not be unreasonable to say that:

const a = Symbol("a");
const a2 = Symbol("a");
const b = Symbol("b");

#{ a: 1, b: 2 } === { b:2, a: 1 }
#{ a: 1, a2: 2, b: 2 } === { b:2, a: 1, a2: 2 }

// BUT:

#{ a: 1, a2: 1, b: 2 } !== { a2: 1, a: 1, b:2 }

This certainly does not break any promises of content-addressable objects: for every case that a === b, f(a) === f(b). In the third case above, a !== b, and f(a) !== f(b) (case in point when f === Object.getOwnPropertySymbols). (For the record, even if f(a) === f(b), this would still not break any promises as it is not an iff). To further clarify, we could have made per-object insertion order matter across the board for ALL keys as a way of trivially maintaining the if a===b then f(a) === f(b) rule, but that would reduce the utility of immutable objects, hence wanting to employ sort in as many cases as possible. Here we are simply stating that in the very rare case of objects with Symbols that have identical descriptions, but are inserted in opposite order, they represent distinct objects.

dead-claudia commented 5 years ago

@tolmasky Didn't think about sorting lexicographically by .description. That certainly gives a nice ~~total~~ Edit: partial order without the overhead, but yeah, symbols with identical names are a problem.

dead-claudia commented 5 years ago

I feel non-array immutable objects should be compared via unordered equality, basically this:

// Basic spec
const _hole = {}
const _get = (o, k) => {}.hasOwnProperty.call(o, k) ? o[k] : _hole
function constObjectEqual(a, b) {
    if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b)) return false
    const aKeys = Reflect.ownKeys(a)
    const bKeys = Reflect.ownKeys(b)
    if (aKeys.length !== bKeys.length) return false
    const aSet = new Set(aKeys)
    for (const key of bKeys) {
        if (!aSet.has(key)) return false
        const aDesc = Reflect.getOwnPropertyDescriptor(a, key)
        const bDesc = Reflect.getOwnPropertyDescriptor(b, key)
        if (aDesc == null) return false
        if (bDesc == null) throw new TypeError()
        if (
            _get(aDesc, "get") !== _get(bDesc, "get") ||
            _get(aDesc, "set") !== _get(bDesc, "set") ||
            _get(aDesc, "value") !== _get(bDesc, "value") ||
        ) return false
    }
    return true
}

// Baseline optimized: assumes in-order first, but equivalent to above
const _hole = {}
const _get = (o, k) => {}.hasOwnProperty.call(o, k) ? o[k] : _hole
function constObjectEqual(a, b) {
    if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b)) return false
    const aKeys = Reflect.ownKeys(a)
    const bKeys = Reflect.ownKeys(b)
    if (aKeys.length !== bKeys.length) return false
    for (let i = 0; i < aKeys.length; i++) {
        if (aKeys[i] === bKeys[i]) {
            const aDesc = Reflect.getOwnPropertyDescriptor(a, aKeys[i])
            const bDesc = Reflect.getOwnPropertyDescriptor(b, aKeys[i])
            if (aDesc == null || bDesc == null) throw new TypeError()
            if (
                _get(aDesc, "get") !== _get(bDesc, "get") ||
                _get(aDesc, "set") !== _get(bDesc, "set") ||
                _get(aDesc, "value") !== _get(bDesc, "value") ||
            ) return false
        } else {
            const aKeySet = new Set()
            let j = i
            do {
                aKeySet.add(aKeys[j])
            } while (++j !== aKeys.length)
            do {
                if (!aKeySet.has(bKeys[i])) return false
                const aDesc = Reflect.getOwnPropertyDescriptor(a, bKeys[i])
                const bDesc = Reflect.getOwnPropertyDescriptor(b, bKeys[i])
                if (aDesc == null || bDesc == null) throw new TypeError()
                if (
                    _get(aDesc, "get") !== _get(bDesc, "get") ||
                    _get(aDesc, "set") !== _get(bDesc, "set") ||
                    _get(aDesc, "value") !== _get(bDesc, "value") ||
                ) return false
            } while (++i !== bProps.length)
            break
        }
    }
    return true
}

It looks complicated and slow for the above algorithm, but it's easily optimized for most common cases:

If both operands share the same type map, you can just compare the values and skip the map altogether.
You can start iteration assuming keys are in the same order and check for equality, and bail out to the hash map-based slow path only if the keys themselves differ. This is implemented in my baseline-optimized version above.
If neither object is a proxy, you could match the tail in order after the head fails, too, and it'd still just work. This also saves some time and lets you avoid an allocation with minimal runtime cost.

littledan commented 5 years ago

Seems like we're approaching agreement here that we'd like a total ordering on property keys, in order to enable records to have "sorted" keys, which lets === return true even if the insertion order is different, while maintaining expected lack of observable differences between values which are === (apart from a fixed set of legacy floating point exceptions). I like the framework proposed by @isiahmeadows that we start by imitating ordinary objects, with integer indexed property keys, then strings, then symbols. Strings can be sorted in code point lexicographical order, and the hard part is symbols.

A simple mechanism to create a total order on symbols (within an agent) would be to have a per-agent incrementing counter, starting at 0, with each symbol getting a unique counter value. Symbols are compared according to the counter. This would be cheap as well as simple. Agents are single-threaded, so there is no ambiguity about how the counter would work.

Would such a mechanism constitute some kind of information leak? I don't quite see what information it'd be leaking; it seems fine to me. If people do have concerns, do they have any ideas for alternatives? cc @erights

dead-claudia commented 5 years ago

@littledan

A simple mechanism to create a total order on symbols (within an agent) would be to have a per-agent incrementing counter, starting at 0, with each symbol getting a unique counter value. Symbols are compared according to the counter. This would be cheap as well as simple. Agents are single-threaded, so there is no ambiguity about how the counter would work.

An atomic counter would still be reasonably quick. In Intel, getting such an ID is two instructions, mov reg, 1 + lock xadd [addr], reg, and in ARM, it's three: ldrex + add rN, rN, 1 + strex. And although those do decode to more micro-ops, the cost of those are dwarfed by all the other work you'd need to do anyways.

It might be necessary to require an atomic counter if symbols are ever exposed outside the agent that created them. This could potentially be required if symbol keys are retained in const objects sent over postMessage or similar.

Separately, if this helps inform anything:

Clojure's iteration order is insertion order (I believe) for up to 8 elements, undefined for greater.
Immutable.js's Map iteration order is undefined, but stable over referential equality.
Most current implementations, including both above, use persistent hash array mapped tries, but there's still active research out there attempting to improve upon that. HAMTs have an iteration order based on internal object hashes, and for obvious reasons, this shouldn't be exposed to JS.

rickbutton commented 5 years ago

@isiahmeadows Symbols cannot be communicated via postMessage because the structured clone algorithm does not allow Symbols.

dead-claudia commented 5 years ago

@rickbutton I'm aware of this. I'm saying hypothetically if this restriction were to be lifted for whatever reason.

rickbutton commented 5 years ago

ah, yes! understood @isiahmeadows

bakkot commented 5 years ago

A simple mechanism to create a total order on symbols (within an agent) would be to have a per-agent incrementing counter, starting at 0, with each symbol getting a unique counter value. Symbols are compared according to the counter.

I think we should first emit all signals which have descriptions according to code point lexicographical order of the description, breaking ties using this counter. Using descriptions first also means we don't have to specify a creation order for all of the built-in symbols, as long as we ensure they all have unique descriptions.

Also, I think we would need to specify that Symbols created with Symbol.for must come before other Symbols (or, if we sort by description first, other Symbols with the same description). Otherwise Symbol.for could be used as a side channel: I could determine if anyone has ever called Symbol.for('a') by observing the iteration order of #{ [Symbol('a')]: 0, [Symbol.for('a')]: 0 }.

A global counter does introduce new mutable global state, so I'm not sure if it would fly with @erights. I don't think you can build a side channel out of it in the traditional sense (given the above tweaks), but you can build a kind of sneaky communications channel: Alice makes two symbols and passes them to Bob, and Bob can derive one bit of information by discovering (using this mechanism) which was created first. That's... probably OK?

dead-claudia commented 5 years ago

@bakkot

I think we should first emit all signals which have descriptions according to code point lexicographical order of the description, breaking ties using this counter. Using descriptions first also means we don't have to specify a creation order for all of the built-in symbols, as long as we ensure they all have unique descriptions.

That could work for Symbol.for, but how would it work for this?

const symA = Symbol()
const symB = Symbol()

I see stuff like that a lot in the wild, symbols without descriptions.

Also, what about something like this?

// Suppose a class like this, but doing something a lot more involved.
class Mirrored {
    constructor(file) { this._file = file; this._data = Symbol("data") }
    async init(v) { return #{...v, [this._data]: /* get data from file somehow */} }
    get(v) { return v[this._data] }
}

const SomeKey = new Mirrored("/path/to/file")
const AnotherKey = new Mirrored("/path/to/another/file")

// Later, initialize
const object = await AnotherKey.init(await SomeKey.init({
    // whatever data happens to be relevant
}))

// And later, iterate for some reason
for (const [key, value] of Object.entries(object)) {
    // do stuff...
}

Edit: Prematurely sent

bakkot commented 5 years ago

@isiahmeadows, as I said, you would break ties using the counter.

dead-claudia commented 5 years ago

@bakkot Sorry, missed that bit.

bakkot commented 5 years ago

An alternative solution to the Symbol problem is to require that iteration order of undistinguishable Symbols be randomized every time it is queried.

tolmasky commented 5 years ago

Guaranteed randomization would once again break one of the more useful properties of content-addressable objects that f(a) === f(b) if a === b. Essentially, you'd have a strange immutable object that represents a truly random variable. To implement some sort of bizarro coin toss algorithm:

const heads = Symbol();
const tails = Symbol();

const coin = #{ [heads]: heads, [tails]: tails };

function toss(coin) { 
   return Object.getOwnPropertySymbols(coin)[0] === Object.getOwnPropertySymbols(coin)[0] ? heads : tails;
}

tolmasky commented 5 years ago

Note that the above would return a random answer only with immutable objects, but strangely returns a consistent answer with normal javascript objects (which would base getOwnPropertySymbols on insertion order).

dead-claudia commented 5 years ago

Is there a reason insertion order can't be used like it is for other objects? One would think that would be efficient enough, plus it doesn't require a whole new line of engine optimizations to account for them. Plus, it's worth noting that modern strongly-typed functional languages don't use HAMTs behind the scenes either for record types - they just use C-style structs with a possible pointer to type info if necessary.

Edit: I mean iterating in insertion order here, not just with ===.

Edit 2: Just ignore this bit. It's off-topic.

tolmasky commented 5 years ago

Perhaps I misunderstand the question, but the original reason for this bug is that insertion order can't be used for === (or iteration) because it would cause seemingly identical objects to have different properties, or alternatively force objects to not be === depending on insertion order.

Unless we mean "global insertion order" and this is just referencing "the counter" again?

If the question is just "insertion order" though, then we're right back to the problem of:

(const { a: 0 }) with .b = 1 !== (const { b: 1 }) with .a = 0

Or, if they are ===, then breaking f(a) === f(b) when a === b (since iterating keys of a and b would be different).

dead-claudia commented 5 years ago

@littledan

I like the framework proposed by @isiahmeadows that we start by imitating ordinary objects, with integer indexed property keys, then strings, then symbols. Strings can be sorted in code point lexicographical order, and the hard part is symbols.

That's unobservable. You can't do new Proxy(constObject, ...) because const value types are to my understanding not objects in the way ordinary objects and function objects are.

This issue was initially asking if #{a: 1, b: 2} === #{b: 2, a: 1} should evaluate to true or false.

true means order doesn't matter, which is what I want.
false means order does matter, which is what gave rise to the question of "what's the order?".

I feel this distinction got lost in the discussion, and I'm hoping I could drive this back on topic.

tolmasky commented 5 years ago

So, I think we are agreed that #{a: 1, b: 2} === #{b: 2, a: 1} should evaluate to true, but I guess (at least to me), that means that order does matter in the sense that we need them both to return the same ordering of keys and symbols in order for the === to be meaningful. If they === but behave differently then it's not a full resolution to the problem.

dead-claudia commented 5 years ago

@tolmasky

[...] but I guess (at least to me), that means that order does matter in the sense that we need them both to return the same ordering of keys and symbols in order for the === to be meaningful.

This is incorrect. Key order is unobservable over equality if #{a: 1, b: 2} === #{b: 2, a: 1} evaluates to true, as I pointed out in my last comment. If it weren't for the fact you could iterate the properties of const objects, the key order wouldn't even need spec'd and would just be an implementation detail.

Keep in mind, if key order is not important, only the collection of properties, it would be a perfectly reasonable implementation to use a (HAMT, List<HAMTNode>) pair to represent const objects. The first, a hash array mapped trie, holds all the values, and the second, a list of child HAMT nodes in order of insertion, holds the set of properties. As long as the HAMT and property key list remain aligned, equality is just HAMT equality, a reasonably fast thing to check. Iteration will require iterating the HAMT node list to iterate the properties correctly because the HAMT is ordered by hash value, but that's the only thing that exists for.

bakkot commented 5 years ago

@isiahmeadows The representation is not really the interesting question at this point. The questions of interest are

Is #{a: 1, b: 2} === #{b: 2, a: 1}?
Is the array given by Object.keys(#{a: 1, b: 2}) in the same order as the array given by Object.keys(#{b: 2, a: 1})?
If so, what order is that? (There's an obvious order for string keys, but not for symbols with the same .description.)

I and several other people in this thread have expressed the opinion that the answer to 1 should be "yes", and that if the answer to 1 is "yes" then the answer to 2 must be as well. That leaves 3 as the remaining question with no obvious answer. Are you intending to disagree with those answers to (or dispute the premise of) 1 or 2? Are you proposing a solution for 3? I can't quite tell.

tolmasky commented 5 years ago

But you can iterate over the properties and thus we need to define an order so as to not have a situation where calling Object.keys() on two “theoretically” equal (===) values returns two differently sorted lists, thus creating a side channel for determining that they are in fact not equal.

Unless what you mean is that we should not allow iterating over the keys/symbols.

dead-claudia commented 5 years ago

@bakkot Oh okay. I'm of the opinion 2 should be "no", just to keep it consistent with objects, and if necessary, Object.is(#{a: 1, b: 2}, #{b: 2, a: 1}) can also return false. Already, === doesn't imply both sides are observably identical and !== imply observably different (consider: +0 vs -0, NaNs).

erights commented 5 years ago

For registered symbols, if they are allowed at all, they must only be sorted by their string name. Any proposal that allows unregistered symbols must sort these separately from registered symbols.

If you rank unregistered symbols by incrementing a count, it would certainly be unsafe if the count itself were exposed. Between two subgraphs that should not be able to communicate, they would still be able to via this globally shared counter.

I see two possibilities:

Certainly safe: Why do value types need to include unregistered-symbol-named properties? What if we just don't include those?
Likely unsafe: What if did indeed rank all unregistered symbols created within an agent, by order of creation, so they had a sort order based on their rank. Note that sorting by rank exposes only relative rank. Between two unregistered symbols that Alice has, she has no power to observe the number of unregistered symbols created that she does not have.

The info leak exposed by the second bullet above is at least hard to think about. It is certainly safer than exposing a count. OTOH, it is also introducing global mutable state that is way more observable than anything currently in the language. Before investing any more time in it, why not do the first bullet above?

I propose that values do not allow properties named by unregistered symbols. Those named by registered symbols are sorted separately, and only by the string name of those registered symbols.

tc39 / proposal-record-tuple

Order dependence for equality #15