dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.12k stars 1.57k forks source link

const canonicalization depends on which constructor is used #30876

Open jmesserly opened 7 years ago

jmesserly commented 7 years ago

this is explained in a comment in the const_ function (operations.dart)

  // TODO(jmesserly): there's no guarantee in JS that names/symbols are
  // returned in the same order.
  //
  // We could probably get the same order if we're judicious about
  // initializing fields in a consistent order across all const constructors.
  // Alternatively we need a way to sort them to make consistent.
  //
  // Right now we use the (name,value) pairs in sequence, which prevents
  // an object with incorrect field values being returned, but won't
  // canonicalize correctly if key order is different.

this is the reason that compile_time_constant_d_test and compile_time_constant_e_test are failing

Probably the way to fix this is to switch to a hashing/structural equality as noted in this TODO:

  // TODO(leafp): This table gets quite large in apps.
  // Keeping the paths is probably expensive.  It would probably
  // be more space efficient to just use a direct hash table with
  // an appropriately defined structural equality function.
leafpetersen commented 7 years ago

I also wonder whether we might do better to try to canonicalize before construction where possible. That is, for const, I think you should be able to index into a hash table using a constructor class ID, constructor name, and arguments, to avoid constructing the class in the first place. Of course, in the general case you might have to allocate a list of arguments instead, but this still might be cheaper, and you might be able to void this in many/most cases.

jmesserly commented 7 years ago

That is, for const, I think you should be able to index into a hash table using a constructor class ID, constructor name, and arguments, to avoid constructing the class in the first place.

note, canonicalization is based on field values (as evaluated at compile time) not the particular constructor called or its argument values (that's what this bug is about). Other than that, I agree.

Assuming we have a well defined field order (at compile time) we could lookup by TypeRep (JS class constructor) and an array of field values. We could also precompute the hash code, given the compile time object. So we'd generate:

// ... in our SDK ...
dart.const = function(typeRep, hash, values) {
  let map = typeRep[dart.constMap];
  let entries = map.get(hash);
  if (entries != null) {
    // search to see if we match any existing entries at this hash code
    // note: can use === for value comparison
    // note: entries is an array of pairs, each pair is the const object and the values array
    ... search for existing entry ...
  }
  // otherwise, construct class and insert new entry
  let object = new typeRep.const()
  ... add to hash table ...
  return object;
}

// ... compiled code for user's library ...
class MyClass { /* ... */ }
// generate a special constructor that uses the canonical value ordering
(MyClass.const = function(values) { ... initialize fields ... }).prototype = MyClass.prototype;
MyClass[dart.constMap] = new Map();
// ... in another module ...

dart.const(MyClass, 0x12345678 /*hash code*/, ['hi' /*field1*/, 42 /*field2*/ ]) 

I'm not sure about this approach, though. We still allocate the array of values. And we have to store the value array forever, to implement equality.

Precomputing the hash seems really helpful, at least.

Here's a possible improvement:

class MyClass {
  /* ... */
  static [dartx._constEq](x, y) {
    return x.field1 === y.field1 && x.field2 === y.field2;
  }
}
// ...
dart.const(0x12345678, new MyClass.const('hi', 42));
// ... or we could try this to see if it's faster (probably not)
dart.const(0x12345678,  { __proto__: MyClass.prototype, field1: 'hi', field2 : 42 });

We still have an allocation, but it's already valid as our constant object, unlike the array of values. We can try the object literal too, in case it's faster.

I think the only way to avoid an allocation would be to have a specialized version of dart.const for each class, that knows the number of fields to expect:

class MyClass {
  /* ... */
  static const(hashCode, field1, field2) {
    var values = map.get(hashCode);
    if (values != null) {
      // iterate through values and compare fields
      // note: we don't need to store pairs anymore, we can just iterate `values`
      // and look for a match
      // x.field1 === field1 && x.field2 === field2
    }
    // allocate and store and return a new instance
  }
}
// called via
MyClass.const(0x12345678, 'hi', 42);

All of these have a code size hit compared to the current approach, but probably worth it.

Final note: rather than hash codes, we could compute a unique key of some sort, perhaps using strings. This would essentially be a compile time serialization of the constant value, thus giving us a canonical string key to use for lookups. I'm not sure how well JS handles very long string keys; the resulting performance may not be better or may have unintended side effects. But it's another possibility.