dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.08k stars 1.56k forks source link

JS Interop types have `.hashCode == 0` #55823

Open huanghui1998hhh opened 3 months ago

huanghui1998hhh commented 3 months ago

When call this function here, it throw this error(only on WASM). I'm trying to build a minimal reproduction, but i failed. At first i guess this problem is caused by extension on extension type, but seem not. My demo run well, so i can only provide the permalink.

mit-mit commented 3 months ago

@mkustermann does this look like a compiler or interop problem to you?

osa1 commented 3 months ago

I'm trying to build a minimal reproduction, but i failed

@huanghui1998hhh If you share the repro you have (code + build instructions) we may be able to minimize it.

mkustermann commented 3 months ago

@huanghui1998hhh Could you build your app with -O0 (e.g. flutter build web --wasm -O0) and report back if you get a different error?

Otherwise I think we'll need to get a reproduction (be it minimal or not) under an open source license, so we can reproduce and debug it.

(Side note: Looking at the code, it seems to have final xhrs = <web.XMLHttpRequest>{};, i.e. a set of JSObject - which on wasm will be a js.JSValue. That will forward equality to JS but the hash code is unconditionally int get hashCode => 0 - which would be terrible for large maps/sets. @srujzs @sigmundch Should we consider disallowing hashing those objects?)

huanghui1998hhh commented 3 months ago

Sorry, my bad. Build with -O1, i found the real reason is xhr.response as ByteBuffer, this will not throw error in dart2js. Maybe need to show TypeError explicitly on higher optimizations.

sigmundch commented 3 months ago

xhr.response as ByteBuffe

Thanks for sharing your example. Good news is that we will have something to help with these issues in the future. @srujzs created a lint that will warn about casts that are inconsistent between dart2js and dart2wasm. The lint will be available with Dart 3.5 (now available on the dev channel, but hopefully soon will be released in the beta channel)

Should we consider disallowing hashing those objects?

Good question. We accidentally do better in dart2js. We have the same behavior (hashcode == 0) for pure JS objects. However, anything in dart:html has a better hashcode on JavaScript backends. We use an expando with a private symbol key to store a hash on each instance. As a result, if dart:html is not tree-shaken or removed via --server-mode, JSObjects from package:web that correspond to those intercepted types happen to a good hashcode :open_mouth:. The reality is that wasm compatible apps will be using only package:web and dart:html will be tree-shaken, and eventually we do hope to remove dart:html. As a result, we should already consider that all of the package:web types have or will eventually have hashcode 0.

I see two options here:

We don't want to do (a) for all JS types, but only for some. For example, we need to exclude primitives and I'd like to also exclude JS object literals (where adding extra properties can be disruptive). On dart2js, we have the means to do some introspection (e.g. lookup an interceptor) to discern and take a specialized approach, but I don't believe that's practical in dart2wasm.

TL;DR - I sense we may need to settle with (b).

@mkustermann @srujzs - do you agree?

srujzs commented 3 months ago

The lint will be available with Dart 3.5 (now available on the dev channel, but hopefully soon will be released in the beta channel)

The lint name is invalid_runtime_check_with_js_interop_types whenever it's available. It should warn in this case.

We use an expando with a private symbol key to store a hash on each instance.

I might be looking at the wrong code, but from debugging, it looks like we just directly compute a hash and store it on the instance and not use an expando. DDC is the same, except for arbitrary JS objects, DDC still computes a hash and stores it whereas dart2js always returns 0.

(a) expand when we use the expando approach, so it doesn't depend on the presence of dart:html

Do you mean compute a unique hash code for every JS object in dart2js regardless of whether it's a type intercepted in dart:html? That would make hashing arbitrary JS objects in dart2js faster, but the performance issue still remains in dart2wasm. I suppose we could do the opposite and see that if dart:html is never imported, then always return 0 for hash codes so dart2js and dart2wasm are consistent at least.

One idea might be to compute a unique hash code for an arbitrary JS instance and cache it in the JSValue on the first call to hashCode. It'll be a single extra interop call, and would be consistent even if that JS object is then wrapped by a different JSValue.

Should we consider disallowing hashing those objects?

If we were to disallow it, it might not be discoverable until runtime in some cases e.g. upcasting a JSValue to Object and then caching it. With the way dart2js and DDC work today, this would also be a huge breaking change. Even if we changed it to be 0 in dart2js and DDC, I'm sure some google3 apps would notice the difference.

(b) keep it 0 and warn about this performance cliff (maybe via a lint)

We may not be able to lint in all cases, but it may be doable. At the very least, I should document this somewhere.

sigmundch commented 3 months ago

store it on the instance and not use an expando.

Sorry, I didn't mean a Dart expando, but a property in JavaScript. When the extra property is added to a native object that was not not originally part of the prototype is also called an expando too. Sorry for the confusion :confused:

Do you mean compute a unique hash code for every JS object in dart2js regardless of whether it's a type intercepted in dart:html?

Correct, modulo excluding primitives and object literals (since we don't want to add a property to those).

if dart:html is never imported, then always return 0

This happens today! For example, this program prints 0, but if you uncomment the dart:html import, it prints a non-0 hash:

import 'package:web/web.dart';
// import 'dart:html';

main() {
  print(HTMLDivElement().hashCode);
}

(I thought we were smarter and also removed dart:html if it wasn't used, but given our conservative assumptions about how any external API could return a native symbol, we don't as soon as we see an import to it)

One idea might be to compute a unique hash code for an arbitrary JS instance and cache it in the JSValue on the first call to hashCode. It'll be a single extra interop call, and would be consistent even if that JS object is then wrapped by a different JSValue.

Do you mean caching it on both the JSWrapper and the JS object?

At first I thought you meant only a property on the JS object (which makes 2 wrappers have a consistent value.). If we do only that, it may be too expensive: it would require that get hashCode on JSValue first decide whether it has a cache or not. Since we can't add a cache to JS primitives, we would then need to also do a typeof test before we fetch the hash value from JavaScript. If we also don't want to add properties to object literals, this becomes even more complicated.

However, if we have second cache in the JSValue wrapper, then I can see this working more efficiently.

Even if we changed it to be 0 in dart2js and DDC, I'm sure some google3 apps would notice the difference.

Yeah, I wouldn't change the hash of existing objects, that would be a bad regression

We may not be able to lint in all cases, but it may be doable. At the very least, I should document this somewhere.

:+1: At least we should be able to recognize when the type parameter of a Set or Map is an interop type and warn about it in those cases.

srujzs commented 3 months ago

When the extra property is added to a native object that was not not originally part of the prototype is also called an expando too.

Hah, I knew I was missing something.

given our conservative assumptions about how any external API could return a native symbol, we don't as soon as we see an import to it

Makes sense. We'd need to revisit this if we want similar hashing as dart2wasm for apps that migrate away from dart:html. Maybe we could do something package:web-specific, so that any of the members in that package are assumed to not need dart:html. This could also address some of our concerns around users accidentally dynamically calling dart:html members even when they have no imports to dart:html. There might be some pitfalls I'm missing, though.

However, if we have second cache in the JSValue wrapper, then I can see this working more efficiently.

Right, I was thinking store the hash code after the initial computation in a field in JSValue and use a JS WeakMap so that the same JS value can get the same hash code even if they're wrapped by two different JSValue objects.

mkustermann commented 3 months ago

@srujzs created a lint that will warn about casts that are inconsistent between dart2js and dart2wasm. ... The lint name is invalid_runtime_check_with_js_interop_types whenever it's available. It should warn in this case.

Is this an opt-in lint or always on?

IMHO this should be always on. I'd even say that if we know such an <jsObject> as <non-JSObject> cast would always fail, then it should be a compile-time error (possibly a dart2wasm specific compile-time error - if we do want to keep dart2js code working that uses it).

(a) expand when we use the expando approach, so it doesn't depend on the presence of dart:html (b) keep it 0 and warn about this performance cliff (maybe via a lint)

TL;DR - I sense we may need to settle with (b). @mkustermann @srujzs - do you agree?

Currently any Dart object can be hashed and put into maps/sets because we have a) hash of a primitive value b) randomly distributed identity hash c) user defined get hashCode (where the user ensures it's appropriately distributed). This ensures amortized O(1) access time in maps/sets.

IMHO it's not acceptable to silently break this behavior by making all JS objects get a 0 hash code. Then the data structures users expect to be O(1) suddenly become O(n) without their knowledge. In fact they become slower than using a simple list.

So I'd argue the options are:

a) Disallow hashing of JS objects (make get hashCode throw): This will make users use lists instead of maps/sets which will give them better performance/memory than always-0-hashCode and will make that linear behavior also explicit in code.

b) We lazily attach an extra hash-code property on a JS object when needed (basically our identity hash code)

c) We use an identity hash code on the Dart wrapper of a JS object with the downside that the same JS object with two different wrappers will get different hashes (this will lead to difference in behaviors between dart2js & dart2wasm - so may not be a good idea).

Or do a) / b) / c) based on the type of the JS object.

Maybe adding someone from language team for opinions (/cc @lrhn @leafpetersen )

srujzs commented 3 months ago

Is this an opt-in lint or always on?

I'm trying to get it into the recommended set of lints: https://github.com/dart-lang/lints/issues/188. It's set up so that it's a lint on all platforms but ignores checks that involves interop types that can only be written in dart2js/DDC.

I think (b) is the way forward, although I'm not sure if it has to be a property on the JS object. Users only pay the cost of the interop call if they're hashing, and it's a one-time cost per object.

(a) would be too breaking in dart2js/DDC. We could do this only for dart2wasm, but then it's not discoverable and would be quite frustrating. I don't like (c) for the same reason I want to fix https://github.com/dart-lang/sdk/issues/55515.

lrhn commented 3 months ago

If we can't provide a non-aweful hash, I'd prefer to have hashCode throw. That's untraditional for Dart objects, but these are precisely not Dart objects, so I'd chalk it up to JS objects just being incompatible with hashing. Interop is a "best effort" approach to making JS values available to Dart code, with a semantics that is the best available combination of Dart and JS semantics.

If we can give some JS objects good hash codes (maybe JSNum, JSSTring) or are willing to use expandos on non-primitive values, we can also do that. Just returning 0 is the worst kind of failure mode: It looks like it works, but in reality your program will never complete at production scales.

mkustermann commented 3 months ago

/cc @rakudrama For how JS performance may be impacted if we lazily attach id hashes to JS objects.

sigmundch commented 3 months ago

My preference is to do something that is backwards compatible and doesn't break existing code (even if we ignore the new dart:js_interop). In dart2js, I don't think we can provide a different behavior to new interop types without affecting behavior of package:js types and native DOM types. Because of that, I'd prefer we don't start throwing on hashCode.

I think it's feasible to do (b) and store the hash in JS. It would be good to verify this strategy doesn't cause issues on the JS side. Some libraries are sensitive to modifications to objects in a public way. We had issues like these in the past when we stored an expando for List RTIs in the past. Hopefully, using a symbol key for the expando could be enough to avoid problems.

mrtnetwork commented 3 weeks ago

I encountered an issue where the following code works fine when running with flutter run -d chrome:

@JS("localStorage")
extension type LocalStorage._(JSObject _) implements JSAny {
  external factory LocalStorage();
}

Map<String, String> getAll() {
  return Map<String, String>.from(localStorage.dartify() as Map);
}

However, after building the web project with flutter build web, I received this runtime error:

Attempt to execute code removed by Dart AOT compiler (TFA)

To resolve the issue, I used the following code, which works correctly both during runtime and after building:

  Map<String, String> getAll() {
    final toDart = Object.entries(localStorage).dartify() as List;
    return Map<String, String>.fromEntries(toDart
        .map((e) => List<String>.from(e))
        .map((e) => MapEntry(e.first, e.last)));
  }
mkustermann commented 3 weeks ago

@mrtnetwork Would you be willing to file a new issue with a small standalone reproduction (complete example & build command)? (We have repurposed this bug for a discussion about hash codes of interop types)

mrtnetwork commented 3 weeks ago

@mrtnetwork Would you be willing to file a new issue with a small standalone reproduction (complete example & build command)? (We have repurposed this bug for a discussion about hash codes of interop types)

Sorry, I used the wrong compile command. It seems that the issue is related to Flutter’s Dart compilation. I've opened an issue in the Flutter repository.