dart-lang / sdk

The Dart SDK, including the VM, JS and Wasm compilers, analysis, core libraries, and more.
https://dart.dev
BSD 3-Clause "New" or "Revised" License
10.24k stars · 1.57k forks

Object.hashCode is slow? #24206

Open asgerf opened 9 years ago

asgerf commented 9 years ago

Overriding Object.hashCode with a simple integer counter can provide a shocking performance improvement. For example:

class Foo {
  static int _hashCounter = 1;
  final int hashCode = ++_hashCounter; // Better than Object.hashCode
}

It seems the VM uses a random-number generator to create fresh hash codes, and then stores them in a weak hash table to associate it with the object. Profiling the dart2js compiler seemed to indicate that Object_getHash (which performs the lookup, not the computation) is a bottleneck.

Maybe the hash code could be stored as a field? There is something ironic about storing hash codes in a hash table.

I don't have a minimal reproduction case, but you can reproduce by running revision 90039f64ae of the dart2js compiler with the --cps-ir flag on the ImagingGaussianBlurOnce benchmark:

git checkout 90039f64ae
time dart -pout/ReleaseIA32/packages pkg/compiler/lib/src/dart2js.dart GOLEM/benchmarks/ImagingGaussianBlurOnce/dart/ImagingGaussianBlurOnce.dart --use-cps-ir

Then apply https://codereview.chromium.org/1311673009 to see the difference.

kodandersson commented 9 years ago

Regarding the current design: Since most objects will not have their hash code computed, dedicating a word in all of them was deemed wasteful. The default implementation of Object.hashCode uses an RNG rather than object addresses to allow for the GC relocating objects, although currently, we only relocate objects in the young generation.

munificent commented 9 years ago

For what it's worth, this was a noticeable performance issue in dartfmt, which I had to work around by having my own hash code base class.

iposva-google commented 9 years ago

To give you an idea of the overhead Daniel mentioned in his comment, I ran dart2js (the default version) compiling itself and collected the allocation information. From that I selected the top ten object allocations. Based on the average object size I can give you the percentage overhead of adding a field to each of these objects. None of the objects in this list would benefit from an extra hash code field, but since they are all instances, under your suggestion they would all have one. The size is the cumulative allocated size, to give an idea of how much more memory we would have to allocate and touch.

LinkEntry: +25% on top of 3.0 GB
List (fixed length): +10% on top of 1.7 GB
Closures: +25% on top of 877.7+ MB (I did not collect cumulative size of all closures, just the top 10)
List (growable): +25% on top of 447.0 MB
Uint32Array: +7% on top of 369.0 MB
CompactIterator: +10% on top of 234.7 MB
ListIterator: +17% on top of 224.9 MB
CompactIterable: +13% on top of 157.2 MB
InternalLinkedHashMap: +13% on top of 117.7 MB
UnionTypeMask: +50% on top of 94.1 MB

Given these overhead numbers, it should be pretty clear that adding an extra field for a hash in all objects adds significant memory overhead, reduces cache efficiency, adds runtime cost for allocation, adds extra GC pressure, and more.

For the comparatively few objects that need a hash value, you can either cache the value in an extra field, as you suggest, or rely on the address-based lookup of the hash value. Since the objects you need to put into identity maps are usually under your control, I suggest that you add the field to those. Then you get the best of both worlds: low overhead and fast hash access.

Regarding the comment about the use of a random number: This gives you a much better spread of your hash values than if you use addresses of objects as they tend to have alignment issues.

P.S. As Bob suggests, you should cap the counter value so as not to grow into arbitrarily large integers, or use a random number with sufficient bits.
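The capped-counter pattern suggested here can be sketched as below (the class name is illustrative, and the 0x3FFFFFFF mask is an assumed choice that keeps the value a non-negative Smi even on 32-bit):

```dart
class FastHashObject {
  static int _hashCounter = 0;

  // Eagerly cache an identity hash in a final field. The mask keeps
  // the counter a small non-negative integer, so it never grows into
  // a boxed Mint/BigInt; it simply wraps around after ~10^9 objects.
  @override
  final int hashCode = _hashCounter = (_hashCounter + 1) & 0x3FFFFFFF;
}

void main() {
  final a = FastHashObject();
  final b = FastHashObject();
  print(a.hashCode); // 1
  print(b.hashCode); // 2
}
```

Each instance pays one extra word, which is exactly the overhead weighed above, but hashCode becomes a plain field load instead of a runtime call.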

asgerf commented 9 years ago

I see, yeah a field is not going to work.

Could it be that the VM's hash table has a scalability issue, though? It seems that Object.hashCode is pretty fast until a critical point is reached, at which point it becomes REALLY slow.

Please look at this: https://docs.google.com/a/google.com/spreadsheets/d/1u2D-b6d6EwCsRZUGRcaBbDY74iafKI_6fAbqWYWbwTE/edit?usp=sharing

I'm fine with using the manual fix myself. Right now I'm just concerned about the unpredictable performance. It seems like one of those lurking problems that most people have to learn the hard way (e.g. because their web server suddenly becomes unresponsive).

iposva-google commented 9 years ago

Thank you for your concern. I just reran your example benchmark, with fixes which factor out process startup, allocation and compilation artifacts:

class Foo {
}

int test(int count) {
  var list = new List(count);
  var sum = 0;
  var c1000 = count / 1000.0;
  var foo;
  var sw = new Stopwatch();
  sw.start();
  for (var i = 0; i < count; i++) {
    list[i] = new Foo();
  }
  sw.stop();
  var lap0 = sw.elapsedMicroseconds / c1000;
  sw..reset()..start();
  for (var i = 0; i < count; i++) {
    sum ^= foo.hashCode;  // Ensure to not ever overflow into Mint or BigInt.
  }
  sw.stop();
  var lap1 = sw.elapsedMicroseconds / c1000;
  print("$count, ${lap0.toStringAsFixed(3)}, ${lap1.toStringAsFixed(3)}");
  return sum;
}

warmup() {
  print("** Warmump **");
  test(1000);
  test(10000);
}

main(List<String> args) {
  var arg1 = (args.length > 0) ? int.parse(args[0]) : 10000;
  var arg2 = (args.length > 1) ? int.parse(args[1]) : 300000;
  var arg3 = (args.length > 2) ? int.parse(args[2]) : 10000;
  warmup();
  print("** Test **");
  for (var c = arg1; c <= arg2; c += arg3) {
    test(c);
  }
}

Once these measurement errors have been factored out we can see that the time in microseconds taken per 1000 calls to foo.hashCode is remarkably flat, while the time taken to allocate 1000 Foo objects obviously fluctuates depending on whether we hit a GC in one of the allocation loops:

ReleaseX64/dart ~/dart/Bug24206.dart
** Warmump **
1000, 102.000, 85.000
10000, 49.600, 58.200
** Test **
10000, 8.000, 0.600
20000, 7.900, 0.550
30000, 7.367, 0.533
40000, 21.575, 0.475
50000, 9.700, 0.600
60000, 28.500, 0.550
70000, 66.014, 0.457
80000, 8.450, 0.525
90000, 8.956, 0.511
100000, 8.200, 0.520
110000, 38.900, 0.509
120000, 8.075, 0.717
130000, 8.454, 0.515
140000, 9.164, 0.514
150000, 8.513, 0.513
160000, 8.088, 0.519
170000, 8.535, 0.518
180000, 9.561, 0.517
190000, 9.826, 0.516
200000, 34.660, 0.515
210000, 10.610, 0.790
220000, 8.077, 0.459
230000, 8.778, 0.722
240000, 10.421, 0.650
250000, 47.688, 0.652
260000, 4.665, 0.631
270000, 4.289, 0.596
280000, 3.789, 0.475
290000, 3.931, 0.452
300000, 11.657, 0.513

I hope this also addresses your concern about unpredictable performance.

asgerf commented 9 years ago

"foo" is never assigned, so it's just computing null.hashCode all the time. After changing it to list[i].hashCode, the numbers in the second column get a lot bigger and also fluctuate.

Outside of the GC noise there does not seem to be an upward trend, though; the spikes themselves are likely due to the hash table, since they go away when hashCode is overridden.

I noticed that 32-bit Dart runs the benchmark a lot more smoothly. I don't know what to make of that; maybe it's just faster.

Here is a run from ReleaseX64
** Warmump **
1000, 406.000, 1998.000
10000, 87.100, 1621.800
** Test **
10000, 8.500, 169.400
20000, 8.000, 3670.400
30000, 7.133, 6722.400
40000, 32.375, 123.650
50000, 12.260, 130.960
60000, 215.483, 123.050
70000, 22689.400, 117.071
80000, 14.887, 116.862
90000, 13.800, 38990.778
100000, 11.480, 120.730
110000, 4237.600, 1237.027
120000, 11.717, 119.658
130000, 11.031, 125.138
140000, 10.664, 118.764
150000, 10.240, 123.527
160000, 10.031, 54502.912
170000, 9.724, 47196.353
180000, 6.489, 117.511
190000, 7.605, 121.763
200000, 78.675, 116.995
210000, 7.510, 119.119
220000, 9.391, 117.682
230000, 9.187, 119.209
240000, 9.092, 118.933
250000, 8.904, 116.860
260000, 3587.719, 116.077
270000, 6.344, 134.359
280000, 6.761, 122.357
290000, 14.097, 136.952
300000, 5.943, 119.670

iposva-google commented 9 years ago

Thanks! Another reason not to be writing code at midnight. :-(

I'll investigate where the occasional spike comes from.

mkustermann commented 6 years ago

We store identity hash codes on 64-bit platforms in the object header (by using free bits in the header). On 32-bit platforms we don't have the space in the header word and therefore call out to runtime (Object_getHashCode) which looks the identity hashcode up in a weak map (this runtime call is quite expensive).
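At the Dart level, the lazy weak-map scheme can be mimicked with an Expando, which also associates values with object identity without adding a field to the object (purely illustrative, not the VM's actual mechanism; identityHashOf is a made-up helper):

```dart
import 'dart:math';

// Expando holds its keys weakly, so an entry disappears once the
// object is garbage collected, much like the VM's weak table.
// (Expando cannot key numbers, strings, or booleans.)
final Expando<int> _identityHashes = Expando<int>('identityHash');
final Random _rng = Random();

// Lazily assign a random hash on first request and cache it.
int identityHashOf(Object o) =>
    _identityHashes[o] ??= _rng.nextInt(1 << 30);

void main() {
  final o = Object();
  final h1 = identityHashOf(o);
  final h2 = identityHashOf(o);
  print(h1 == h2); // true: later lookups return the cached value
}
```

The first access pays a table insertion and every later access pays a lookup, which is the per-call cost being measured in this issue.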

Possible solutions for making identity hash codes faster on 32-bit platforms include:

  1. Add an identity hash code field to all instances: the downside is a significant memory increase (as mentioned in the comments above).
  2. Intrinsify WeakTable::GetValue / WeakTable::SetValue in the Object_getHash lookup code to avoid going to the runtime (only go to the runtime for growing/re-hashing the table): this causes duplication between the C++ and intrinsic code and will still be slower than having it in a field.
  3. Use a bit in the object header to signal whether the address of the object is the identity hash code (or used as a base for it), and if the bit is false, look at the word after the object: this would require significant work (the GC would make objects bigger when moving, since they lazily get an identity-hash-code field appended) and is less than ideal since it increases complexity in the VM.

If it's really important to solve, I would suggest doing option 2 from above.

/cc @mraleph @ErikCorryGoogle

a-siva commented 6 years ago

The canonical const objects table currently uses the object header, and this has been a consistent problem for us.

I think we should re-evaluate the memory overhead of adding a hash value per object on 32-bit architectures and see if it makes things easier for us. We already have hash as a field in RawString, RawClosure, RawTypeArguments, RawType, RawTypeParameter, RawBoundedType, etc.

icatalud commented 5 years ago

@asgerf, @mkustermann What is the update on this issue? I just ran into this issue and posted it on Stack Overflow. This is a big issue because it is non-intuitive and unexpected from the user's perspective. A 20% memory allocation increase is "acceptable"; a program taking 10 times longer when switching from Set<int> to Set<Object> is not acceptable, especially because the developer has no clue that the cost of not having a primitive-type Set is so expensive. Knowing this beforehand would definitely have an impact on the way developers design their software when using Dart.

I learned this the hard way, looking everywhere in my code for some kind of memory leak that I couldn't find. I lost a whole day before reaching the conclusion that it was my design "improvement" (reducing the number of sets by joining values in one class) that was hurting performance so badly.

Even though on average it was better to have an inefficient hashCode rather than increasing the memory allocation of objects, you have to consider expectations and variability. A 10x variability is unacceptable. Ensuring consistency has a cost; it has always been like that. The numbers shown by @iposva-google are not a strong enough reason.

If this extra memory allocation could be an issue, I suggest creating an alternative Object, perhaps PrimitiveObject, with Object extending PrimitiveObject and adding a fast hashCode implementation.

Finally, why does it have to be so slow? Why is the hashCode not simply the memory allocation pointer, which must be unique for an Object?

mraleph commented 5 years ago

@icatalud Unless you were benchmarking performance of ARM32 or X86, I suspect that you were most likely bitten by a regression that was introduced somewhere in June (shipped with 2.5) and was recently fixed on master branch by 60328ca85868d7bda9185849cbb71311680a10d2.

When I run your benchmark on X64 Dart VM built at HEAD I get the following numbers:

Running app
--- Set benchmark ---
Warm app int Set average time: 11.637425811707203 us
Add Object in Set average time: 14.637567241190032 us
Int Set average time: 15.91362120958951 us
String Set average time: 22.9935388188225 us
Dynamic Set average time: 17.738263962182153 us
Object Set but value int average time: 20.344987538782362 us

Set summary:
Object/int time ratio: 0.9198137273978515
Object/String time ratio: 0.6365947998055752
Object/dynamic time ratio: 0.8251972838152154
Object/int(Object typed) time ratio: 0.7194679875466802

--- Map benchmark ---
Warm app int Map average time: 15.471756351146455 us
Add Object in Map average time: 22.106430718896455 us
Int Map average time: 22.935197183583135 us
String Map average time: 32.17157012562935 us
Dynamic Map average time: 28.252020002260142 us
Object Map but value int average time: 29.398327257761053 us

Map summary:
Object/int time ratio: 0.9638648642061859
Object/String time ratio: 0.6871418035418003
Object/dynamic time ratio: 0.7824725707092078
Object/int(Object typed) time ratio: 0.7519621958443379
-----------------------------------------------------
-------------END------------------------------------
-----------------------------------------------------

which seem fine to me.

On 64-bit architectures (ARM64 and X64) we in fact store identity hash code in the object header (because there is some free space there).

Why is the hashCode not simply the memory allocation pointer, which must be unique for an Object?

Objects can move around in memory (Dart has a moving garbage collector, which moves objects when promoting from the young to the old generation or when compacting the heap), so the pointer can change during execution.

There is a way to make this work (if an object has had its identity hash code used, you can enlarge it by one word when relocating), but it is rather complex, and given that 32-bit architectures might be going out of focus, it is not a high priority for us to implement.

icatalud commented 5 years ago

@mraleph Switching to master and upgrading solved the issue, as you pointed out. I'm running Dart on a 64-bit machine, so I'm assuming Dart is executing as a 64-bit binary. When running on stable, however, I get the following results:

Running app
--- Set benchmark ---
Warm app int Set average time: 11.044270804572312 us
Add Object in Set average time: 197.15485460818138 us
Int Set average time: 16.020401951282032 us
String Set average time: 26.158256820738185 us
Dynamic Set average time: 182.38646726244755 us
Object Set but value int average time: 20.39274024980882 us

Set summary:
Object/int time ratio: 12.306486142340772
Object/String time ratio: 7.53700278880501
Object/dynamic time ratio: 1.0809730434905713
Object/int(Object typed) time ratio: 9.667894171801148

I can assure you (from my own experience) that this could be producing serious confusion in other developers who are clueless about the big performance drops they might be experiencing. This is currently happening on stable.

My understanding about hashCode and equality is the following:

There is a way to make this work (if an object has had its identity hash code used, you can enlarge it by one word when relocating), but it is rather complex, and given that 32-bit architectures might be going out of focus, it is not a high priority for us to implement.

I do not think it is considerate to Dart users to take these issues so lightly. Consistency is an important aspect of every programming language. If you cannot include a specialized solution, this should at least be clearly documented. I already explained the cost that this had in my case. You cannot leave 32-bit behind with that heavy cost without a warning; it would imply some programs running 10 times faster on 64-bit machines. If so, 32-bit machines should simply not be supported, or the drawbacks should be clearly pointed out with warnings, or the use of Dart on 32-bit machines discouraged. This is important for developers to know when choosing a programming language. Imagine you are someone who chooses Dart as the language for a solution targeting 32-bit machines, when by choosing a different language you could have a 10x improvement. Even Python or JS perform better in this case; as an example, in Python this runs in 100 us, faster than Dart. It can be easily solved, but why would you know about this? If it is complex, then consider creating an alternative native Object32, which is automatically used when running on 32-bit machines. This Object could have an extra field and take more memory, which is fine as long as it is transparent. This way the behavior of Dart is consistent across platforms.

Objects can move around in memory (Dart has a moving garbage collector which moves objects when promoting from young to old generation or when compacting the heap) so pointer can change during the execution.

Even if the pointer moves around, the hashCode could be stored the first time with the current pointer value, and I would expect collisions to be sparse. Otherwise the hashCode could be created from a global id, like cachedHashCode ??= id++, which has the added advantage of ensuring uniqueness. Is there a security reason why it would be better to generate a random hashCode?

mraleph commented 5 years ago

This choice was reverted in a fix on master since now hashCode getter works fast.

No, this is not a correct understanding. On X64 we freed some space (32 bits, to be precise) in the object header so that we could store the hash code in an object without increasing the object size. This happened several years ago. However, in Dart 2.5 we had a regression - we accidentally disabled the fast way to access this hash code storage - which is what you observed in your benchmarks.

On 32-bit we don't have the luxury of free space in an object to put a hash code into - so for now we opted to leave this choice in the developer's hands. A developer can always override get hashCode and institute their own caching if needed. The core reasoning here is that most of the objects will never have Object.get hashCode called on them - so adding a field to all objects is a waste of memory (which is a precious commodity and also indirectly impacts performance via GC overhead).
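The opt-in caching described here can be sketched as a small base class (names are illustrative; the mask is an assumed choice to stay in Smi range):

```dart
abstract class CachedHash {
  static int _counter = 0;

  int? _hashCache;

  // Lazily computed identity hash: the first access stores the next
  // counter value, and subsequent accesses are a plain field read.
  @override
  int get hashCode =>
      _hashCache ??= _counter = (_counter + 1) & 0x3FFFFFFF;
}

class Node extends CachedHash {}

void main() {
  final n = Node();
  print(n.hashCode == n.hashCode); // true: stable across accesses
}
```

Only classes that opt in pay the extra word per instance, matching the "objects under your control" advice earlier in the thread.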

As you can see, we are keeping this bug open in the hope that one day we will also improve Object.get hashCode performance (there are well-understood ways to do that) - but currently it is not high on our list of priorities, given that easy workarounds are available.

Even if the pointer moves around, the hashCode could be stored the first time with the current pointer value and I would expect collisions to be sparse.

To store it you need some space in the object - which is what we want to avoid (on 32-bit platforms).

icatalud commented 5 years ago

However, in Dart 2.5 we had a regression - we accidentally disabled the fast way to access this hash code storage - which is what you observed in your benchmarks.

This is still an issue on stable 64-bit, which is what most people probably use. Shouldn't it get hotfixed or something?

but currently it is not high on our list of priorities, given that easy workarounds are available.

The problem is not that there is an easy workaround; the problem is that this is unknown, and there is no warning about it. Put yourself in the shoes of a developer who decided to port their Python machine learning code (or anything else you can think of that requires lots of hash operations) into the Dart language. After a lot of work, to their disappointment, the Dart code is slower than the Python code. They have no clue why. Sure, there is an easy fix, but this is unknown and unintuitive; what would lead them to think that it is the hash data structures they are using that is making their program underperform?

The core reasoning here is that most of the objects will never have Object.get hashCode called on them - so adding a field to all objects is a waste of memory (which is a precious commodity and also indirectly impacts performance via GC overhead).

The cost of variability is not being weighed correctly. Making things work more efficiently in most cases might be slightly better on average, but the cost for the people who lose is too high. If you had suffered this yourself, you would think differently; the problem is that you already know about it, so it's natural for you to take it into account. Like I said, increasing memory utilization by 20% (I suspect that in most programs it would be much less than that) is acceptable, but dropping performance by an order of magnitude is not. The default use of objects in hash data structures should be efficient, because their use is common in any programming language. As precious as memory might be, consistency is even more precious.

There is always a sacrifice of average outcomes for the sake of consistency, this rule applies to all human systems (from computer programs to the free market and the law). Taking care about consistency is having respect for humans and not treating them like numbers, it means understanding that it’s not fair that some people randomly get the bad stuff from the box surprises, because we wouldn’t want to be those ones.

On 32-bit we don't have the luxury of free space in an object to put a hash code into

It's completely acceptable to state that Dart is designed to be more efficient on 64-bit machines, and for now to pay the cost of extra memory on 32-bit machines by adopting the simple solution of an extra field in 32-bit Objects. It is even possible to still offer the option of creating lighter objects by having two classes, Object and PrimitiveObject, but the default Object that most people use should be the one that is consistent. If someone was optimizing for memory, they could switch to the lighter PrimitiveObject. I believe that people for whom the extra 20% memory is relevant are a rarer case than those accessing the Object hashCode. Even some basic data structures like List or Map could extend this PrimitiveObject.

Update: I deleted comments that didn't add content to the issue. The possible solution I posted was already specified here, and the question before was off topic as pointed out below.

mraleph commented 5 years ago

What if an object X is moved and it was being referenced from many other objects?

Then you need to update references from other objects to this object. You know all of these references anyway when doing precise garbage collection. This is classical garbage collection stuff (see wikipedia for example or a more specialized literature if you have access to it). In any case this is off topic to this bug.

munificent commented 5 years ago

This is still an issue on stable 64-bit, which is what most people probably use. Shouldn't it get hotfixed or something?

I think most Dart programs are not at all performance bound by Object.hashCode performance so are unlikely to be significantly affected by the regression. It will be fixed in the next stable release, and in the next dev release. We generally don't put out stable hotfix releases unless there are significant issues like security problems, crashes, etc.

I believe that people for whom the extra 20% memory is relevant is a rarer case than accessing the Object hashCode.

Making every object larger affects the performance for all Dart users. Increasing object size reduces the number of objects which fit in cache, which leads to more cache misses, which in turn harms performance. Making objects smaller at the expense of a slower Object.hashCode only affects the smaller number of users who use that method.

icatalud commented 5 years ago

I think most Dart programs are not at all performance bound by Object.hashCode performance so are unlikely to be significantly affected by the regression.

That comment completely dismisses my previous comment about taking into account everyone and not just averages (or "most of"), especially in regard to high variability.

You are right about 99% of the cases not being affected, because even if a program runs 10 times slower it still does what it has to do fast enough (slow languages perform well in most use cases), but you have no idea how much time this is costing the 0.1% of cases. I actually stumbled upon this problem a month ago, when just by chance I was unsatisfied with the performance of a library I have been working on. I changed the way I was registering objects by giving them an int id and manipulating everything internally with this id. The benchmarks improved by 4x, and I thought: wow, my new design is brilliant; I'm taking advantage of int Sets that have some special implementation and are super fast. I could believe this was the case, thinking maybe it had to do with using contiguous numbers and Set<int> being optimized to use a list in some cases. But that was just one possibility; I really had no clue. I changed my mind later and realized the id was completely useless in terms of architecture (I was just adding an extra value), yet it was impossible for me to make the simpler code work fast. Finally, this weekend I reached the most minimalist design I could get to. Surely it had to be faster, but it was 5 to 7 times slower. And that is when I decided to benchmark Set and figure out what was going on.

You are not measuring the cost of inconsistency (variability) fairly; you cannot be happy that most people (do you have numbers to support that claim, anyway?) are fine while leaving a small percentage in the trash. I told you my story, but there must be other developers who suffered this.

Making every object larger affects the performance for all Dart users.

Almost any developer would like to know that their programs are consistently running according to the complexity bounds of the data structures they are using, and that they don't have to wonder if they did something that leaves their program in the 1% case where it works slower than Python. This applies to 32-bit Dart developers too. Once again, consistency is key. If 32-bit programs have to run 5% to 10% slower on average, but you can ensure that my program is always going to perform as expected and I don't have to worry about details, then I accept it.

Making objects smaller at the expense of a slower Object.hashCode only affects the smaller number of users who use that method.

If we are talking about the quantity of users or the average use case, that statement lacks the numbers to support the claimed amount of impact. The 20% to 50% memory increase comes from extreme-case benchmarks using minimal native data structures. I would argue that most users create classes, and the classes have plenty of fields inside, but I do not have numbers to support that claim. I would also argue that many users (if not most) use the Map (json) or Set data structures and put objects inside them. In those cases the performance gain might outweigh the performance cost caused by the slight memory usage increase.

In conclusion:

We generally don't put out stable hotfix releases unless there are significant issues like security problems, crashes, etc.

According to what happened to me, and what might be happening to others, it is inconsiderate to them not to consider this a significant issue.

icatalud commented 5 years ago

Having a second thought about this issue today, I realized that no one has explained with a clear rationale what is going on. Theoretically, the hash-table solution that is implemented shouldn't be as bad as the benchmarks show. Whenever the hashCode is accessed, it is probably to use it within a hash data structure, so getting the hashCode from a hash table is like duplicating the hash data structure's effort. By that logic the additional time shouldn't go beyond 2x or 3x (3x because of the extra overhead). So why does it reach 10x? I thought the benchmark was not representative because it was adding one value, but I attempted the same with the set filled with random values and the result is similar. I thought that maybe the hashCode getter is being called more than once, but that is not the case.

In summary, there is something odd with the implementation; the performance drop shouldn't be this bad.

Aside from that comment, I think I found a way to convince you @munificent and @mraleph that the current implementation is wrong (at least until it is justified as beneficial). Imagine it was the other way around and the hashCode was stored within the object. If someone came and told you: "I have an idea: we can optimize Dart execution by reducing the amount of memory objects take. The hashCode is taking one memory slot in every object, but the hashCode is hardly ever used. What we could do is remove the hashCode variable from the objects and store the hashCodes of only the objects that actually use the hashCode in a secondary hash table. This will have a cost in the time it takes to retrieve the hashCode, since it has to be retrieved from a map (hash table), but overall, since hashCode is not used very much in Dart programs, it is possible to save up to 33% memory. On average I expect a reduction of 5% to 10% in memory usage." Would you accept that change just like that? Do you think the Dart team would approve it?

Certainly not; there are quite a few reasons against that proposition. Starting from the expectation that developers have about the language that using a getter is almost like directly retrieving a value, to the actual impact this could have, since retrieving the value from a hash table implies at least doubling the cost of hash data structure retrieval operations, which makes one wonder if it is actually a good idea to have a hashCode getter at all. Perhaps it would be less misleading to remove the getter if it is hardly ever used, and force users to create the getter if they need it in hashable Objects, or to extend a HashableObject class instead of Object. Even then, it could and should be considered, but the claims being made would have to be proven; numbers would have to be run to show that the change actually does more good than harm and that the variability it creates is constrained to reasonable levels. Then perhaps the team would be convinced that it is a good idea.

The case with the hashCode implementation is that it was adopted without being thoroughly justified; it is exactly the opposite of what I described above. Currently you are in the position where all the checks were skipped and things are upside down. Everyone expects and thinks that hashCode is just a value stored in the object, but in reality it works differently. Somehow this has been accepted as normal, and no one has a clue how much negative impact this could be having.

There is only one reason in favor of that implementation, a possible performance gain, and that is only a maybe, because no one knows it for sure. But there is a list of reasons against it (I listed them in the previous comment).

We generally don't put out stable hotfix releases unless there are significant issues like security problems, crashes, etc.

About that comment, here is another way to put it. Imagine exactly the same problem happened after a C++ update: some programs would run 10 times slower. Can you imagine how significant that would be, or the impact it would have on Google servers? It could be a disaster, and in fact it would be better if the latest update caused the programs to crash, because then people would switch back immediately to the previous version of the compiler. But if programs are unexpectedly running 10 times slower, unless the users had a performance benchmark, they would naively think that their software is in a better spot with the new stable version of the compiler. Because Dart is not so widely used, these bugs go more unnoticed, but you should apply the same care you would when updating the C++ compiler. Remember that when one person posts about an issue it usually means it happened to 10 other people, because only a small percentage of people post issues. About the performance story I told you, here is the link to the code I "optimized" by luck when using an int id instead of objects.

One of the general problems I see whenever I post something is that I read a lot of reasons based on arbitrary statements. Arbitrary statements are dangerous, because they make us feel we are right about the choices we make, when there is an open possibility that the statement itself is completely wrong. Decisions should be made based on rational, objective facts (within the possibilities of acquiring them).

munificent commented 5 years ago

Would you accept that change just like that? Do you think the Dart team would approve it?

Yes, because that is what happened, at least as I understand it. Many of the original Dart team members came from V8. My understanding is that V8 does pay the memory and performance cost for storing the hash code in every object. They didn't feel that was a good optimization, but couldn't remove it because it would regress some benchmarks, even if it made most JS code faster on average.

When they created Dart, they deliberately avoided repeating what they considered to be a mistake.

Starting from the expectation that developers have about the language that using a getter is almost like directly retrieving a value,

That expectation doesn't exist in Dart. Iterable.length can take infinite time:

```dart
Iterable<String> get forever sync* {
  while (true) yield "forever";
}

main() {
  print(forever.length); // never completes: the iterable is infinite
}
```

makes one wonder whether it is actually a good idea to have a hashCode getter at all. Perhaps it would be less misleading to remove the getter, since it is hardly ever used, and force users to create the getter themselves if they need it in hashable objects, or to extend a HashableObject class instead of Object.

That was also part of the initial design of Dart. Originally, Object had no hashCode member specifically to avoid this issue. The key problem with that is that if a class you don't control happens to not make itself hashable, you have no way to store it in a Map or Set. That turned out to be a very painful limitation in practice, so with some reluctance the team made all objects hashable. The trade-off was, as you see here, that it's not the most efficient possible implementation because they didn't want to pay a memory and speed price for the majority of objects which are never used in hashes.

Basically, you can have three things:

  1. Better memory use and speed for objects that are not hashed.
  2. A fast Object.hashCode.
  3. Support for all objects as hash keys.

But you only get to pick two. Dart picks 1 and 3. When 2 is a problem, it's easy to implement hashCode in your own class to fix it. If Dart picked 2 and 3, like you suggest, then there is no way for an end user to solve the problem introduced by not having 1.
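As a concrete illustration of picking option 2 for a single class, the workaround from the original report is a stored, incrementing counter; a minimal sketch (the class name here is hypothetical):

```dart
class Node {
  // A monotonically increasing counter: computing the hash is a plain
  // field read, at the cost of one extra word per instance.
  static int _hashCounter = 0;

  @override
  final int hashCode = ++_hashCounter;
}

void main() {
  final seen = <Node>{};
  seen.add(Node());
  seen.add(Node());
  print(seen.length); // prints: 2
}
```

Equality stays identity-based, so each instance is a distinct key; only the hash computation changes.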

icatalud commented 5 years ago

Yes, because that is what happened, at least as I understand it

It surprises me that it happened like that. Even so, you shouldn't accept it as ok just because that's how they did it; you should consider whether that is rationally a good way to make decisions and whether you think things should be done like that.

They didn't feel that was a good optimization

Decisions shouldn't be based on feelings; they should be based on facts. There is a "pressure" for Dart to be more efficient than Node.js, to prove it is worth switching to, so feelings are likely to align with stretched reasoning that concedes reliability in favor of performance (it's not necessarily like that, but these kinds of things hide behind feelings). I think you shouldn't compare to anything but Dart itself: just make a solid, reliable language in all aspects, and once you have that, you can optimize the VM as long as the behavior stays consistent. I believe granular performance is hardly ever taken into account; as proof of that claim, there are tons of Python servers which probably run more than 5 times slower than Node.js. What that means is that most people are not choosing a language based on performance; it's only one of the factors taken into consideration.

That expectation doesn't exist in Dart. Iterable.length can take infinite time

You are right that the expectation doesn't exist in Dart, as specified in the Effective Dart design section (I just read it). It should be mentioned in the tour. In the design section they do mention that most languages expect getters to perform minimal, fast operations. If new developers come, they are probably going to have that expectation, as I intuitively expected it, at least with regard to native getters.

When 2 is a problem, it's easy to implement hashCode in your own class to fix it

But that would need to be clearly communicated; is that what the casual Dart developer is going to do? The general problem I'm seeing is that you are innovating in a language's behavior, which is fair as long as it is clearly pointed out. But it's not: I had no clue about it, it is not specified in the Object.hashCode docs either (I discovered it through benchmarking), and it is not consistent; depending on the platform it has a variability of 10x. Sometimes it is a good idea to override, sometimes it's not. These innovations are debatable; personally, I think it's better to relieve developers from having to take more concepts into account and make things smooth, working as expected in all cases. This usually means programs will be slightly less efficient, but that's the cost of layers of abstraction. A garbage-collected language is expected to have an overhead, but it relieves developers of the burden of having to keep track of more things when developing, ideally allowing them to focus on the general ideas of the problem they are solving.

The key problem with that is that if a class you don't control happens to not make itself hashable, you have no way to store it in a Map or Set.

Exactly because the use cases are unpredictable, I think it's better to aim for consistency. If it's truly so important to save the space, perhaps a better solution would be to automatically wrap the object with a proxy object that has a hashCode whenever the getter is used. I'm not aware of the internals, though; I will ask on Stack Overflow about that.

Basically, you can have three things:

That's a good segmentation of the trade-offs; however, when deciding which options to take, it's not just about picking randomly. Every one of those points should be thoroughly evaluated, thinking through its implications. About number 2: what are the implications for developers? Will it make them freer? Are people coming from JS going to take it into account? What will happen if people forget to override the hash code, and what will be the implications for their program? Is it intuitive? And so forth.

When making decisions about a product, they shouldn't be based on feelings; they should be based on rational thought and facts. You should put yourself in the shoes of the consumers: what do they want, and how do they behave? I was very happy with some things in Dart, but these kinds of things make me feel completely insecure about it; I don't believe the language is reliable anymore, because under the hood there are unexpected design choices that can have a huge impact on my program (this is not the first one, but it's the first where I found that my Python program was faster). What else am I doing wrong that I haven't found out yet?

mraleph commented 5 years ago

I think this thread is not going anywhere, so I respectfully request to take this discussion somewhere else.

@icatalud we have heard your arguments; there is no need to repeat them. No amount of polemics on this issue is going to change the outcome or its prioritisation.

We recognise that the Object.get hashCode implementation could be improved on 32-bit platforms, and we are aware of a way to do that without increasing memory consumption; that is why we keep this issue open. However, our resources are limited and we have a lot of things to work on. This particular issue is pretty far from the top of the list, because we believe that it (1) rarely causes significant performance issues in real-world applications (as opposed to microbenchmarks) and (2) when it does cause performance issues, they are easy to diagnose with a profiler and work around.

If you have an example of an application where (1) and (2) are both violated - then we would be happy to take a look - and that (unlike the discussion we are having here) would in fact change how we look at this issue.

We would also be happy to take a patch addressing this issue on 32-bit platforms. If you feel like contributing we can explain what needs to happen.

icatalud commented 5 years ago

@mraleph Thank you for the update; I didn't know what the status of this was, and no one had said anything about leaving it open or addressing it. That's why I kept explaining my point, and also because I found the given answers unsatisfying.

Keep in mind that this issue has been open for 4 years; I thought you would just leave it open. I was trying to make the point that this is important even if (1) and (2) are rare, because if they are happening in servers, for example, it's most likely that people are not noticing. Usually performance is not that important, as Python servers are good enough for most applications. I happened to measure this because good performance matters in the library I write, but I believe it's more common than you believe. Remember this problem is currently also happening on stable 64-bit.

Obviously this is important to me now, and that's why I complain. I hope someone else raises the issue too. Dart doesn't have the developer mass of popular languages, making issues less likely to be discovered or pointed out.

If you feel like contributing we can explain what needs to happen.

I don't commit beforehand, but please explain what needs to happen to contribute.

mraleph commented 5 years ago

I don't commit beforehand, but please explain what needs to happen to contribute.

One possible way to fix this on 32-bit is to use 2 bits in the object header to encode whether the object has had its Object.get hashCode called and whether it was moved afterwards, i.e. an object can be in three states: {default, hashCode-used, hashCode-used-and-relocated}.

Object.get hashCode would check which state the object is in:

- default - compute the hash code based on the address and mark the object hashCode-used
- hashCode-used - compute the hash code based on the address
- hashCode-used-and-relocated - load the hash code from the extra field

The GC needs to be modified so that, for each object it moves, it checks whether the object is in the hashCode-used state and, if so, grows it by one field, stores the hash code (computed from the original address) in that field, and marks the object hashCode-used-and-relocated.
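A toy model of this state machine may make the transitions concrete. This is Dart for illustration only, not the VM's C++; all names (SimObject, identityHash, etc.) and the hash mixing function are invented:

```dart
// Hypothetical model of the two-header-bit scheme; just the state transitions.
enum HashState { initial, hashUsed, hashUsedRelocated }

class SimObject {
  HashState state = HashState.initial;
  int address;        // simulated object address
  int storedHash = 0; // the extra field, filled in only after relocation

  SimObject(this.address);

  // Stand-in for deriving an identity hash from the object's address.
  static int hashFromAddress(int addr) => (addr * 2654435761) & 0x3fffffff;

  int get identityHash {
    if (state == HashState.hashUsedRelocated) {
      return storedHash; // object moved after its hash was observed
    }
    state = HashState.hashUsed; // mark the hash as observable from now on
    return hashFromAddress(address);
  }

  // Simulated GC move: only hash-used objects grow the extra field.
  void relocate(int newAddress) {
    if (state == HashState.hashUsed) {
      storedHash = hashFromAddress(address); // preserve the pre-move hash
      state = HashState.hashUsedRelocated;
    }
    address = newAddress;
  }
}

void main() {
  final o = SimObject(0x1000);
  final before = o.identityHash; // computed from address, marks hashUsed
  o.relocate(0x2000);            // GC move preserves the hash
  print(o.identityHash == before); // prints: true
}
```

Objects whose hash was never requested stay in the initial state and pay no extra word even when moved, which is the point of keeping two separate "used" bits.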

icatalud commented 5 years ago

That's probably a better solution than storing the hashCode in all objects.

hashCode-used - compute hash code based on the address

Why are both "default" and "hashCode-used" states necessary? Couldn't they be the same default state? If it is default, the hash code is computed from the address; otherwise, it has been relocated. Would that be enough?

Could you point out the key code pieces, so that I can have a head start?

mraleph commented 5 years ago

Why are both "default" and "hashCode-used" states necessary

Because you want to distinguish objects that have had their hash code used from those that have not: you need this information during relocation to decide whether you need to add a hash code field to the object.

Could you point out the key code pieces, so that I can have a head start?

VM source is in runtime/vm. Object header tags are described in runtime/vm/raw_object.h. GC is in runtime/vm/heap.

icatalud commented 4 years ago

@asgerf are you a Dart developer? If not, would you be interested in collaborating on this issue?

I made important progress; basically, what is missing is figuring out why using a default hash in the GetHash() method causes a failure. The hash field is available because I'm testing on a 64-bit architecture, but the logic is the same, so it should work on 32-bit. This is the PR, in case you wish to participate.

asgerf commented 4 years ago

@icatalud Sorry, I'm not using Dart these days and I'm afraid I wouldn't be of much use here anyway

icatalud commented 4 years ago

@asgerf wrote:

It seems like one of those lurking problems that most people have to learn the hard way (e.g. because their web server suddenly becomes unresponsive).

I cannot see how you would not be of much use here when it was you who detected the bug originally, and all of your statements came true; I arrived here 4 years later under exactly the circumstances you mentioned.

Overall, what insight would you give about this issue? What would be your preferred choice? Consider that the Dart developers claim there is a significant increase in memory usage when adding a hash_ field on 32-bit architectures. I would say that a 20% impact applies to specific native data-structure benchmarks, but it should be far less in real applications. I'm implementing a rather complex solution, but it seems to be the only way to currently provide consistency on 32-bit platforms. In my opinion, it should be the other way around: Dart on 32-bit should be consistent by default, and the work I'm doing should be an optimization.

Consider this a Dart VM C++ challenge, not a Dart language one. Your help would be much appreciated by dozens (hundreds?) of people, regardless of whether you use Dart, and your work will be documented (you become one of the language's historical contributors). It's also an interesting learning experience to see how the internals of the VM work. I, at least, have learned a lot; thanks to it, I have now designed my own ideal VM (in my head, not in reality). If you wished to invest some time in this, it would be truly helpful.

icatalud commented 4 years ago

@kodandersson, @iposva-google, @lrhn , @mkustermann, @a-siva, @munificent,

I ping you because I embarked with Vyacheslav Egorov (@mraleph) on solving this issue for 32-bit (the only current problem) using the third option described in this comment. The solution has been virtually ready for almost a month, but @mraleph has been away lately and the cadence of updates has been very sparse during the last month. He pointed out in his last comment that he is currently away. I would really appreciate it if any of you could help finish this, or tell me straight away that you are not interested in this implementation, so I can drop this issue for good (and be relieved of the responsibility).

If the solution is considered an improvement to the VM and you are interested:

I understand that replying takes time, and committing to the issue even more. But even if none of you want to interact with me, simply triggering the builds from Gerrit (currently it builds only on my machine) would be truly helpful. The way it would work is that whenever I update the code, I'll ping you with "@your_name update please" at most once a day, and that's it. If you are willing to comment, that's even better. According to the comments, only a change in the snapshots is missing, and if someone can give me a script or command to test that, it would be very helpful. There is currently a build that has been waiting to be triggered in Gerrit for almost two weeks. If you are ok with triggering the builds but do not wish to get involved in any kind of explanations, simply comment "I can trigger the builds" and I'll start pinging you with only the message described.

This is the PR. Thank you.