mraleph / irhydra

Tool for displaying IR used by V8 and Dart VM optimizing compilers
Apache License 2.0

How are blocks marked as unreachable #50

Closed tusharmath closed 7 years ago

tusharmath commented 7 years ago

Can you please help me understand why a set of blocks was marked as unreachable? Here is an example:

run()

[Screenshot: 2016-10-01 at 3:59:32 pm]

Archive.zip

mraleph commented 7 years ago

I looked at the IR for B22. It contains nothing but a deoptimization, which is why it is marked as a dead block. You can see the logic for marking here.

mraleph commented 7 years ago

Essentially, greyed-out blocks mean "don't look at these blocks, they are not interesting (i.e. they can't be reached or don't contain anything interesting)".

tusharmath commented 7 years ago

I think the opacity misled me. I am new to VM optimizations, so please forgive me for being ignorant or rather dumb.

Below are IRs for two implementations of the subscribe function. The fast one is 4.8 times faster than the slower one but it is also marked with the red dotted border.

slow.zip

    for (var i = 0; i < this.array.length; ++i) {
      observer.next(this.array[i])
    }
    observer.complete()
    return subscription

fast.zip

Wrapping complete with an end function makes it faster?

    for (var i = 0; i < this.array.length; ++i) {
      observer.next(this.array[i])
    }
    end()
    function end () {
      observer.complete()
    }

    return subscription

I have spent three days trying to figure out the why of it. Can you please help me?

mraleph commented 7 years ago

@tusharmath do you have a benchmark I could run locally? It would be easier for me to figure things out if there was something like that.

tusharmath commented 7 years ago

I have pushed the code here — https://github.com/tusharmath/observable-air

To create files for IRHydra —

The two versions of code —

FAST https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L13

SLOW https://github.com/tusharmath/observable-air/blob/master/src/sources/From.ts#L27

mraleph commented 7 years ago

I checked this out. Instead of looking at the subscribe function, you should look at the benchmark function itself (filter by defer to find it in the list). If you follow the chain of inlined functions until you arrive at subscribe, you will discover that in the slow case From2Observable.subscribe itself is inlined into the benchmark function. However, the small functions that perform filtering and other operations are not inlined into it.

In the fast case FromObservable.subscribe is not inlined into the benchmark (and small functions are inlined into it).

This is where the performance difference comes from. The loop inside From2Observable.subscribe is the hottest loop in the benchmark, so when the small functions are not inlined into it in the slow case (because the inliner runs out of depth budget), performance degrades.
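The depth-budget effect can be illustrated with a toy model. This is only a sketch and not V8's actual inliner: the call graph, the function names (`subscribe1`, `fromSubscribe`, `predicate`) and the `inlinable` flag are assumptions made up for illustration, and the limit of 5 is the depth limit mentioned in this thread.

```javascript
// Toy model of depth-limited inlining. This is NOT V8's real algorithm.
const MAX_INLINING_LEVELS = 5; // depth limit mentioned in this thread

// Walks the call graph from `root` and records which callees would be
// inlined and which would be rejected, together with the reason.
function simulateInlining(graph, root, maxDepth = MAX_INLINING_LEVELS) {
  const inlined = [];
  const rejected = [];
  function visit(name, depth) {
    for (const callee of graph[name].calls) {
      const calleeDepth = depth + 1;
      if (!graph[callee].inlinable) {
        rejected.push({ callee, reason: "not inlinable" });
      } else if (calleeDepth > maxDepth) {
        rejected.push({ callee, reason: "depth budget" });
      } else {
        inlined.push({ callee, depth: calleeDepth });
        visit(callee, calleeDepth);
      }
    }
  }
  visit(root, 0);
  return { inlined, rejected };
}

// Hypothetical call chain (all names are illustrative):
// benchmark > subscribe1 > subscribe2 > subscribe3 > fromSubscribe > next > predicate
const slowGraph = {
  benchmark:     { inlinable: true, calls: ["subscribe1"] },
  subscribe1:    { inlinable: true, calls: ["subscribe2"] },
  subscribe2:    { inlinable: true, calls: ["subscribe3"] },
  subscribe3:    { inlinable: true, calls: ["fromSubscribe"] },
  fromSubscribe: { inlinable: true, calls: ["next"] },
  next:          { inlinable: true, calls: ["predicate"] },
  predicate:     { inlinable: true, calls: [] },
};

// Fast case: fromSubscribe is assumed to be refused by the inliner, so it is
// optimized on its own and its callees start again at depth 1.
const fastGraph = {
  ...slowGraph,
  fromSubscribe: { ...slowGraph.fromSubscribe, inlinable: false },
};

const slow = simulateInlining(slowGraph, "benchmark");
const fast = simulateInlining(fastGraph, "fromSubscribe");

console.log(slow.rejected); // predicate is rejected: "depth budget"
console.log(fast.inlined);  // next at depth 1, predicate at depth 2
```

In this model the hot loop's small callee only fits within the budget when the function containing the loop becomes its own optimization root.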

Writing your code like this:

    subscribe(observer) {
        // ____     ___  ____   __________              _        ____   ___       ___
        // `MM'     `M' 6MMMMb\ `MMMMMMMMM             dM.      6MMMMb\ `MMb     dMM'
        //  MM       M 6M'    `  MM      \            ,MMb     6M'    `  MMM.   ,PMM
        //  MM       M MM        MM                   d'YM.    MM        M`Mb   d'MM
        //  MM       M YM.       MM    ,             ,P `Mb    YM.       M YM. ,P MM
        //  MM       M  YMMMMb   MMMMMMM             d'  YM.    YMMMMb   M `Mb d' MM
        //  MM       M      `Mb  MM    `            ,P   `Mb        `Mb  M  YM.P  MM
        //  MM       M       MM  MM                 d'    YM.        MM  M  `Mb'  MM
        //  YM       M       MM  MM                ,MMMMMMMMb        MM  M   YP   MM
        //   8b     d8 L    ,M9  MM      /         d'      YM. L    ,M9  M   `'   MM
        //    YMMMMM9  MYMMMM9  _MMMMMMMMM       _dM_     _dMM_MYMMMM9  _M_      _MM_
        //
        for (var i = 0; i < this.array.length; ++i) {
          observer.next(this.array[i]);
        }
        observer.complete();
        return subscription;
    }

would also make it faster (on the current node) because V8 would refuse to inline the "large" function subscribe, since the big comment counts toward its source size.

However, this case is not representative: in the real world it's unlikely that code will be this monomorphic. As soon as you start benchmarking more realistic code, e.g. code where subscribe is not monomorphic with respect to observer, you will discover that the slower case is more realistic than the faster one.
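One way to make a benchmark less monomorphic is to feed the same subscribe loop observers of several different classes. The following is a minimal sketch, assuming hypothetical `ArraySource`, `SumObserver`, `CountObserver` and `MaxObserver` classes that are not part of observable-air. How a particular V8 version treats the resulting call site is not shown here and would have to be measured.

```javascript
// Hypothetical source with the same loop shape as in the thread.
class ArraySource {
  constructor(array) {
    this.array = array;
  }
  subscribe(observer) {
    for (var i = 0; i < this.array.length; ++i) {
      observer.next(this.array[i]); // this call site sees every observer class
    }
    observer.complete();
  }
}

// Three hypothetical observer classes, each with its own next/complete methods.
class SumObserver {
  constructor() { this.total = 0; this.done = false; }
  next(value) { this.total += value; }
  complete() { this.done = true; }
}

class CountObserver {
  constructor() { this.count = 0; this.done = false; }
  next() { this.count += 1; }
  complete() { this.done = true; }
}

class MaxObserver {
  constructor() { this.max = -Infinity; this.done = false; }
  next(value) { if (value > this.max) this.max = value; }
  complete() { this.done = true; }
}

const source = new ArraySource([1, 2, 3, 4]);
const sum = new SumObserver();
const count = new CountObserver();
const max = new MaxObserver();

// Passing observers of different classes through the same subscribe means
// observer.next is no longer called on a single receiver type.
for (const observer of [sum, count, max]) {
  source.subscribe(observer);
}
```

A benchmark built this way exercises subscribe with more than one observer type, which is closer to the "more realistic code" described above.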

mraleph commented 7 years ago

[Animation: 20161002212639]

tusharmath commented 7 years ago

Perfect! Thank you so much @mraleph.

But how did you know that I should be searching for the defer function? There are so many.

(filter by defer to find it in the list)

mraleph commented 7 years ago

You can use an undocumented trick: put src:from2 in the filter. This filters the method list to only include those methods that contain from2 in their sources.

tusharmath commented 7 years ago

@mraleph

small functions are not inlined into it, because inliner runs out of depth budget

  1. How would one figure out that the function wasn't inlined because of the depth budget?
  2. Can I view the depth in IRHYDRA?
  3. What could be other reasons for the function not getting inlined (or inlined)?

mraleph commented 7 years ago

How would one figure out that the function wasn't inlined because of the depth budget?

Well, I know that the inlining depth limit is 5, so it's easy to see, because the function otherwise looks inlinable :)

You can use --trace-inlining to trace inlining decisions (what was inlined and what was not), but then you have to read the output yourself. IRHydra does not parse that for you.
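Reading the trace by hand gets easier if it is narrowed down to one function first. A minimal sketch, assuming the trace was saved to a text file; `bench.js`, `trace.txt` and the helper `linesMentioning` are hypothetical names, and nothing is assumed about the trace format beyond it being line-oriented text.

```javascript
// Capture the trace first (bench.js and trace.txt are hypothetical names):
//   node --trace-inlining bench.js > trace.txt

// Returns only those lines of the trace text that mention `functionName`.
// It does not interpret the lines; it is a plain substring match.
function linesMentioning(traceText, functionName) {
  return traceText
    .split(/\r?\n/)
    .filter((line) => line.includes(functionName));
}
```

It could then be used as `linesMentioning(require("fs").readFileSync("trace.txt", "utf8"), "subscribe")`. The flag name is the one given in this thread and may differ on other V8 versions.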

Can I view the depth in IRHYDRA?

If you look at the animation above you can see the inlining path suite.add.defer > subscribe > subscribe > subscribe > subscribe > next. Each > represents one level of inlining, which for example means that next was inlined at depth 5, which is the limit (controlled by the flag --max_inlining_levels).
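Counting the `>` separators can be done mechanically. A small sketch, assuming the path has been copied out of IRHydra as plain text; `inliningDepths` is a hypothetical helper and not part of IRHydra.

```javascript
// Splits an inlining path on ">" and assigns each function its depth.
// The outermost (optimized) function gets depth 0.
function inliningDepths(path) {
  return path
    .split(">")
    .map((name) => name.trim())
    .map((name, depth) => ({ name, depth }));
}

const path = "suite.add.defer > subscribe > subscribe > subscribe > subscribe > next";
const depths = inliningDepths(path);
const deepest = depths[depths.length - 1];

console.log(deepest); // { name: 'next', depth: 5 }
```

Any function called from next would land at depth 6, beyond the limit of 5.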

What could be other reasons for the function not getting inlined (or inlined)?

There are plenty. The function can be non-optimizable by Crankshaft (e.g. it uses some unsupported ES6 construct). Its source can be too big. It can be too big in terms of IR instructions, etc.

tusharmath commented 7 years ago

Thanks