HaxeFoundation / haxe

Haxe - The Cross-Platform Toolkit
https://haxe.org
6.15k stars 656 forks source link

Analyzer Fusion order of operations #10842

Open Simn opened 1 year ago

Simn commented 1 year ago

I noticed that the analyzer's fusion module struggles with a particular pattern:

@:analyzer(fusion_debug)
function main() {
    var a = getValue();
    var b = getValue();
    var c = [a, b];
    trace(c);
}

@:pure(false)
function getValue() {
    return 1;
}

Here's what happens:

  1. It attempts to fuse a, but fails due to the side effect on var b = getValue().
  2. It then attempts to fuses b, and succeeds.
  3. Because something has changed, the algorithm re-enters and then successfully fuses a.

While this doesn't look particularly bad at first glance, it scales really poorly. The problem is that every additional variable causes an additional pass to be run over the entire expression. This is facilitated by the transformer generating lots of temporary variables which are then fused together.

I'm thinking that forward searching is generally the right approach for this algorithm, but subsequent variables should first be "collected" and then processed in reverse order.

PXshadow commented 12 months ago

Here is a comparison for 1.11.0 and 1.13.0 the issue still persists, and luckily none of the other metrics seemed to have regressed. hl 1.11.0

name                               | time(s) |   % |  p% |       # | info
--------------------------------------------------------------------
analyzer                           |   4.041 |  28 |  28 | 1452474 | 
generate                           |   2.699 |  18 |  18 |   21204 | 
  hl                               |   2.699 |  18 | 100 |   21204 | 
    opt                            |   1.035 |   7 |  38 |   21202 | 
    write                          |   0.205 |   1 |   8 |       1 | 
typing                             |   2.273 |  16 |  16 |    5689 | 
macro                              |   2.047 |  14 |  14 |    7159 | 
  execution                        |   1.416 |  10 |  69 |    4753 | 
    toInterface                    |   0.647 |   4 |  46 |    1654 | stdgo.Go
    setRef                         |   0.322 |   2 |  23 |     701 | stdgo.Go
    asInterface                    |   0.205 |   1 |  14 |     463 | stdgo.Go
    cfor                           |   0.061 |   0 |   4 |     307 | stdgo.Go
    pointer                        |   0.055 |   0 |   4 |     751 | stdgo.Go
    str                            |   0.042 |   0 |   3 |     509 | stdgo.Go
    typeAssert                     |   0.034 |   0 |   2 |      64 | stdgo.Go
    controlFlow                    |   0.022 |   0 |   2 |      20 | stdgo.internal.Macro
    typeEquals                     |   0.005 |   0 |   0 |      43 | stdgo.Go
    copySlice                      |   0.003 |   0 |   0 |      61 | stdgo.Go
    less                           |   0.002 |   0 |   0 |      45 | stdgo.cmp._Cmp.Cmp_Fields_
    _parseRFC3339                  |   0.002 |   0 |   0 |       4 | stdgo.time._Time.Time_Fields_
    _partitionOrdered              |   0.001 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
    _partialInsertionSortOrdered   |   0.001 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
    _pdqsortOrdered                |   0.001 |   0 |   0 |       9 | stdgo.slices._Slices.Slices_Fields_
    _siftDownOrdered               |   0.001 |   0 |   0 |       6 | stdgo.slices._Slices.Slices_Fields_
    _parseNanoseconds              |   0.001 |   0 |   0 |       5 | stdgo.time._Time.Time_Fields_
    defaultValue                   |   0.001 |   0 |   0 |      22 | stdgo.Go
    _partitionEqualOrdered         |   0.001 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
  typing                           |   0.360 |   2 |  18 |     104 | 
    sort                           |   0.117 |   1 |  33 |       1 | stdgo.slices._Slices.Slices_Fields_
    init                           |   0.115 |   1 |  32 |       1 | stdgo.internal.Macro
    define                         |   0.102 |   1 |  28 |       1 | haxe.macro.Compiler
    _parseRFC3339                  |   0.017 |   0 |   5 |       1 | stdgo.time._Time.Time_Fields_
    less                           |   0.004 |   0 |   1 |       1 | stdgo.cmp._Cmp.Cmp_Fields_
    store                          |   0.003 |   0 |   1 |       1 | stdgo.sync.atomic.Pointer__static_extension
  afterTyping                      |   0.083 |   1 |   4 |       2 | 
  add_types                        |   0.080 |   1 |   4 |     104 | 
  jit                              |   0.076 |   1 |   4 |    2091 | 
  flush                            |   0.032 |   0 |   2 |     104 | 
filters                            |   1.954 |  13 |  13 |       8 | 
parsing                            |   1.542 |  11 |  11 |     376 | 
finalize                           |   0.053 |   0 |   0 |       1 | 
null safety                        |   0.002 |   0 |   0 |       1 | 
--------------------------------------------------------------------
total                              |  14.612 | 100 | 100 | 1486913 |

hl 1.13.0

name                               | time(s) |   % |  p% |       # | info
--------------------------------------------------------------------
analyzer                           |  51.523 |  82 |  82 | 3385513 | 
generate                           |   3.422 |   5 |   5 |   21467 | 
  hl                               |   3.422 |   5 | 100 |   21467 | 
    opt                            |   1.380 |   2 |  40 |   21465 | 
    write                          |   0.210 |   0 |   6 |       1 | 
typing                             |   2.726 |   4 |   4 |    5689 | 
macro                              |   2.170 |   3 |   3 |    7159 | 
  execution                        |   1.447 |   2 |  67 |    4753 | 
    toInterface                    |   0.634 |   1 |  44 |    1654 | stdgo.Go
    setRef                         |   0.309 |   0 |  21 |     701 | stdgo.Go
    asInterface                    |   0.216 |   0 |  15 |     463 | stdgo.Go
    cfor                           |   0.068 |   0 |   5 |     307 | stdgo.Go
    pointer                        |   0.056 |   0 |   4 |     751 | stdgo.Go
    typeAssert                     |   0.050 |   0 |   3 |      64 | stdgo.Go
    str                            |   0.045 |   0 |   3 |     509 | stdgo.Go
    controlFlow                    |   0.025 |   0 |   2 |      20 | stdgo.internal.Macro
    _pdqsortOrdered                |   0.013 |   0 |   1 |       9 | stdgo.slices._Slices.Slices_Fields_
    typeEquals                     |   0.006 |   0 |   0 |      43 | stdgo.Go
    less                           |   0.003 |   0 |   0 |      45 | stdgo.cmp._Cmp.Cmp_Fields_
    copySlice                      |   0.002 |   0 |   0 |      61 | stdgo.Go
    _medianAdjacentOrdered         |   0.002 |   0 |   0 |       9 | stdgo.slices._Slices.Slices_Fields_
    _choosePivotOrdered            |   0.002 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
    _partitionOrdered              |   0.001 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
    _parseRFC3339                  |   0.001 |   0 |   0 |       4 | stdgo.time._Time.Time_Fields_
    _order2Ordered                 |   0.001 |   0 |   0 |       9 | stdgo.slices._Slices.Slices_Fields_
    _partialInsertionSortOrdered   |   0.001 |   0 |   0 |       3 | stdgo.slices._Slices.Slices_Fields_
    _siftDownOrdered               |   0.001 |   0 |   0 |       6 | stdgo.slices._Slices.Slices_Fields_
  typing                           |   0.361 |   1 |  17 |     104 | 
    sort                           |   0.145 |   0 |  40 |       1 | stdgo.slices._Slices.Slices_Fields_
    init                           |   0.111 |   0 |  31 |       1 | stdgo.internal.Macro
    define                         |   0.078 |   0 |  22 |       1 | haxe.macro.Compiler
    _parseRFC3339                  |   0.018 |   0 |   5 |       1 | stdgo.time._Time.Time_Fields_
    less                           |   0.004 |   0 |   1 |       1 | stdgo.cmp._Cmp.Cmp_Fields_
    store                          |   0.002 |   0 |   1 |       1 | stdgo.sync.atomic.Pointer__static_extension
  afterTyping                      |   0.136 |   0 |   6 |       2 | 
  add_types                        |   0.099 |   0 |   5 |     104 | 
  jit                              |   0.089 |   0 |   4 |    2091 | 
  flush                            |   0.038 |   0 |   2 |     104 | 
filters                            |   1.769 |   3 |   3 |       8 | 
parsing                            |   1.485 |   2 |   2 |     377 | 
finalize                           |   0.050 |   0 |   0 |       1 | 
null safety                        |   0.002 |   0 |   0 |       1 | 
--------------------------------------------------------------------
total                              |  63.147 | 100 | 100 | 3420216 |
Simn commented 12 months ago

Thanks for the update! I spent some time on an approach that would batch-fuse locals, but this didn't work out unfortunately. Will have to revisit this again and think of a different way to handle this.