HaxeFoundation / haxe

Haxe - The Cross-Platform Toolkit
https://haxe.org
6.14k stars 656 forks source link

Exponential increase in generated code size due to array matching #11145

Open Aurel300 opened 1 year ago

Aurel300 commented 1 year ago

(Loose follow up to #6731)

Here's a (silly) way to find the winner of a tic-tac-toe board:

class Main {
  public static function main():Void {
    trace(matchBoard([[X, O, X], [X, O, X], [X, O, X]]));
  }

  static function matchBoard(board:Array<Array<Tile>>):Tile {
    return (switch (board) {
      case [[X, X, X], _, _]: X; // line 8
      case [_, [X, X, X], _]: X;
      case [_, _, [X, X, X]]: X;
      case [[X, _, _], [X, _, _], [X, _, _]]: X;
      case [[_, X, _], [_, X, _], [_, X, _]]: X;
      case [[_, _, X], [_, _, X], [_, _, X]]: X;
      case [[X, _, _], [_, X, _], [_, _, X]]: X;
      case [[_, _, X], [_, X, _], [X, _, _]]: X;
      case [[O, O, O], _, _]: O;
      case [_, [O, O, O], _]: O;
      case [_, _, [O, O, O]]: O;
      case [[O, _, _], [O, _, _], [O, _, _]]: O;
      case [[_, O, _], [_, O, _], [_, O, _]]: O;
      case [[_, _, O], [_, _, O], [_, _, O]]: O;
      case [[O, _, _], [_, O, _], [_, _, O]]: O;
      case [[_, _, O], [_, O, _], [O, _, _]]: O; // line 23
      case _: N;
    });
  }
}

enum Tile {
  X;
  O;
  N;
}

With the full example, the generated code mostly consists of that switch statement. I measured the size of the generated Javascript code (compiled with --dce full) when adding line 8 only, then line 8-9, then line 8-10, etc. The size increases very quickly:

 0:     3077 out.js
 1:     3468 out.js
 2:     4612 out.js
 3:     9524 out.js
 4:     9524 out.js
 5:    10517 out.js
 6:    12717 out.js
 7:    13489 out.js
 8:    14266 out.js
 9:    22991 out.js
10:    43741 out.js
11:    72983 out.js
12:    72983 out.js
13:   114628 out.js
14:   185129 out.js
15:   190342 out.js
16:   204627 out.js

Looking at the generated code I see unreasonable amounts of code like this:

                                default:
                                    if(_g2.length == 3) {
                                        var _g6 = _g2[1];
                                        var _g7 = _g2[2];
                                        switch(_g2[0]._hx_index) {
                                        case 0:
                                            if(_g6._hx_index == 0) {
                                                switch(_g7._hx_index) {
                                                case 0:
                                                    return Tile.X;
                                                case 1:
                                                    return Tile.X;
                                                default:
                                                    return Tile.X;
                                                }
                                            } else {
                                                return Tile.X;
                                            }
                                            break;
                                        case 1:
                                            if(_g6._hx_index == 1) {
                                                if(_g7._hx_index == 1) {
                                                    return Tile.O;
                                                } else {
                                                    return Tile.N;
                                                }
                                            } else if(_g7._hx_index == 1) {
                                                if(_g5._hx_index == 1) {
                                                    return Tile.O;
                                                } else {
                                                    return Tile.N;
                                                }
                                            } else {
                                                return Tile.N;
                                            }
                                            break;
                                        default:
                                            if(_g7._hx_index == 1) {
                                                if(_g5._hx_index == 1) {
                                                    return Tile.O;
                                                } else {
                                                    return Tile.N;
                                                }
                                            } else {
                                                return Tile.N;
                                            }

Can we do better?

Simn commented 1 year ago

Can we do better?

Regarding code size, there are certainly strategies to reduce it. This wouldn't be "for free" though, especially not if we consider the time somehow would have to spend on improving this. I'm quite interested in this from an academic point of view, but at some point it becomes difficult to justify the time investment.

This would require completely reviewing how bindings (and state in general) is handled. The code generation would also have to change drastically, perhaps to something resembling a coroutine. If you have any knowledge to share in this regard, please let me know!

Something that makes all this quite unsatisfying is that any such solution would likely lead to degraded performance for common cases. Whether or not that's justifiable is difficult to say, and perhaps the compiler could even decide on a code generation scheme depending on various factors.