ballerina-platform / ballerina-lang

The Ballerina Programming Language
https://ballerina.io/
Apache License 2.0
3.68k stars 751 forks source link

[Bug]: Method too large error with few map field access lines of multi dimension maps #41052

Open gabilang opened 1 year ago

gabilang commented 1 year ago

Description

$title

Steps to Reproduce

import ballerina/test;

type Teacher record {
    string name;
    int age;
    string status;
    string batch;
    string school;
};

type Employee record {
    string name;
    string status;
    string batch;
};

type EmployeeDoubleMap map<map<Employee>>;

function stampRecordTypeMultiDimensionMap() {
    Teacher p1 = {name: "Raja", age: 25, status: "single", batch: "LK2014", school: "Hindu College"};
    Teacher p2 = {name: "Mohan", age: 30, status: "single", batch: "LK2014", school: "Hindu College"};
    map<Teacher> teacherMap = {"a": p1, "b": p2};
    map<map<Teacher>> multiMap = {"aa": teacherMap, "bb": teacherMap};
    map<map<Employee>> mapValue = checkpanic multiMap.cloneWithType(EmployeeDoubleMap);
    test:assertEquals(mapValue["aa"]["a"]["name"], "Raja");
    test:assertEquals(mapValue["aa"]["a"]["age"], 25);
    test:assertEquals(mapValue["aa"]["a"]["status"], "single");
    test:assertEquals(mapValue["aa"]["a"]["batch"], "LK2014");
    test:assertEquals(mapValue["aa"]["a"]["school"], "Hindu College");
    test:assertEquals(mapValue["aa"]["b"]["name"], "Mohan");
    test:assertEquals(mapValue["aa"]["b"]["age"], 30);
    test:assertEquals(mapValue["aa"]["b"]["status"], "single");
    test:assertEquals(mapValue["aa"]["b"]["batch"], "LK2014");
    test:assertEquals(mapValue["aa"]["b"]["school"], "Hindu College");
    test:assertEquals(mapValue["bb"]["a"]["name"], "Raja");
    test:assertEquals(mapValue["bb"]["a"]["age"], 25);
    test:assertEquals(mapValue["bb"]["a"]["status"], "single");
    test:assertEquals(mapValue["bb"]["a"]["batch"], "LK2014");
    test:assertEquals(mapValue["bb"]["a"]["school"], "Hindu College");
    test:assertEquals(mapValue["bb"]["b"]["name"], "Mohan");
    test:assertEquals(mapValue["bb"]["b"]["age"], 30);
    test:assertEquals(mapValue["bb"]["b"]["status"], "single");
    test:assertEquals(mapValue["bb"]["b"]["batch"], "LK2014");
    test:assertEquals(mapValue["bb"]["b"]["school"], "Hindu College");
    test:assertEquals(mapValue.length(), 2);
    test:assertEquals((typeof mapValue).toString(), "typedesc EmployeeDoubleMap");

    test:assertEquals(mapValue["aa"]["a"]["name"], "Raja");
    test:assertEquals(mapValue["aa"]["a"]["age"], 25);
    test:assertEquals(mapValue["aa"]["a"]["status"], "single");
    test:assertEquals(mapValue["aa"]["a"]["batch"], "LK2014");
    test:assertEquals(mapValue["aa"]["a"]["school"], "Hindu College");
    test:assertEquals(mapValue["aa"]["b"]["name"], "Mohan");
    test:assertEquals(mapValue["aa"]["b"]["age"], 30);
}

Affected Version(s)

No response

OS, DB, other environment details and versions

No response

Related area

-> Runtime

Related issue(s) (optional)

No response

Suggested label(s) (optional)

No response

Suggested assignee(s) (optional)

No response

HindujaB commented 1 year ago

For the below sample,

type Rec record {|
    string name;
|};

public function main() {
    Rec p1 = {name: "Raja"};
    map<Rec> OneDMap = {"a": p1};
    string? s1 = OneDMap["a"]["name"];
}

The main method BIR contains around 21 local vars and 26 basic blocks. This happens due to the desugaring we do on the map access expressions. For each access expression, we recursively add match clauses to check the return values to change the control flow for null values. This creates multiple if-else conditions in the byte code.

Before desugaring,

Screenshot 2023-09-07 at 16 48 50

After desugaring,

Screenshot 2023-09-07 at 16 52 11

The following is the BIR of the above code.

public main function() -> () {
    %0(RETURN) ();
    %1(LOCAL) Rec;
    %2(TEMP) typeDesc<any|error>;
    %4(TEMP) string;
    %5(TEMP) string;
    %6(LOCAL) map<Rec>;
    %11(LOCAL) string|();
    %12(SYNTHETIC) string|();
    %13(SYNTHETIC) Rec|();
    %17(SYNTHETIC) Rec|();
    %19(SYNTHETIC) boolean;
    %20(SYNTHETIC) boolean;
    %21(SYNTHETIC) any|error;
    %22(TEMP) boolean;
    %34(SYNTHETIC) boolean;
    %35(SYNTHETIC) boolean;
    %36(SYNTHETIC) any|error;
    %49(TEMP) Rec;
    %52(SYNTHETIC) boolean;
    %53(SYNTHETIC) boolean;
    %63(TEMP) ();

    bb0 {
        %2 = newType Rec;
        %4 = ConstLoad name;
        %5 = ConstLoad Raja;
        %1 = NewMap %2;
        %2 = newType map<Rec>;
        %4 = ConstLoad a;
        %6 = NewMap %2;
        %5 = ConstLoad a;
        %13 = %6[%5];
        %17 = %13;
        %22 = ConstLoad true;
        %22? bb1 : bb2;
    }
    bb1 {
        %20 = ConstLoad true;
        %21 = <any|error> %17;
        GOTO bb3;
    }
    bb2 {
        %20 = ConstLoad false;
        GOTO bb3;
    }
    bb3 {
        %20? bb4 : bb5;
    }
    bb4 {
        %19 = %21 is ();
        GOTO bb6;
    }
    bb5 {
        %19 = ConstLoad false;
        GOTO bb6;
    }
    bb6 {
        %19? bb7 : bb8;
    }
    bb7 {
        %12 = <string|()> %21;
        GOTO bb24;
    }
    bb8 {
        %22 = ConstLoad true;
        %22? bb9 : bb10;
    }
    bb9 {
        %35 = ConstLoad true;
        %36 = <any|error> %17;
        GOTO bb11;
    }
    bb10 {
        %35 = ConstLoad false;
        GOTO bb11;
    }
    bb11 {
        %35? bb12 : bb13;
    }
    bb12 {
        %34 = %36 is Rec;
        GOTO bb14;
    }
    bb13 {
        %34 = ConstLoad false;
        GOTO bb14;
    }
    bb14 {
        %34? bb15 : bb16;
    }
    bb15 {
        %49 = <Rec> %36;
        %4 = ConstLoad name;
        %5 = %49[%4];
        %12 = <string|()> %5;
        GOTO bb24;
    }
    bb16 {
        %22 = ConstLoad true;
        %22? bb17 : bb18;
    }
    bb17 {
        %53 = ConstLoad true;
        GOTO bb19;
    }
    bb18 {
        %53 = %17 is any;
        GOTO bb19;
    }
    bb19 {
        %53? bb20 : bb21;
    }
    bb20 {
        %52 = ConstLoad true;
        GOTO bb22;
    }
    bb21 {
        %52 = ConstLoad false;
        GOTO bb22;
    }
    bb22 {
        %52? bb23 : bb24;
    }
    bb23 {
        %63 = ConstLoad 0;
        %12 = <string|()> %63;
        GOTO bb24;
    }
    bb24 {
        %11 = %12;
        %0 = ConstLoad 0;
        GOTO bb25;
    }
    bb25 {
        return;
    }
}

Some of the branches are never executed here because the map access keys are from the required fields of the record. To reduce the number of instructions we generate, we need to optimally desugar it.

I tested improving the BIR Optimizer to reduce the redundant variable assignments. The optimization is not fixing the original problem even though it reduced the number of local vars slightly. At the BIR generation, we don't have control over the map access instruction to optimize it further.