ballerina-platform / ballerina-lang

The Ballerina Programming Language
https://ballerina.io/
Apache License 2.0
3.57k stars 738 forks source link

[Task]: Support `key-constraint` in table semtype #42657

Closed lochana-chathura closed 3 weeks ago

lochana-chathura commented 4 months ago

Description

$subject.

In the current nBallerina table semtype implementation key-constraint is not supported. Ref: https://github.com/ballerina-platform/ballerina-lang/pull/42639#discussion_r1582795300

lochana-chathura commented 2 months ago

In general, table<C1> key(KS1)/key<KC1> <: table<C2> key(KS2)/key<KC2> iff table<C1> <: table<C2> & KeySemType1 <: KeySemType2

Keyless table table<C> can be modeled as C[]. Thefore the keyed-table can be modeled using a tuple [C[], KST]

How do we model KST?

Basically, we encounter 4 scenarios when performing subtype checking.

Sincetable<C> key(KS)/key<KC> <: table<C>, in all cases, KS = undef and KC=never/undef, KST has to be modeled as ANYDATA(key's top type) or ANY(simpler).

Case 1: key(f'1, f'2, ...) <: key(f"1, f"2, ...) iff KST1 <: KST2

where KST is either record {|T1 f1; T2 f2; ...; any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

Case 2: key<KC1> <: key<KC2> iff KST1 <: KST2

where KST = KC. This is straightforward.

Case 3: key(f1, f2, ...) <: key<KC> iff KST1 <: KST2

KST1 is either record {|T1 f1; T2 f2; ... any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

KST2? To hold the subtyping KC has to be a tuple where [T1, T2, ...,] is a subtype. So, if KC is a tuple. KST2 can be modeled as [[T'1, any], [T'2, any], ..., [any, any]...], else KST = KC

Case 4: key<KC> <: key(f1, f2, ..) iff KST1 <: KST2

KST2 is either record {|T1 f1; T2 f2; ... any...;|} or [[T1, "f1"], [T2, "f2"], ..., [any, string]...]

KST1? If KC is a tuple it is modeled as [[T1, "f1"], [T2, "f2"], ..., [any, string]...], else KST1=KC

lochana-chathura commented 1 month ago
// CASE 1: 

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

type Employee2 record {|
    readonly [string, string] name;
    int salary;
|};

public function case1() {
    table<Employee> key<[string]> t1 = table key(name)[
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ]; // ALLOWED

    table<Employee2> key<[string, string]> t3 = table key(name)[
        {name: ["John", ""], salary: 100},
        {name: ["Jane", ""],salary: 200}
    ]; // NOT ALLOWED.

    table<Employee2> key<[[string, string]]> t2 = table key(name)[
        {name: ["John", ""], salary: 100},
        {name: ["Jane", ""],salary: 200}
    ]; // ALLOWED.
}

// Conclusion: KC being a tuple has special meaning. Tuple fixed member size = no of items in composite key
lochana-chathura commented 1 month ago
// CASE 2:

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

public function case2() {
    table<Employee> key<[string]> a = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ];

    table<Employee> key<string> a1 = a; // NOT ALLOWED. BUG?

    table<Employee> key<string> b = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "",salary: 200}
    ];

    table<Employee> key<[string]> b1 = b; // NOT ALLOWED. BUG?
}
lochana-chathura commented 1 month ago
// CASE 3:

type Employee record {|
    readonly string name;
    readonly string department;
    int salary;
|};

public function case3() {
    table<Employee> key(name, department) a = table key(name, department) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key(name) a1 = a; // NOT ALLOWED. BUG?

    table<Employee> key<[string, string]> b = table key(name, department) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key<string> b1 = b; // NOT ALLOWED. BUG? 
    table<Employee> key<[string]> b2 = b; // NOT ALLOWED. BUG? 

    table<Employee> key(name) c = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key(name, department) c1 = c; // NOT ALLOWED. OK.

    table<Employee> key<[string]> d = table key(name) [
        {name: "John", department: "", salary: 100},
        {name: "Jane", department: "", salary: 200}
    ];

    table<Employee> key<[string, string]> d1 = d; // NOT ALLOWED. OK.
}
lochana-chathura commented 1 month ago

CASE III is not a BUG. e.g.

table<Employee> key(name, department) a = table key(name, department) [
    {name: "John", department: "IT", salary: 100},
    {name: "John", department: "finance", salary: 200}
];

This being assignable to table<Employee> key(name) would not make the name a unique key.

lochana-chathura commented 1 month ago

CASE III is not a BUG. e.g.

Based on the above, the new model would be,

In general, table<C1> key(KS1)/key<KC1> <: table<C2> key(KS2)/key<KC2> iff table<C1> <: table<C2> & KeySemType1 <: KeySemType2

Keyless table table<C> can be modeled as C[]. Thefore the keyed-table can be modeled using a tuple [C[], KST]

How do we model KST?

Basically, we encounter 4 scenarios when performing subtype checking.

Sincetable<C> key(KS)/key<KC> <: table<C>, in all cases, KS = undef and KC=never/undef, KST has to be modeled as ANYDATA(key's top type) or ANY(simpler).

Case 1: key(f'1, f'2, ...) <: key(f"1, f"2, ...) iff KST1 <: KST2

where KST is either record {|T1 f1; T2 f2; ...;|} or [[T1, "f1"], [T2, "f2"], ...]

Case 2: key<KC1> <: key<KC2> iff KST1 <: KST2

where KST = KC. This is straightforward.

Case 3: key(f1, f2, ...) <: key<KC> iff KST1 <: KST2

KST1 is either record {|T1 f1; T2 f2; ...; |} or [[T1, "f1"], [T2, "f2"], ...]

KST2?. If KC is a tuple. KST2 can be modeled as [[T'1, any], [T'2, any], ...], else KST = KC

Case 4: key<KC> <: key(f1, f2, ..) iff KST1 <: KST2

KST2 is either record {|T1 f1; T2 f2; ...;|} or [[T1, "f1"], [T2, "f2"], ...]

KST1? If KC is a tuple it is modeled as [[T'1, "f1"], [T'2, "f2"], ....], else KST1=KC

lochana-chathura commented 1 month ago

Documented summary can be found here.

lochana-chathura commented 1 month ago

Based on ballerina-platform/ballerina-spec#1310, revised table semtype model can be found here.

lochana-chathura commented 3 weeks ago

Closing with #43092