beacon-biosignals / Legolas.jl

Tables.jl-friendly mechanisms for constructing, reading, writing, and validating Arrow tables against extensible, versioned, user-specified schemas.
Other
24 stars 2 forks source link

incorrect type assertions are accidentally imposed onto parent-derived fields in generated record constructor definitions #76

Closed jrevels closed 1 year ago

jrevels commented 1 year ago

Consider:

using Legolas

Legolas.@schema "unconstrained-field" UnconstrainedField

Legolas.@version UnconstrainedFieldV1 begin
    field::Any
end

Legolas.@schema "constrained-field" ConstrainedField

Legolas.@version ConstrainedFieldV1 > UnconstrainedFieldV1 begin
    field::Int = parse(Int, field)
end

ConstrainedFieldV1(field = "1")

This should return a ConstrainedFieldV1 instance where field == 1, but instead throws an error:

ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64

This is because the generated inner constructor definition for ConstrainedFieldV1 looks like this (note that you can inspect this via Legolas._generate_record_type_definitions(ConstrainedFieldV1SchemaVersion(), :ConstrainedFieldV1)):

function ConstrainedFieldV1(; field = missing)
    __parent__ = UnconstrainedFieldV1(; field)
    field = __parent__.field
    field::Int = parse(Int, field)
    return new(field)
end

If the problem here is not apparent, recall from the documented behavior of the :: operator:

In local scope, the syntax local x::T or x::T = expression declares that local variable x always has type T. When a value is assigned to the variable, it will be converted to type T by calling convert.

In other words, the above inner constructor definition becomes roughly equivalent to:

function ConstrainedFieldV1(; field = missing)
   local field::Int
    __parent__ = UnconstrainedFieldV1(; field)
    field = convert(Int, __parent__.field)::Int # convert/typeassert due to `local field::Int` declaration
    field = convert(Int, parse(Int, field))::Int # convert/typeassert due to `local field::Int` declaration
    return new(field)
end

Thus yielding the MethodError when convert(Int, __parent__.field) is invoked.

To put it more generically: Legolas is accidentally injecting wrong assertions about schema versions' parents' field types.

IIUC, to fix this, we can change Legolas' generated field assignments in record constructors from:

field::T = <statement>

to:

field = convert(T, <statement>)::T

which will not impose incorrect assertions on destructured parent-derived fields