beacon-biosignals / Legolas.jl

Tables.jl-friendly mechanisms for constructing, reading, writing, and validating Arrow tables against extensible, versioned, user-specified schemas.
Other
24 stars 2 forks source link

Provide nice error messages when constructing record fields #83

Closed omus closed 1 year ago

omus commented 1 year ago

Fixes #44 but we'll probably want to spawn a new issue for https://github.com/beacon-biosignals/Legolas.jl/issues/44#issuecomment-1292609205. A quick example of this in action:

julia> using Legolas: Legolas, @schema, @version

julia> @schema "test.field-error" FieldError

julia> function _validate(x)
           x in ("a", "b", "c") || throw(ArgumentError("Must be a, b, or c"))
           return x
       end
_validate (generic function with 1 method)

julia> @version FieldErrorV1 begin
           a::Union{String,Missing} = Legolas.lift(_validate, a)
           b::(<:Union{String,Missing}) = Legolas.lift(_validate, b)
           c::Union{Integer,Missing}
           d::(<:Union{Integer,Missing})
       end

julia> FieldErrorV1(; a="invalid")
ERROR: ArgumentError: Invalid value set for field `a` ("invalid")
Stacktrace:
 [1] FieldErrorV1(; a::String, b::Missing, c::Missing, d::Missing)
   @ Main ~/.julia/dev/Legolas/src/schemas.jl:488
 [2] top-level scope
   @ REPL[5]:1

caused by: ArgumentError: Must be a, b, or c
Stacktrace:
 [1] _validate
   @ ./REPL[3]:2 [inlined]
 [2] lift
   @ ~/.julia/dev/Legolas/src/lift.jl:12 [inlined]
 [3] FieldErrorV1(; a::String, b::Missing, c::Missing, d::Missing)
   @ Main ~/.julia/dev/Legolas/src/schemas.jl:511
 [4] top-level scope
   @ REPL[5]:1

I ended up going with this stacked exception approach as it allows for field constructors to easily report issues without having to include the additional context of the name of the field or the value itself.

Update: Modified example to show current exception output


ericphanson commented 1 year ago

In the example,

ERROR: ArgumentError: Invalid value set for field a, expected Union{Missing, String}, got a value of type String ("invalid")

feels a little misleading because "expected" seems to imply it got something unexpected, whereas in this case it did get the expected type.

omus commented 1 year ago

In the example,

ERROR: ArgumentError: Invalid value set for field a, expected Union{Missing, String}, got a value of type String ("invalid")

feels a little misleading because "expected" seems to imply it got something unexpected, whereas in this case it did get the expected type.

Reasonable. I can do some revision there.

omus commented 1 year ago

I can reproduce the error locally now that I've merged against main

omus commented 1 year ago

This PR uncovered that https://github.com/beacon-biosignals/Legolas.jl/pull/82 introduced a very subtle bug where if you are using a parametric schema and when you attempt to construct a record where the value of a parametric field is not an expected subtype you get an error like this:

TypeError: in FieldErrorV1, in 326, expected var\"326\"<:Union{Missing, Integer}, got Type{Float64

Instead of the nice ones generated in this PR. I've reverted the change from #82 here as it was just a minor bit of cleanup. I do think I should add a test explicitly for this problem but I'll do that in another PR.

omus commented 1 year ago

I looked into the coverage reduction and it appears that the $fcatch lines are not being marked as lines which have been tested. As all of the tests I added here are about testing those lines I think we can safely ignore the coverage drop

omus commented 1 year ago

Merging in main uncovered and issue with automatic conversion when parametric fields are used. I'll try to fix that

omus commented 1 year ago

Fixed the conversion issue but found something else I'll want to address in a follow up:

julia> using Legolas: Legolas, @schema, @version

julia> @schema "test.demo" Demo

julia> @version DemoV1 begin
           b::(<:Union{Integer,Missing})
       end

julia> DemoV1{Int}(b=1.0)  # Used to fail with the changes in this PR
DemoV1{Int64}: (b = 1,)

julia> DemoV1{Float64}  # Use of gensym makes this ugly
ERROR: TypeError: in DemoV1, in 312, expected var"312"<:Union{Missing, Integer}, got Type{Float64}
Stacktrace:
 [1] top-level scope
   @ REPL[6]:1

In #81 when I switched this code to using gensym I made this error less readable. Unfortunately as we allow people to write custom code for the field constructors even if we use _$(field_name_T as the type parameter it's possible for user code to reassign these parameters.

Personally, I'm inclined to remove the parametric record constructors as I'm not sure when they would ever be used but that is definitely a breaking change we can deal with later.

kleinschmidt commented 1 year ago

Use of gensym makes this ugly

Could you use gensym with the fieldname as a string or something like that to make a slightly more readable generated symbol

omus commented 1 year ago

Could you use gensym with the fieldname as a string or something like that to make a slightly more readable generated symbol

We were using gensym and passing in the field name. Something else is going on:

julia> T = gensym("foo")
Symbol("##foo#312")

julia> ex = quote
           struct Foo{$T <: Integer}
               x::$T
           end
       end
quote
    #= REPL[2]:2 =#
    struct Foo{var"##foo#312" <: Integer}
        #= REPL[2]:3 =#
        x::var"##foo#312"
    end
end

julia> eval(ex)

julia> Foo{String}
ERROR: TypeError: in Foo, in 312, expected var"312"<:Integer, got Type{String}
Stacktrace:
 [1] top-level scope
   @ REPL[4]:1

julia> Meta.dump(ex)
Expr
  head: Symbol block
  args: Array{Any}((2,))
    1: LineNumberNode
      line: Int64 2
      file: Symbol REPL[2]
    2: Expr
      head: Symbol struct
      args: Array{Any}((3,))
        1: Bool false
        2: Expr
          head: Symbol curly
          args: Array{Any}((2,))
            1: Symbol Foo
            2: Expr
              head: Symbol <:
              args: Array{Any}((2,))
                1: Symbol ##foo#312
                2: Symbol Integer
        3: Expr
          head: Symbol block
          args: Array{Any}((2,))
            1: LineNumberNode
              line: Int64 3
              file: Symbol REPL[2]
            2: Expr
              head: Symbol ::
              args: Array{Any}((2,))
                1: Symbol x
                2: Symbol ##foo#312

Reported this as a Julia bug: https://github.com/JuliaLang/julia/issues/49335