bhftbootcamp / Serde.jl

Serde is a Julia library for (de)serializing data to/from various formats. The library offers a simple and concise API for defining custom (de)serialization behavior for user-defined types
Apache License 2.0
48 stars 9 forks source link

Skip empty values handling when fields cannot be nothing #44

Closed NeroBlackstone closed 3 months ago

NeroBlackstone commented 5 months ago

Pull request checklist

Related issue

Hi! I read the source code and it's easy to understand. I realized that we don't need to apply empty values ​​handling specifically for specified fields, just skip those non-nullable fields during empty values ​​handling.

codecov-commenter commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 71.83%. Comparing base (85e37b4) to head (839e035).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #44 +/- ## ========================================== + Coverage 71.78% 71.83% +0.05% ========================================== Files 24 24 Lines 1056 1058 +2 ========================================== + Hits 758 760 +2 Misses 298 298 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dmitrii-doronin commented 5 months ago

Hello, again! nulltype does not necessary mean nothing or missing. You can overload this function to achieve behaviour like 0.00001 (very small float) ->(after deser)-> 0. Does it still support that case?

NeroBlackstone commented 5 months ago

Hello, again! nulltype does not necessary mean nothing or missing. You can overload this function to achieve behaviour like 0.00001 (very small float) ->(after deser)-> 0. Does it still support that case?

Hello! I'm not sure I understand exactly what you mean. If possible, please provide a few specific test cases, thank you!

dmitrii-doronin commented 4 months ago

Hi, @NeroBlackstone! I've been very busy lately but here's a sample I frequently see in line of my work. Does your code cover this case?

using Serde

const Maybe{T} = Union{Nothing, T}

struct Test 
    b::Maybe{String}
end

json = """{"b": ""}"""

Serde.isempty(::Type{Test}, x::String) = isempty(x)

Serde.deser_json(Test, json)

Maybe{String} <: Nothing # false
gryumov commented 4 months ago

@NeroBlackstone, hi, i expect that such a case should not be deserialized if we perform a check using ((Nothing <: fieldtype) || (Missing <: fieldtype)) && isempty(T, value), we won't encounter an error and the value will simply be NA.

using Serde

struct Foo 
    b::String
end

json = """{"b": "NA"}"""

Serde.isempty(::Type{Foo}, x::String) = x == "NA"

julia> Serde.deser_json(Foo, json)
ERROR: ParamError: parameter 'b::String' was not passed or has the value 'nothing'

But I understand your problem and we need to solve it differently.

gryumov commented 4 months ago

Hey @NeroBlackstone, what do you think about this?☝️

NeroBlackstone commented 4 months ago

Sorry for the delayed reply.

I think I still don't understand the expected behavior we want:

If there is a non-empty field in the type to be deserialized, and we define Serde.isempty for field type, should the non-empty field be serialized correctly, or a ParamError be thrown?

using Serde

struct Foo 
    b::String
end

json = """{"b": "NA"}"""

Serde.isempty(::Type{Foo}, x::String) = x == "NA"

julia> Serde.deser_json(Foo, json)
ERROR: ParamError: parameter 'b::String' was not passed or has the value 'nothing'

I tested this code on Serde 3.0.4, and the output is consistent with yours. Is this the expected behavior we want?

If the above code is the expected behavior, then:

struct Foo2
           zero::Int
           cant_zero::Union{Nothing,Int}
end
function Serde.isempty(::Type{Foo2}, x::Int)
           return x == 0
end
julia> Serde.deser(Foo2,Dict("zero"=>0,"cant_zero"=>0))
ParamError: parameter 'zero::Int64' was not passed or has the value 'nothing'

This code also conforms to the expected behavior.

But obviously, once we define Serde.isempty for some field type, ALL fields that have same type will apply this rule.

Should we allow correct deserialization (like having the b field of the Foo type above be assigned "NA"), or restrict Serde.isempty to specific fields?

NeroBlackstone commented 4 months ago

I think specifying fields is better than ignoring fields.

So we can apply different empty handling for the same field type.

For example:

struct Foo
        a::Union{String, Nothing} # "a" will convert to nothing 
        b::Union{String, Nothing} #  "b" will convert to nothing  
end

I think it's hard to implement because I don't know how to pass fields name when define Serde.isempty

A good choice is to let Serde.isempty return a Dict, key is field name we want to handle, value is anonymous function how we implement.

Serde.isempty(::type{Foo})
        return Dict(:a=> x->x=="a", :b => x->x=="b")
end

Serde invoke this function when deser to types.

gryumov commented 4 months ago

@NeroBlackstone If you add such logic, isempty could potentially allow a nullvalue to pass through, which might not be safe.

using Serde

function Serde.deser(
    ::Serde.CustomType,
    ::Type{D},
    data::AbstractDict{K,V},
)::D where {D<:Any,K<:Union{AbstractString,Symbol},V<:Any}
    vals = Any[]

    for (type, name) in zip(Serde._field_types(D), fieldnames(D))
        key = Serde.custom_name(D, Val(name))
        val = get(data, K(key), Serde.default_value(D, Val(name)))
        val = if isnothing(val) ||
             ismissing(val) ||
             (((Nothing <: type) || (Missing <: type)) && isempty(T, value))
            Serde.nulltype(type)
        else
            val
        end
        push!(vals, Serde.eldeser(D, type, key, val))
    end

    return D(vals...)
end

struct Foo 
    b::String
end

json = """{"b": "NA"}"""

Serde.isempty(::Type{Foo}, x::String) = x == "NA"

julia> Serde.deser_json(Foo, json)
Foo("NA")
gryumov commented 4 months ago

BoundTypes essentially serves as an extended configuration of isempty. @NeroBlackstone what do you think?

using Serde
using BoundTypes

struct Foo
    a::StringMinLength{String,1}
    b::String
end

julia> Serde.deser(Foo, Dict("a" => "dsds", "b" => ""))
Foo("dsds", "")

julia> Serde.deser(Foo, Dict("a" => "", "b" => ""))
ERROR: ConstraintError: constraints of type StringMinLength violated.
Wrong value: length of the "" must be at least 1 characters (0).
NeroBlackstone commented 3 months ago

BoundTypes essentially serves as an extended configuration of isempty. @NeroBlackstone what do you think?

BoundTypes is a great solution to add constraints to fields of a type. But I think it only solves part of the problem.

My question is, what's the solution to correctly deserialize the Foo type for the following simple use case:

using Serde

struct Foo
                  nonull_field1::Int
                  nonull_field2::Int
                  could_null::Union{Nothing,Int}
end

function Serde.isempty(::Type{Foo}, x::Int)
                  return x == 0
end

julia> Serde.deser(Foo,Dict("nonull_field1"=>0,"nonull_field2"=>0,"could_null"=>0))

# In Serde 3.1.0 :
ERROR: ParamError: parameter 'nonull_field1::Int64' was not passed or has the value 'nothing'

# Ideal result
Foo(0, 0, nothing)

Or we should not deserialize such cases correctly?

gryumov commented 3 months ago

I still think that isempty is a property of the type, not the field

using Serde

struct Foo
    nonull_field1::Int
    nonull_field2::Int
    could_null::Union{Nothing,Int}

    function Foo(
        nonull_field1::Int, 
        nonull_field2::Int, 
        could_null::Union{Nothing,Int}
    )
        return new(nonull_field1, nonull_field2, could_null == 0 ? nothing : could_null)
    end
end

julia> Serde.deser(Foo, Dict("nonull_field1" => 0, "nonull_field2" => 0, "could_null" => 0))
Foo(0, 0, nothing)
gryumov commented 3 months ago

@NeroBlackstone, I think we should close the PR for now since there's no clear solution yet

NeroBlackstone commented 3 months ago

@NeroBlackstone, I think we should close the PR for now since there's no clear solution yet

🫡Agree, feel free to close this PR. Thank you for your review and times.

gryumov commented 3 months ago

@NeroBlackstone, I think we should close the PR for now since there's no clear solution yet

🫡Agree, feel free to close this PR. Thank you for your review and times.

If you have any real-life cases to discuss in the future, I’m always ready. Thank you.