Roger-luo / Configurations.jl

Options & Configurations made easy.
https://configurations.rogerluo.dev/stable
MIT License
80 stars 12 forks source link

Can we make from_dict more smart in dealt with parametric option struct? #78

Closed liuyxpp closed 2 years ago

liuyxpp commented 2 years ago

Say we have defined an option and two structs

struct A end
struct B end
@option struct MyConfig{T}
           a::T=A()
           b::Bool=true
end

And a dict

d = Dict("a"=>B(), "b"=>false)

If we parse the dict as

config = from_dict(MyConfig, d)

We got the following error

ERROR: ArgumentError: expect a concrete type, got MyConfig
Stacktrace:
 [1] from_dict(::Type{MyConfig}, d::Dict{String, Any}; kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ Configurations ~/.julia/packages/Configurations/1TQp9/src/from_dict.jl:35
 [2] from_dict(::Type{MyConfig}, d::Dict{String, Any})
   @ Configurations ~/.julia/packages/Configurations/1TQp9/src/from_dict.jl:34
 [3] top-level scope
   @ REPL[6]:1

However, the following code will succeed

julia> from_dict(MyConfig{typeof(d["a"])}, d)
MyConfig{B}(B(), false)

Can we make this process more automatic?

Another related issue is, after we serialized config to a YAML file. Read back the file and using from_dict directly will also trigger an error. Since B() is read in as a string instead of a struct. What is the best approach for reading struct that is not supported by YAML? A dirty approach I thought out is to use Meta.parse and then eval the parsed code and replace the original key value pair.

Roger-luo commented 2 years ago

This is a subtle thing and it is actually a feature. The reason why it should be a concrete type is because we will need the full type information to decide the type conversion behavior instead of just assigning the value to the fields which can be wrong.

You should define a custom type conversion for B, since Julia type is not native objects in YAML or TOML.

Roger-luo commented 2 years ago

Let me know if the doc is not clear on this https://configurations.rogerluo.dev/stable/convert/

liuyxpp commented 2 years ago

Sure, I do know the doc of conversion. However, it seems the approach is infeasible or nonscalable when there are many parametric arguments in the type definition. Say if we have two:

@option struct MyConfigNew{T, S}
    a::T=A()
    b::S=B()
end

Now I have to define the conversions for all possible types for b field as

Configurations.from_dict(::Type{MyConfigNew{A, Bool}}, ::Type{A}, s) = eval(Meta.parse(s))
Configurations.from_dict(::Type{MyConfigNew{A, Int}}, ::Type{A}, s) = eval(Meta.parse(s))
Configurations.from_dict(::Type{MyConfigNew{A, Float64}}, ::Type{A}, s) = eval(Meta.parse(s))
Configurations.from_dict(::Type{MyConfigNew{A, B}}, ::Type{A}, s) = eval(Meta.parse(s))
...
...
...

However, the type of b field might be unknown to the package I developed. This is exactly the purpose of letting its type be parametric. And My configurations are more complicated, it looks like

@option struct Config{AT, KT, T, NT, NT2, TMT, S}
    algo::AT=Euler()
    kwargs::KT=(;)
    min_λ::T=0.02
    λrate::Float64=0.9  # Reduce λ by λrate when restarting a simulation.
    atom_iter::Int=1  # number of iterations for IterationControl.Step.
    min_iter::Int=100  # minimum number of iterations
    max_iter::Int=2000  # maximum number of iterations
    skip_iter::Int=100  # number of iterations to skip for IterationControl
    max_restart::Int=1  # maximum number of restarting simulaiton.
    maxtime::Float64=0.5  # maximum time allowed for one simulation (in hour unit).
    norm::NT=x->norm(x,Inf)  # for each force
    norm2::NT2=mean  # for errors of forces
    relative::Bool=false  # is relative residual error?
    maxΔx::S=0.15  # minimum spatial grid resolution, in unit of Rg
    pow2Nx::Bool=false  # is number of spatial grid along one dimension a power of 2?
    tolmode::TMT=TolModeResidual()  # tol for F, residual, or incomp?
    tol::S=1e-5  # the target tolerance
    maxtol::S=1e-3  # the maximum allowed target tolerance, should >= tol
    dangertol::S=1.0  # when the error > dangertol, the simulation does not converge, meaning that the relaxation parameters are too large.
    k_slow::Int=100  # for SlowProgress control
    tol_slow::S=√eps(1.0)  # for SlowProgress control
    k_osc::Int=100  # for OscillatoryProgress control
    tol_osc::S=1.0  # for OscillatoryProgress control
    m_osc::Int=100  # for OscillatoryProgress control
    tolF::S=√eps(1.0) # for SlowProgress control
    numbest::Int=10  # number of iterations since best error: skip_iter * numbest
    numpatience::Int=10  # number of iterations allowed when error increases: skip_iter * numpatience
    use_log_control::Bool=false
    use_slow_control::Bool=true
    use_osc_control::Bool=true
    use_nsbest_control::Bool=false
    use_patience_control::Bool=false
end
Roger-luo commented 2 years ago

I don't see a way to get the right type info. I'm happy to see a PR solving this problem in general instead of your specific case.

But the real problem is YAML/JSON/TOML cannot represent a Julia type you have to implement your own encoding as string. Which is not a general available method.

You are mixing up the config semantics with Julia semantics here. E.g you can use Union or Enum to limit how string is parsed into a Julia type instead of arbitrary type parameter. The option type semantic is ONLY a subset of Julia in order to be compatible with markup languages.

But again if you have better solution I'm happy to review a PR.

Roger-luo commented 2 years ago

Also your implementation using eval is VERY dangerous please try not to implement parsing in this way. It is gonna be very easy to make the config file unsafe and execute arbitrary code. I just suggest you using either enum type, option type alias or custom conversion.

One example is you can change your algo to a enum type instead of using a type parameter.

Also just one personal suggestion if you have too much type parameters in one type it usually means something wrong with your abstraction. It would slow down any type of compiler processing your type in general. If you are able to split them into small types then one of the most powerful feature of this package is allowing you to define composable config types, which would make your life much easier.

liuyxpp commented 2 years ago

Thanks for your suggestions. I will close this issue.