Closed fr293 closed 2 years ago
import numpy as np
# rh is the RHEOS Python wrapper used elsewhere in this thread

def weight_function(data_smooth, time_period):
    # takes the smoothed data array and returns a weighting array of equal length
    tail_proportion = 0.1
    weight_slope = 1
    measured_time = rh.gettime(data_smooth)
    time_size = measured_time.size
    # calculate the base proportion excluding the initial 10 second experiment startup period
    startup_length = 10.0 / time_period
    # int() rather than np.int(), which is deprecated in recent NumPy
    base_proportion = int(np.ceil(((time_size - startup_length) * (1 - tail_proportion)) + startup_length))
    addendum_proportion = time_size - base_proportion
    # julia uses 1-indexing
    weights_base = np.arange(base_proportion) + 1
    addendum_multiplier = np.round((weight_slope * np.arange(addendum_proportion)) + 1.0)
    addendum_multiplier = addendum_multiplier.astype(int)
    index = base_proportion + 1
    for i in addendum_multiplier:
        weights_addendum = np.ones(i) * index
        weights_base = np.concatenate((weights_base, weights_addendum))
        index = index + 1
    weights_int = weights_base.astype(int)
    weights_list = weights_int.tolist()
    return weights_list
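For illustration, here is a minimal standalone sketch of the same weighting scheme that can be run without RHEOS (the time array is passed in directly as a stand-in for rh.gettime, and toy_weight_function is a hypothetical name):

```python
import numpy as np

def toy_weight_function(measured_time, time_period,
                        tail_proportion=0.1, weight_slope=1):
    """Standalone sketch of the weighting scheme above: uniform weights over
    the base region, then tail indices repeated with increasing multiplicity
    so the end of the experiment is weighted more heavily."""
    time_size = measured_time.size
    startup_length = 10.0 / time_period
    base_proportion = int(np.ceil(
        (time_size - startup_length) * (1 - tail_proportion) + startup_length))
    addendum_proportion = time_size - base_proportion
    # Julia uses 1-indexing, so the weights start at 1
    weights = list(np.arange(base_proportion) + 1)
    multipliers = np.round(
        weight_slope * np.arange(addendum_proportion) + 1.0).astype(int)
    index = base_proportion + 1
    for m in multipliers:
        weights.extend([index] * m)  # repeat the tail index m times
        index += 1
    return [int(v) for v in weights]

times = np.linspace(0, 100, 21)  # 21 samples at 5 s spacing
w = toy_weight_function(times, time_period=5.0)
# → [1, 2, ..., 21]: a short tail here, so each index appears once
```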
The above is currently the method that is used to generate the weighting, but you can use anything, including something as simple as
weights = rh.indexweight(data, elperiods = [...], time_boundaries = [...])
and it will still throw an error for the input type of the weights. I'm not completely sure how the Integer type works in Julia, but from my understanding, it should accept anything that is an integer subtype, such as Int64, so it's not clear why my bodge works.
Thanks Fergus. There is a problem with the way the type was defined in the function. Arrays should not really be defined as we did as the type hierarchy is not passed implicitly to the containers (see below). I think I fixed the problem with the second commit. Please try it out. I don't think this had anything to do with python, it's just that you picked it up.
julia> Vector{Int64} <: Vector{Integer}
false
julia> Int64 <: Integer
true
This version fixes the earlier merge error and includes a fix for the type error encountered when passing a vector of integers to the weight input of the function.
Please check it works for you. I'd be happy to merge with the repo's dev branch afterwards.
Everything appears to work as intended for me; weight arrays are accepted and timeouts work correctly. I think the best thing to do before merging would be to convert modelstepfit to the same timeout behaviour as well, since they are twin functions.
The functionality is tested (see unit tests) and works fine in a Julia-only world. However, yes, it needs to be written in a more generic way to fully enable multiple dispatch and so work with Python, for the reason Alexandre mentioned in the code example.
If you want timeouts on modelfit, and as you mentioned @fr293, on modelstepfit, you might want to add them on the frequency domain fitting functions as well.
@moustachio-belvedere that's a good spot! I'll replicate the implementation of timeouts to all four fitting functions.
Thanks @moustachio-belvedere. @ab2425 mentioned that she used maxeval rather than maxtime to forcefully stop the optimiser. It may be best to ensure that the process is reproducible; otherwise different machines and different runs would give different outcomes. I'd suggest implementing both, and using maxeval where possible.
@akabla we can implement both, but naturally only the first to be satisfied will have any effect on the computation!
From my perspective, when I'm setting a maxtime, I'm looking at reliability of the process rather than reproducibility of the outcome. I think these are slightly different use cases; I would only use the outcome of a timed-out optimisation as the input for further optimisation. By construction, something has gone wrong in an optimisation that hits its timeout, in a way that is not true for one terminated by a maximum evaluation count.
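To illustrate the point that both limits can coexist and whichever is satisfied first ends the run, here is a toy sketch (plain Python, not RHEOS or NLopt code; minimise and its behaviour are illustrative only):

```python
import time

def minimise(f, x0, step=0.1, maxeval=None, maxtime=None):
    """Toy descent loop with two optional stopping criteria. Whichever of
    maxeval/maxtime is satisfied first stops the run and is reported as the
    termination reason (NLopt's stopping conditions behave similarly)."""
    x, evals, start = x0, 0, time.monotonic()
    while True:
        if maxeval is not None and evals >= maxeval:
            return x, "maxeval"
        if maxtime is not None and time.monotonic() - start >= maxtime:
            return x, "maxtime"
        # one crude descent step: move to whichever neighbour lowers f
        candidates = [x - step, x, x + step]
        x = min(candidates, key=f)
        evals += len(candidates)

# maxeval triggers long before a generous 60 s maxtime does
x, reason = minimise(lambda x: (x - 3.0) ** 2, x0=0.0,
                     maxeval=100, maxtime=60.0)
# → reason == "maxeval", x near 3.0
```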
The functionality is tested (see unit tests) and works fine in a Julia-only world. However, yes, it needs to be written in a more generic way to fully enable multiple dispatch and so work with Python, for the reason Alexandre mentioned in the code example.
Hi @moustachio-belvedere, I'm not sure what was causing the problem, but even within Julia alone there were problems looming. I can't really understand how the Integer type works. I think the tests are OK because we cast the weights to the type Integer:
modelout = modelfit(data0, model, stress_imposed, p0=init_params, lo=(α=0.5, β=0.2), hi=(α=1.5, β=1.5), weights=collect(Integer, 1:length(t)))
The function indexweight also forces the type of the output to Integer:
indices = Integer[]
That is not intuitive, and it does matter for some reason:
julia> collect(Integer, 1:3)
3-element Vector{Integer}:
1
2
3
julia> collect(1:3)
3-element Vector{Int64}:
1
2
3
Functions that take Vector{Integer} don't accept Vector{Int64}:
julia> f(x::Vector{Integer}) = sum(x)
f (generic function with 1 method)
julia> f([1,2])
ERROR: MethodError: no method matching f(::Vector{Int64})
Closest candidates are:
f(::Vector{Integer}) at REPL[7]:1
What is really confusing is that scalars are behaving differently.
julia> a=1::Integer
1
julia> typeof(a)
Int64
julia> g(x::Integer)=2x
g (generic function with 1 method)
julia> g(2)
4
Weird.
even within Julia alone there were problems looming
No, because the functionality is designed specifically to be used with indexweight, which returns a Vector{Integer}, as you mentioned. The tests that do the casting are meant to test specific bits of functionality unconnected to indexweight, which is why the casts were necessary. See other tests that actually use indexweight for comparison: those tests do not do any casting.
I agree with you that it's not best practice, and the reason for that best practice has now clearly raised its head with the Python interface. As you have done, parameterising the input signature as weights::Vector{T} where T<:Integer for modelfit is the way to go.
As for indexweight, line https://github.com/JuliaRheology/RHEOS.jl/blob/master/src/processing.jl#L88 : Integer[] could be changed to Int[], or even UInt[]. That was the behaviour I intended: to let Julia decide on the appropriate integer type for a given system. See https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/#Integers , with particular reference to:
Julia also defines the types Int and UInt, which are aliases for the system's signed and unsigned native integer types respectively
@fr293 regarding maxtime vs maxeval, bear in mind that some fractional models take an order of magnitude more time per iteration than non-fractional models. So if you are fitting both fractional and non-fractional models to data programmatically, it may not be useful to specify a global maximum fitting time (though a global maximum number of iterations would be useful in that case).
@moustachio-belvedere I'm only using a maxtime parameter to put a hard limit on the optimisation, corresponding to how long I'm actually willing to wait for it to resolve before it moves on (safely). That timescale is determined independently of what is being optimised; it's much more to do with me than the function! It's a useful parameter to have when I am batch-automating many optimisations and one happens to have some aspects that the optimiser gets stuck on.
It's not convenient for me to estimate how many function evaluations will elapse during my 'boredom' timescale, and as you point out, that value may depend strongly on the details of the objective function.
I agree that it makes more sense to use maxeval in the normal way you might use a termination condition, but this case is more of an automated forced stop of an anomalous process to allow the rest of the program to proceed.
Bar the immediate emergence of bugs, or any objections to my solutions, this feature extension is finished!
uh-oh...
JULIA: UndefVarError: rel_tol not defined
Stacktrace:[1] leastsquares_init(params_init::Vector{Float64}, low_bounds::Vector{Float64}, hi_bounds::Vector{Float64}, modulus::FunctionWrappers.FunctionWrapper{Vector{Float64}, Tuple{Vector{Float64}, Vector{Float64}}}, time_series::Vector{Float64}, dt::Float64, prescribed_dot::Vector{Float64}, measured::Vector{Float64}; insight::Bool, constant_sampling::Bool, singularity::Bool, _rel_tol::Nothing, _rel_tol_f::Float64, indweights::Nothing, optmethod::Symbol, opttimeout::Int64, optmaxeval::Nothing)
@akabla are you sure that this is correct for the intended behaviour?
rel_tol_f::Union{Real,Nothing} = nothing, rel_tol::Union{Real,Nothing} = isnothing(rel_tol_f) ? 1e-4 : nothing
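For reference, the same "either/or" default pattern can be sketched in plain Python (hypothetical names; this is my reading of the intent of the Julia defaults above, namely that rel_tol gets its 1e-4 default only when rel_tol_f is not supplied):

```python
def fit(rel_tol_f=None, rel_tol=None):
    """Sketch of mutually-exclusive tolerance defaults: rel_tol falls back
    to 1e-4 only when neither tolerance is supplied, so passing rel_tol_f
    alone does not silently activate a second stopping test."""
    if rel_tol is None and rel_tol_f is None:
        rel_tol = 1e-4
    return rel_tol_f, rel_tol

fit()                  # → (None, 0.0001): default rel_tol kicks in
fit(rel_tol_f=1e-6)    # → (1e-06, None): rel_tol stays disabled
fit(rel_tol=1e-3)      # → (None, 0.001): explicit value wins
```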
I've corrected the typos that just emerged, but this leaves a wider issue: there's now a lot of name shadowing going on between model(step)fit and their helper functions leastsquares(step)init. This was dealt with using the underscore convention to denote the private variable name _rel_tol in the helper functions, but it was only done for that variable. I've stripped them all off until a consistent style to avoid shadowing can be decided.
Variable shadowing should not be too much of an issue since they mean the same thing and have the same value. They are simply passed from one function to the next. As you said, we did not do it for the other variables, so we'd rather avoid it for these two for now. Either way is fine though.
The code is now working, though not as expected: ftol is set as intended, but sometimes the optimiser finishes on an xtol flag when no argument is passed to the xtol kwarg, so something funny is going on.
I'm too fried to work out what this is at the moment; it's probably something trivial.
Is this OK to merge? Test failure - more specifically:
https://github.com/JuliaRheology/RHEOS.jl/runs/3701445556#step:7:271
got unsupported keyword argument "rel_tol"
I know. I preferred to fix this directly on the rheos repo's dev branch. Thanks for checking!
Modified the input type for weights in modelfit to accept Int64, which Python can produce. N.B. I'm unsure if this is the best way to go about fixing this, but it works for me.