Closed — FelixNoessler closed this issue 8 months ago
The docs above refer to callers of the interface. The thread safety of your log density implementation is orthogonal to this.
That said, if you want to use pre-allocated buffers inside your log density implementation, you could use a `struct` that holds the buffers, and call `ADgradient` on a deep copy of it for each thread, so that each thread gets an independent structure.

But note that for larger dimensions, reverse mode (e.g. Enzyme) is much more efficient and has no similar issues.
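A minimal sketch of that pattern, with purely illustrative names (`MyLogDensity` and its buffer are assumptions, not from any package): the log density type owns its scratch space, and one deep copy per thread guarantees no two threads share a buffer.

```julia
# Illustrative log density type that owns a pre-allocated scratch buffer.
struct MyLogDensity{T}
    buffer::Vector{T}   # scratch space, reused across calls
end

# Calling the object evaluates a toy "log density" using the buffer.
(ℓ::MyLogDensity)(θ) = (ℓ.buffer .= θ .* 2; sum(ℓ.buffer))

ℓ = MyLogDensity(zeros(3))

# One independent deep copy per thread: each copy gets its own buffer,
# so concurrent evaluations cannot clobber each other's scratch space.
copies = [deepcopy(ℓ) for _ in 1:Threads.nthreads()]
```

In the real setting you would wrap each copy with `ADgradient` instead of calling it directly; the point is only that `deepcopy` duplicates the buffers along with the struct.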
Thanks for your answer!

I have a function that preallocates all vectors, and then I create buffers for ForwardDiff and for normal evaluations:
```julia
function preallocate(; Tdiff)
    normal = preallocate_vectors(; T = Float64)
    diff = preallocate_vectors(; T = Tdiff)
    return (; normal, diff)
end
```
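For context, `preallocate_vectors` might look something like the following sketch (the helper and its field names are hypothetical; only the idea of parameterizing the element type `T` is taken from the post):

```julia
# Hypothetical helper: allocate all scratch arrays once, with the element
# type T chosen to match either plain Float64 evaluation or ForwardDiff's
# Dual numbers.
function preallocate_vectors(; T)
    return (;
        state = Vector{T}(undef, 10),      # e.g. model state per time step
        residuals = Vector{T}(undef, 10),  # e.g. per-observation terms
    )
end
```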
For ForwardDiff I use the type of the gradient config (this was the missing step!):
```julia
∇ℓ = ADgradient(:ForwardDiff, ℓ; chunk = 5, x = init_x)

# Preallocation
prealloc = NamedTuple[]
for _ in 1:Threads.nthreads()
    c = preallocate(; Tdiff = eltype(∇ℓ.gradient_config))
    push!(prealloc, c)
end
```
and then I have a function to get the buffer depending on the type `T` of the parameter:
```julia
function get_buffer(buffer, T)
    if T <: ForwardDiff.Dual
        return buffer.diff
    elseif T <: Float64
        return buffer.normal
    end
end
```
and my loglikelihood function looks like this:
```julia
function my_loglik(θ)
    θ_type = eltype(θ)
    loglik_vec = Array{θ_type}(undef, length(training_plots))
    Threads.@threads for i in eachindex(training_plots)
        loglik_vec[i] = loglikelihood_model(θ;
            plotID = training_plots[i],
            buffer = get_buffer(prealloc[Threads.threadid()], θ_type))
    end
    return sum(loglik_vec)
end
```
Is this general approach good?
I have a follow-up question that is not related to the LogDensityProblemsAD.jl package: is it ok to just use `Threads.threadid()` to index the buffer? The loglikelihood evaluations do not depend on each other, so the order of the evaluations does not matter. I always get the same results, and the result matches the serial implementation, but the documentation says that users should avoid using `Threads.threadid()` to index buffers. Sorry for asking the question here.
https://docs.julialang.org/en/v1/manual/multi-threading/#man-task-migration
> After a task starts running on a certain thread it may move to a different thread if the task yields.
>
> ...
>
> This means that in most cases `threadid()` should not be treated as constant within a task, and therefore should not be used to index into a vector of buffers or stateful objects.
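One `threadid()`-free alternative, sketched here with hypothetical names, is to split the work into one chunk per task and hand each task its own buffer explicitly; since the buffer travels with the task, migration between threads cannot cause two tasks to share one buffer.

```julia
# Sketch: pair each task with its own buffer instead of indexing a
# shared buffer array by threadid(). Names are illustrative.
function threaded_sum(xs, ntasks)
    buffers = [zeros(1) for _ in 1:ntasks]   # one buffer per task, passed explicitly
    chunks = Iterators.partition(eachindex(xs), cld(length(xs), ntasks))
    tasks = map(zip(chunks, buffers)) do (chunk, buf)
        Threads.@spawn begin
            buf[1] = 0.0
            for i in chunk
                buf[1] += xs[i]              # each task writes only its own buffer
            end
            buf[1]
        end
    end
    return sum(fetch, tasks)
end
```

The same idea carries over to the likelihood above: spawn one task per chunk of `training_plots` and pass that task's preallocated named tuple as the `buffer` argument, instead of indexing `prealloc` by `Threads.threadid()`.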
Hello,
thanks for creating LogDensityProblems.jl! I saw in issue #6 that the automatic differentiation is not thread safe. In my implementation, it seems that it is now thread safe to use `ADgradient` with threads inside the log density (likelihood + prior) calculation. I have to run my simulation model for several sites with the same parameter set and calculate the total likelihood. However, I then get many allocations, and the garbage collection time is above 80 %.

Usually I preallocate many vectors in a named tuple to avoid allocations. Of course, if I just use my named tuple, I get an error from ForwardDiff.jl. I tried `GeneralLazyBufferCache` from PreallocationTools.jl to wrap my named tuple, but this does not seem to be thread safe. I used the old way of using threads with `threadid()` to get the preallocated named tuple for each thread; maybe the problem is there as well.
I am a bit confused. I think it would be nice to have a tutorial that shows how to use LogDensityProblems with preallocation (with PreallocationTools.jl?) and threads (maybe with FLoops.jl?).
Is the documentation up to date? It says that I should copy the gradient object. However, I want to use threads inside the calculation of the likelihood, so I am not sure if this applies to my case.
Best Felix