TuringLang / MCMCChains.jl

Types and utility functions for summarizing Markov chain Monte Carlo simulations
https://turinglang.org/MCMCChains.jl/

chainscat error in combining single chains #427

Open bgctw opened 1 year ago

bgctw commented 1 year ago

When trying to combine several chains I get the following error:

command: chainscat(chns[1], chns[2])

ERROR: MethodError: Cannot `convert` an object of type 
  MCMCChains.Chains{Float64{},AxisArrays.AxisArray{Float64{},3,Array{Float64{},3},Tuple{AxisArrays.Axis{iter,UnitRange{Int64{}}},AxisArrays.Axis{var,Array{Symbol{},1}},AxisArrays.Axis{chain,UnitRange{Int64}}}},Missing{},NamedTuple{(:parameters, :internals, :u0),Tuple{Array{Symbol{},1},Array{Symbol{},1},Array{Symbol{},1}}},NamedTuple{(:start_time, :stop_time),Tuple{Array{Float64{},1},Array{Float64{},1}}}} to an object of type 
  MCMCChains.Chains{Float64{},AxisArrays.AxisArray{Float64{},3,Array{Float64{},3},Tuple{AxisArrays.Axis{iter,UnitRange{Int64{}}},AxisArrays.Axis{var,Array{Symbol{},1}},AxisArrays.Axis{chain,Vector{Int64}}}},Missing{},NamedTuple{(:parameters, :internals, :u0),Tuple{Array{Symbol{},1},Array{Symbol{},1},Array{Symbol{},1}}},NamedTuple{(:start_time, :stop_time),Tuple{Array{Float64{},1},Array{Float64{},1}}}}

Closest candidates are:
  convert(::Type{T}, ::T) where T
   @ Base Base.jl:64
  (::Type{MCMCChains.Chains{T, A, L, K, I}} where {T, A<:(AxisArrays.AxisArray{T, 3}), L, K<:NamedTuple, I<:NamedTuple})(::Any, ::Any, ::Any, ::Any)
   @ MCMCChains ~/scratch/twutz/julia_cluster_depots/packages/MCMCChains/OVsxE/src/MCMCChains.jl:64

The problem is the type AxisArrays.Axis{chain,UnitRange{Int64}} of the chain dimension in the signature of the single-chain objects. There needs to be some conversion method to AxisArrays.Axis{chain,Vector{Int64}}.
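To illustrate how this kind of `convert` error can arise in general, here is a minimal sketch in plain Julia. `ChainAxis` is a hypothetical stand-in for a parametric axis type and is not part of MCMCChains or AxisArrays; the sketch does not reproduce the package internals, only the type mismatch between a `UnitRange`-backed and a `Vector`-backed wrapper.

```julia
# Hypothetical stand-in for an axis type whose parameter is the index container type.
struct ChainAxis{R<:AbstractVector{Int}}
    idx::R
end

single = ChainAxis(1:1)      # backed by a UnitRange
multi = ChainAxis([1, 2])    # backed by a Vector

# A container typed for the Vector-backed variant only.
store = typeof(multi)[]

# Pushing the UnitRange-backed value requires a `convert` method between the
# two concrete wrapper types, which Julia does not define automatically.
err = try
    push!(store, single)
    nothing
catch e
    e
end

# Rebuilding the wrapper with a materialized index yields the matching type.
push!(store, ChainAxis(collect(single.idx)))
```

Under these assumptions, `err` holds a `MethodError` from `convert`, while the rebuilt value is stored without problems.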

Background: I sample several chains in separate jobs submitted to a cluster. Compared to a single multi-CPU sampling job, this avoids blocking processors in the common case where most chains have finished but a single chain is still sampling for hours. But then I need to combine the several single-chain MCMCChains objects.
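For reference, the intended combination can be sketched as follows. The two chains are random placeholder data with hypothetical parameter names `:a` and `:b`, not the samples from this issue, and the sketch assumes an MCMCChains version in which `chainscat` accepts single-chain objects.

```julia
using MCMCChains

# Two hypothetical single-chain results: 100 iterations, 2 parameters, 1 chain each.
chn1 = Chains(randn(100, 2, 1), [:a, :b])
chn2 = Chains(randn(100, 2, 1), [:a, :b])

# Concatenate along the chain dimension (the third dimension).
combined = chainscat(chn1, chn2)
n_chains = size(combined, 3)
```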

cpfiffer commented 1 year ago

Hmmmm you're right -- this should probably be fixed. Could you provide a MWE and/or the full error message (with line numbers)?

bgctw commented 1 year ago

After updating to Julia 1.9.2 and updating all packages, including Turing and MCMCChains, the error no longer occurs. A similar error now occurs, not in chainscat but when replacing a column of a DataFrameRow. It is fixed by first converting the DataFrameRow object to a DataFrame, so that the type of the column is allowed to change.
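A minimal sketch of that workaround, using a hypothetical two-column table with plain values instead of the Chains objects from this issue: assigning through a DataFrameRow writes into the parent column and must match its element type, whereas a DataFrame built from the row owns its columns and can have one replaced.

```julia
using DataFrames

# Hypothetical data: `val` has element type Int.
df = DataFrame(id = [1, 2], val = [10, 20])
row = df[1, :]    # DataFrameRow, a view into `df`

# In-place assignment must convert to the existing element type, so a String fails.
failed = try
    row.val = "ten"
    false
catch err
    err isa MethodError
end

# Converting the row to a one-row DataFrame gives it its own columns.
df1 = DataFrame(row)
# `!` replaces the whole column, so its element type may change.
df1[!, :val] = ["ten"]
```

The original `df` is left unchanged, and only `df1` carries the column with the new type.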

I'm posting the full error in case it helps to resolve similar problems:

julia> sample2 = load_sample_crossfiles(slc)
ERROR: MethodError: no method matching setindex!(::DataFrameRow{DataFrame, DataFrames.Index}, ::MCMCChains.Chains{Float64, AxisArrays.AxisArray{Float64, 3, Array{Float64, 3}, Tuple{AxisArrays.Axis{:iter, UnitRange{Int64}}, AxisArrays.Axis{:var, Vector{Symbol}}, AxisArrays.Axis{:chain, UnitRange{Int64}}}}, Missing, NamedTuple{(:parameters, :internals, :u0), Tuple{Vector{Symbol}, Vector{Symbol}, Vector{Symbol}}}, NamedTuple{(:start_time, :stop_time), Tuple{Vector{Float64}, Vector{Float64}}}}, ::typeof(!), ::Symbol)

Closest candidates are:
  setindex!(::DataFrameRow, ::Any, ::Any)
   @ DataFrames ~/scratch/twutz/julia_cluster_depots/packages/DataFrames/BYWoC/src/dataframerow/dataframerow.jl:262

Stacktrace:
 [1] load_sample_crossfiles(::NamedTuple{(:site, :targetlim, :scenario), Tuple{Symbol, Symbol, Tuple{Symbol}}})
   @ SesamFitSPP /Net/Groups/BGI/people/twutz/projects_nosync/SesamFitSPP/src/persist_samples.jl:164
 [2] top-level scope
   @ REPL[21]:1
 [3] top-level scope
   @ ~/scratch/twutz/julia_cluster_depots/packages/Infiltrator/LtFao/src/Infiltrator.jl:726

The error comes from this function:

function load_sample_crossfiles(slc; kwargs...)
    path = get_sample_filename(slc; i_chain = "[0-9]+", kwargs...)
    dir, fname = splitdir(path)
    paths = joinpath.(dir, filter(x -> occursin(Regex(fname),x),readdir(dir)))
    # need to convert to DataFrame in order to replace a column
    # instead of only values inside the column
    dfs = map(fname -> load_sample(slc;fname), paths);
    df1 = first(dfs)
    df1[!, :chn] = chainscat(map(df -> df.chn, dfs)...)
    # df1 = copy(DataFrame(first(dfs)))
    # df1[!, :chn] = [chainscat(map(df -> df.chn, dfs)...)]
    # df1[!, :lp95] = [max(map(df -> df.lp95, dfs)...)]
    # df1[1,:]
end