carstenbauer opened this issue 5 years ago
Yes, currently only Float64 values are accepted. It should be straightforward to extend this to integers, complex numbers, and arrays for the binning part. I'm not sure how to handle the variance etc. for complex numbers and arrays, though.
The variance for an array is defined element-wise.
For complex numbers you can go with Julia and define `var(z) = var(real(z)) + var(imag(z))`. (That's also what I do in MonteCarloObservable.jl.)
```julia
julia> using Statistics

julia> z = rand(ComplexF64, 100);

julia> var(z)
0.18059172464677126

julia> var(real(z)) + var(imag(z))
0.18059172464677137
```
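For the array case, here is a minimal sketch of what the element-wise definition means, using only Statistics on some illustrative data (not the package API):

```julia
using Statistics

# 100 illustrative samples of 3-component vectors
xs = [rand(3) for _ in 1:100]

# element-wise variance: the variance of each component across the samples
[var(getindex.(xs, i)) for i in 1:3]

# same result with the samples as columns of a 3×100 matrix
var(reduce(hcat, xs); dims = 2)
```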
Both should be working now. Though for arrays, you currently need to supply a "zero". For example, if you want to do a binning analysis with 3-component vectors, you need to supply a `[0.0, 0.0, 0.0]` to the binning analysis.
Restricting the binning analysis to static arrays would remove the need for this, but I'm not sure if this is a good idea for usability.
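For illustration, usage with 3-component vectors might then look roughly like this; the exact constructor call (passing the zero element directly) and the `push!`/`std_error` signatures are assumptions here, not taken from the package:

```julia
# hypothetical sketch, assuming BinnerA accepts the zero element directly
B = BinnerA([0.0, 0.0, 0.0])   # the supplied "zero" for 3-component vectors

for _ in 1:1000
    push!(B, rand(3))          # push vector-valued measurements
end

std_error(B)                   # element-wise standard errors
```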
Great! There are a couple of things probably left to do:

- Pushing `Int64`s to a `Float64` binner.
- `std_error(B)` for a `BinnerA` with multidimensional data doesn't seem to be defined (probably just a missing method).
- The `std_error(B)` estimation, i.e. picking the standard error of the highest level, is problematic as statistical fluctuations will dominate in the large bin size limit (due to fewer and fewer bins). There are a couple of heuristics one could use, for example only considering bin sizes for which we have at least, say, 50 bins.

Are you at the university today? Maybe we can chat for 10 minutes. I'll arrive in about 20 minutes.
If data pushed to the binning analysis varies in type, does that not indicate some issue with surrounding code? (type instability or logical error)
It was already defined, but I forgot to add a default value for the binning level. Should be working now.
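For reference, a hypothetical sketch of what such a default might look like; the helper name and the choice of default level are assumptions, not the actual patch:

```julia
# forward the call without a level to some default binning level
std_error(B::BinnerA) = std_error(B, default_level(B))

# one possible default: the highest level that still contains bins
default_level(B::BinnerA{N}) where {N} = findlast(c -> c > 0, B.count) - 1
```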
That's true, and I initially limited the final level of the binning tree to have at least M values. But I think it's better to have the binning analysis generate every level. That way the user can check/decide how many levels should be ignored. Reducing the data accordingly is easy - you just ignore the last few binning levels. However, if you impose a limit beforehand you have to do extra work if you ever wanted to check the fluctuations on the final levels.
A gentler approach would be to limit the output of the `all_x` methods to include at least M bins for their final level, i.e. change
```julia
function all_vars(B::BinnerA{N}) where {N}
    [var(B, lvl) for lvl in 0:N-1 if B.count[lvl+1] > 0]
end
```

to

```julia
function all_vars(B::BinnerA{N}, min_bins = 50) where {N}
    [var(B, lvl) for lvl in 0:N-1 if B.count[lvl+1] >= min_bins]
end
```
> If data pushed to the binning analysis varies in type, does that not indicate some issue with surrounding code? (type instability or logical error)
It might. But why not make it possible? It doesn't hurt, does it? It's at least convenient for interactive usage.
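As a rough sketch of what "making it possible" could mean (purely illustrative; it assumes a `push!` method for `Float64` values already exists):

```julia
# convert integers on the fly so interactive `push!(B, 1)` behaves
# like `push!(B, 1.0)` for a Float64-based binner
Base.push!(B::BinnerA, x::Integer) = push!(B, Float64(x))
```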
> But I think it's better to have the binning analysis generate every level.
I think it's good that all levels are processed, and I'm also fine with the `all_*` methods showing all the information, but `std_error` should give a reasonable error estimate, and the last level isn't reliable.
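As a sketch of what a more reasonable default could look like, building on the `min_bins` idea above (the helper name `reliable_std_error` is made up; `var(B, lvl)` and `B.count` are taken from the snippet earlier in the thread):

```julia
# take the error estimate from the highest binning level that still has
# at least `min_bins` bins, instead of the (noisy) very last level
function reliable_std_error(B::BinnerA{N}; min_bins = 50) where {N}
    levels = [lvl for lvl in 0:N-1 if B.count[lvl + 1] >= min_bins]
    lvl = isempty(levels) ? 0 : last(levels)
    # standard error of the mean at that level; the broadcasts also
    # cover the element-wise array case
    sqrt.(var(B, lvl) ./ B.count[lvl + 1])
end
```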
AFAICS, currently only `Float64` numbers can be handled.
It would be good to have support for other number types (like `ComplexF64`) and also higher-dimensional arrays (numbers being the 0-dimensional case).