JuliaStats / Klara.jl

MCMC inference in Julia
Other
167 stars 38 forks source link

DataArray not defined in SerialMC.jl #28

Closed sdwfrost closed 10 years ago

sdwfrost commented 10 years ago

DataArray isn't defined in SerialMC.jl ; a fix is to explicitly import it

import DataArrays.DataArray

and/or to add DataArrays to REQUIRE

papamarkou commented 10 years ago

Thanks for opening the issue @sdwfrost, I will look into it within the next few days and will get back to you.

eford commented 10 years ago

Also, DataArrays is not automatically installed by running Pkg.add("MCMC")

johnmyleswhite commented 10 years ago

I just added MCMC to the REQUIRE file. I don't understand the codebase well enough to make other fixes.

papamarkou commented 10 years ago

John, I have been debating with myself about this; what is better practice, should I be using data arrays or more generic float64 arrays in various places as function input arguments in the package?

johnmyleswhite commented 10 years ago

Do you want to support missing data in inputs? If so, I'd use DataArrays. You can always convert them to Array{Float64} behind the scenes using array(data, NaN) to substitute NaN for NA.

papamarkou commented 10 years ago

I always consider NA's evil when working with MCMC... I hope not because I am prejudiced, but mostly because of the statistical dependence between successive points in a Monte Carlo chain, meaning that an NA occurrence ruins all the chain. From that point of view it may be better to stick with float64 arrays. If someone in the Monte Carlo or statistical community objects my thinking, please speak up.

johnmyleswhite commented 10 years ago

I'm confused: in most MCMC systems like Jags that I've used in the past, the system will impute the values of missing entries based on their local density function. Then the system will continually resample the imputations based on all of the other nodes in the graphical model.

papamarkou commented 10 years ago

Yes, imputation is a possibility, yet it is sth that we haven't implemented in our MCMC package. Your suggestion makes sense to me. Would you agree that it would be more reasonable not to allow for NA's in MCMC chains until we support imputation of NA values? I have never tried to implement the idea in MCMC context (only in statistical genetics but this is a completely different area), so I don't have the knowledge to add this functionality to our package at the moment without reading first.

johnmyleswhite commented 10 years ago

Yes, I agree that, if you're not imputing missing data, then it's reasonable to not allow it as input.

eford commented 10 years ago

Hi folks,

FWIW- If users provide their own likelihood function, then they may be able to deal with NA's on their own, even if the MCMC package's model doesn't know how yet. Further, it seems like something that'll be added in the future. So my suggestion would be to allow NA's for input data, but that MCMC outputs should not have NA's.

Cheers, Eric

On Sun, Jan 12, 2014 at 7:04 PM, John Myles White notifications@github.comwrote:

Yes, I agree that, if you're not imputing missing data, then it's reasonable to not allow it as input.

— Reply to this email directly or view it on GitHubhttps://github.com/JuliaStats/MCMC.jl/issues/28#issuecomment-32128691 .

Eric Ford Professor of Astronomy & Astrophysics Center for Exoplanets & Habitable Worlds Center for Astrostatistics Institute for CyberScience Pennsylvania State University

johnmyleswhite commented 10 years ago

The complication with letting users handle NA's is that the package probably wants to insist that all computations are done using Array{Float64} rather than DataArray{Float64}. For example, the Optim package requires that the user-specified objective function be a function of the form f: Vector{Float64} -> Float64. So NA's need to be handled with some caution when allowed as inputs.

papamarkou commented 10 years ago

Since it has been agreed not to allow NA's in MCMC (at least not at this stage), I will close this old ticket for housekeeping purposes.