Closed amytims closed 8 months ago
Thanks for sharing this!
I see the problem. Is this what you're suggesting then?
prep_icar_data <- function (C, inv_sqrt_scale_factor = NULL) {
n <- nrow(C)
E <- edges(C)
G <- list(np = nrow(C), from = E$node1, to = E$node2, nedges = nrow(E))
class(G) <- "Graph"
nb2 <- spdep::n.comp.nb(spdep::graph2nb(G))
k = nb2$nc
if (inherits(inv_sqrt_scale_factor, "NULL")) inv_sqrt_scale_factor <- array(rep(1, k), dim = k)
group_idx = NULL
for (j in 1:k) group_idx <- c(group_idx, which(nb2$comp.id == j))
group_size <- NULL
for (j in 1:k) group_size <- c(group_size, sum(nb2$comp.id == j))
# intercept per connected component of size > 1, if multiple.
m <- sum(group_size > 1) - 1
if (m) {
############################## changes here
GS <- group_size
ID <- nb2$comp.id
# identify the first non-singleton part of the graph
main.group <- ID[which(rowSums(as.matrix(C))!=0)[1]]
# lump singletons together with main.group
change.group <- which(GS == 1)
ID[which(ID %in% change.group)] <- main.group
# create dummy variables for additional parts of the graph
A = model.matrix(~ factor(ID))
# drop the constant term
A <- as.matrix(A[,-1])
##############################
} else {
A <- model.matrix(~ 0, data.frame(C))
}
l <- list(k = k,
group_size = array(group_size, dim = k),
n_edges = nrow(E),
node1 = E$node1,
node2 = E$node2,
group_idx = array(group_idx, dim = n),
m = m,
A = A,
inv_sqrt_scale_factor = inv_sqrt_scale_factor,
comp_id = nb2$comp.id)
return(l)
}
I'm going to update the same function in the geostan R package too. Are you interested in submitting a pull request there (so that your contribution is recorded)? https://github.com/ConnorDonegan/geostan/blob/main/R/convenience-functions.R
Hi Connor,
Yeah, I think that'll work :) I'll submit the pull request now.
Thanks so much for all your work on this. I'm not great with stan syntax so the R helper functions have been a lifesaver for me!
Cheers, Amy
I'm glad its useful for you! I updated the function in icar-functions.R
here, and your pull request will be released with geostan 0.6.0, maybe within a few weeks. Thanks again for your contribution!
I've been trying to run some spatial models with multiple singletons in the matrix, and found that the
stan_icar()
function was giving me errors. I think I've traced the problem to theprep_icar_data()
function, which sometimes returns matrices with unexpected dimensions, throwing amismatch in dimension declared and found
error whenstan_icar()
is run.The source of the problem is in the
prep_icar_data()
function at the lineID[which(ID == change.to.one)] <- 1
. If you have more than one singleton in your connectivity matrix (i.e., ifchange.to.one
has length > 1), it doesn't change the values inID
properly, so the next couple of lines where the matrix is supposed to have singletons removed don't function as intended:A = model.matrix(~factor(ID))
A <- as.matrix(A[, -1])
When this happens inside the
icar_stan()
function, it throws the following error and warning message:Error : Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=A; position=1; ...
In addition: Warning message:
In ID == change.to.one :
longer object length is not a multiple of shorter object length
I think changing to
ID[which(ID %in% change.to.one)] <- 1
would fix this so all singletons get changed to 1.Also, I noticed that these lines in
prep_icar_data()
behave differently depending on whether your first group is a singleton or not.ID[which(ID == change.to.one)] <- 1
A = model.matrix(~factor(ID))
A <- as.matrix(A[, -1])
If it isn't, the code removes the first set of connected points as well as all the singletons that get lumped into group 1, giving
A
a size ofn x m
, wheren = nrow(C)
andm = sum(group_size > 1) - 1
, as defined within the function. However, if the first group is just a singleton, only the singletons are removed and all larger groups remain, givingA
a size ofn x m+1
, and triggering the dimension mismatch error inicar_stan()
:Error : Exception: mismatch in dimension declared and found in context; processing stage=data initialization; variable name=A; position=1; dims declared=(n,m); dims found=(n,m+1) (in 'string', line 480, column 2 to column 17)
failed to create the sampler; sampling not done
You can find the group number of the first non-singleton group with
group <- ID[which(rowSums(as.matrix(C))!=0)[1]]
and change the singletons to match that group number instead of defaulting to 1 with
ID[which(ID %in% change.to.one)] <- group
Then when you run
A = model.matrix(~factor(ID))
A <- as.matrix(A[, -1])
it'll remove the first column with all singletons and the members of the first cluster, which will give A the right dimensions and avoid triggering the dimension mismatch error