famuvie / breedR

Statistical methods for forest genetic resources analysts
http://famuvie.github.io/breedR/
GNU General Public License v3.0
31 stars 24 forks source link

Spatial analysis with breedR #107

Closed Shuyi1991 closed 3 years ago

Shuyi1991 commented 3 years ago

Hi, I'm recently running codes for spatial analysis with breedR package. Data and codes are both from workshops of breedR. Data: DW.predicted.practice.txt Codes: mixmod_breedR_AR1_bloc_grid0 <- remlf90(fixed = DW.predicted.model1.trsf ~ 1 + as.factor(Bloc), random = ~ Ident, spatial = list(model = "AR", coordinates = data_ok[, c("X_ok", "Y_ok")]), data = data_ok, method = "ai") However, it takes quite a long time. I'm running the codes for 5 hours but it still haven't finish. I'm wondering if it need more time to get the results?

Zoe

famuvie commented 3 years ago

Hi Zoe,

difficult to tell with such scarce information. Which data and which code are you using?

Spatial model "AR" without explicit auto-correlation values rho will fit by default 16 models with different combination of row and column spatial auto-correlation rho. So, it is expected to take longer. 5 hours might be too much or not, depending on the size of the data and the spatial configuration.

You can try setting rho to some fixed values to see whether it finishes in a more reasonable time. E.g.

spatial = list(   model = "AR",   coordinates = data_ok[, c("X_ok", "Y_ok")],   rho = c(.8, .8) )

Also, method "ai" is typically faster, but depending on the initial values it can diverge and take a long time. Try using method = "em" and see whether it runs faster.

Hope it helps

ƒacu.-

On 18/06/2021 13:35, Shuyi1991 wrote:

Hi, I'm recently running codes for spatial analysis with breedR package. Data and codes are both from workshops of breedR. Data: DW.predicted.practice.txt Codes: mixmod_breedR_AR1_bloc_grid0 <- remlf90(fixed = DW.predicted.model1.trsf ~ 1 + as.factor(Bloc), random = ~ Ident, spatial = list(model = "AR", coordinates = data_ok[, c("X_ok", "Y_ok")]), data = data_ok, method = "ai") However, it takes quite a long time. I'm running the codes for 5 hours but it still haven't finish. I'm wondering if it need more time to get the results?

Zoe

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/famuvie/breedR/issues/107, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABLOEDVHWWEEQRXF5CEJGODTTMVRZANCNFSM465QEUWA.

Shuyi1991 commented 3 years ago

@famuvie Thanks so much for your help. After about 6 hours running, I got the results. Could you please help with further questions?

(1) how can we set the best "rho" value? i.e. when we are doing spatial analysis, the important step is to find the best rho value for model?

(2) I added codes from one of your workshops for breedR several years ago (I found these from the website of tree4future). However, when I ran some of them, R give error message, e.g. "BLUP_genot" & "spatial". Do you keep any information/tutors for spatial analysis using breedR? Or is there any other website I could find detailed information about spatial analysis?

(3) I read some materials from your workshop. After spatial analysis, you usually estimated genetic parameters. I work on tree breeding and want to find best performance progenies. I'm wondering if you used adjusted growth traits (e.g. height & diameter) at individual tree level by spatial effect to find best performance ones?

codes from workshop:

mixed model with a spatial effect with the best autoregressive parameters for rows and columns: autoregressive with block effect

selmod <- remlf90(fixed = DW.predicted.model1.trsf ~ 1 + as.factor(Bloc), random = ~ Ident, spatial = list(model = "AR", coordinates = data_ok[, c("X_ok", "Y_ok")], rho = c(rho_r= 0.846, rho_c = 0.976)), data = data_ok, method = "ai")

save(selmod, file = "selmod_DW.predicted_model1_trsf.Rdata") # load("selmod_DW.predicted_model1_trsf.Rdata") summary(selmod)

Part 5: Focus on selected spatial model "selmod"

extraction of predicted values for fixed and random effects

BLUP_genot <- data.frame("Ident" = rownames(selmod$ranef$Ident), GBLUP = selmod$ranef$Ident$value)

Output <- data.frame(na.omit(data_ok, "Coord" = paste(data_ok$X_ok, data_ok$Y_ok, sep = "-")))

Output$Coord <- paste(Output$X_ok, Output$Y_ok, sep = "-")

Block_effect <- as.vector(sapply(selmod$fixed, function(x){x$value})) Output$Block_effect <- Block_effect[1]

for (i in 2:length(Block_effect)){ Output$Block_effect[Output$Bloc == i] <- Block_effect[i] }

Output <- merge(Output, BLUP_genot, by = "Ident")

spatial <- data.frame(selmod$effects$spatial$effects$ar$coordinates, Spatial = selmod$effects$spatial$effects$ar$incidence.matrix %*% ranef(selmod)$spatial)

Output <- merge(Output, data.frame("Coord" = paste(spatial$X_ok, spatial$Y_ok, sep = "-"), "Spatial" = spatial$Spatial), by = "Coord")

Output$Fitted <- Output$Block_effect + Output$GBLUP + Output$Spatial

resid$Selmod <- na.omit(residuals(selmod))

Output <- merge(Output, data.frame("Coord" = paste(resid$X_ok, resid$Y_ok, sep = "-"), "Resid" = resid$Selmod), by = "Coord")

Output$Phen_Adj <- mean(Output$DW.predicted.model1.trsf) + Output$GBLUP + Output$Resid

famuvie commented 3 years ago

Dear Zoe,

  1. The "best" values for rho is something with fuzzy meaning. Actually, it depends on your goal. What breedR does is to fit a few combinations and pick the best in terms of likelihood, or equivalently, the AIC (i.e. predictive performance of observed traits). See the tutorial [1].

If your interest is mainly on the genetic parameters (e.g. breeding values), then the precise values of rho typically don't have a lot of impact.

  1. The available tutorials are all linked from the website [2]. If you have a specific problem, please open a new issue [3] with a reproducible minimal example.

  2. If you use a spatial effect in your model, the estimates of the genetic parameters are already "adjusted" by spatial effect. You don't need to adjust your observations. Otherwise you are effectively "removing" the environmental effect beforehand. It is better to use a proper model with the measured traits.

Please, in the future use the discussion group for general questions like this. The GitHub issues is more appropriate for tracking specific problems that need to be tackled. Thank you.

ƒacu.-

[1] https://github.com/famuvie/breedR/wiki/Overview#exercise--tuning-spatial-parameters

[2] https://famuvie.github.io/breedR/

[3] https://github.com/famuvie/breedR/issues

[4] http://groups.google.com/group/breedr