hmsc-r / HMSC

GNU General Public License v3.0
103 stars 37 forks source link

Interpretation of `predictEtaMean` / `predictEtaMeanField` arguments of the predict function #185

Open elgabbas opened 6 months ago

elgabbas commented 6 months ago

Hello,

I would like to make spatial predictions for an Hmsc model with a spatial random effect. I have two questions here:


  1. The predict function is very slow. In a toy model with only c.a. 5000 sampling sites and 22 species, the predict function took 25 minutes to run. The final model will be more complex: c.a. 50K sampling sites and many more species.
    # add some noise to coordinates to be able to use the predict function
    require(dplyr)
    DT_xy2 <- (DT_xy + runif(nrow(DT_xy) * 2, 0.05, max = 0.1)) %>% 
    matrix(ncol = 2)
    Gradient = prepareGradient(m, XDataNew = XData.grid, sDataNew = list(sample =DT_xy2 ))
    predY = predict(m, Gradient = Gradient)

    I tried using nParallel = nChains argument but this did not improve much the processing speed. I tried to increase the speed by making small changes to the predict.Hmsc and predictLatentFactor functions to process both functions in parallel. There is some speed improvement, but if predictEtaMean = TRUE or predictEtaMeanField = TRUE, no improvement was found.

Is it possible to increase the speed of model predictions?


  1. .
# default: predictEtaMean & predictEtaMeanField are FALSE
predict(m, Gradient, expected = FALSE, predictEtaMean = FALSE, predictEtaMeanField = FALSE)
# predictEtaMean = TRUE
predict(m, Gradient, expected = FALSE, predictEtaMean = TRUE, predictEtaMeanField = FALSE)
# predictEtaMeanField = TRUE
predict(m, Gradient, expected = FALSE, predictEtaMean = FALSE, predictEtaMeanField = TRUE)

Could you please explain, in less technical terms, what is the difference between the values of the previous three predictions and which should be used? Is there an ecological difference in their interpretation? Using predictEtaMean = TRUE or predictEtaMeanField = TRUE makes the prediction take much time.

Thanks, Ahmed

claraqin commented 1 week ago

Hi Ahmed,

I'm also interested in this question because of the amount of time it takes to make spatial predictions. I was under the impression that predictEtaMean = TRUE would speed up prediction as well, so I was also surprised when it slowed it down instead. Have you learned anything since posting this?

Thanks, Clara

elgabbas commented 1 week ago

Hello Clara, I am interested in fitting large-scale models using GPP but have also fitted smaller-scale models for exploration. In relatively small-scale models, using predictEtaMean = TRUE increased the prediction speed in my case by 3 times. However, this did not help much when using large-scale models when I predict to new sites [see here]. When predictEtaMean = TRUE, the predictlatentfactor function calculates the distances between all sampling units and between sampling units and new sites and do further analyses on them. These two matrices can be very large in large-scale studies and this is probably the reason for the problem in my case. Using predictEtaMean = FALSE did not help in my case.

Ahmed