hmsc-r / HMSC

GNU General Public License v3.0
102 stars 37 forks source link

All 0's in Alpha posterior? #146

Open abigailcohen opened 2 years ago

abigailcohen commented 2 years ago

Hello,

I am attempting to predict distribution of a single species over a grid using the real-life case study in Ch.5 of the JDSM textbook as a template. This is currently more of a proof-of-concept, so I use one species at a time and use a linear model where habitat is the only predictive variable. However, when I use sampleMCMC to draw samples from the distribution, every single sample of the Alpha parameter is 0. I have tried the same model setup with a very common beetle and a less common bee species, so I don't think it's that the data itself has no spatial effect- and getting all 0's seems odd. I have tried centring the UTM coordinates (subtracted mean and divided by 1000) to reduce the integer size, and using more chains, but neither had any effect.

Here is the head of the bee data to give a sense of what it looks like.

site X Y X.c Y.c hab B_nevadensis
10092 -1229321 1337016 -13.509709 -2.408190 Developed 1
10212 -1238714 1348803 -22.902527 9.379033 Developed 3
10348 -1258299 1350211 -42.487295 10.786970 Developed 0
10495 -1198551 1326071 17.260420 -13.352796 Developed 5
10558 -1219215 1330417 -3.403721 -9.007228 Crop 9
10781 -1203414 1319531 12.396961 -19.892887 Crop 9

The model is constructed like so: XData = data.frame(neva.example$hab) #predictor variable Y = as.matrix(neva.c$B_nevadensis) #response variable colnames(Y) = "B_nevadensis" xy.c = as.matrix(cbind(neva.c$X.c, neva.c$Y.c)) #coordinates site = as.factor(neva.c$BLID.x) studyDesign = data.frame(site = site) rownames(xy.c) = (studyDesign[,1]) #assign each site the corresponding coordinates rL.c = HmscRandomLevel(sData = xy.c) # create random spatial effect XFormula = ~ hab # linear regression formula for fixed effects

mFULL = Hmsc(Y=Y, XData = XData, XFormula = ~ hab, distr = "lognormal poisson", studyDesign = studyDesign, ranLevels = list("site"=rL.c)) #full model with fixed and random effects

Hmsc object with 156 sampling units, 1 species, 8 covariates, 1 traits and 1 random levels Posterior MCMC sampling with 4 chains each with 250 samples, thin 100 and transient 12500

I have also attached the summary plot of Alpha. image

ovaskain commented 2 years ago

Hi,

It is quite common to have the full posterior concentrated to alpha=0 (note that half of the default prior mass is set to alpha=0). So the first guess is that the residual does not show spatial signal, i.e. “hab” explains all of that. You may try a “space only model” with XFormula = ~1 to see if the model picks up a spatial signal from the raw data (rather than from the residual).

O2

From: abigailcohen @.> Sent: tiistai 19. heinäkuuta 2022 20:14 To: hmsc-r/HMSC @.> Cc: Subscribed @.***> Subject: [hmsc-r/HMSC] All 0's in Alpha posterior? (Issue #146)

Hello,

I am attempting to predict distribution of a single species over a grid using the real-life case study in Ch.5 of the JDSM textbook as a template. This is currently more of a proof-of-concept, so I use one species at a time and use a linear model where habitat is the only predictive variable. However, when I use sampleMCMC to draw samples from the distribution, every single sample of the Alpha parameter is 0. I have tried the same model setup with a very common beetle and a less common bee species, so I don't think it's that the data itself has no spatial effect- and getting all 0's seems odd. I have tried centring the UTM coordinates (subtracted mean and divided by 1000) to reduce the integer size, and using more chains, but neither had any effect.

Here is the head of the bee data to give a sense of what it looks like. site X Y X.c Y.c hab B_nevadensis 10092 -1229321 1337016 -13.509709 -2.408190 Developed 1 10212 -1238714 1348803 -22.902527 9.379033 Developed 3 10348 -1258299 1350211 -42.487295 10.786970 Developed 0 10495 -1198551 1326071 17.260420 -13.352796 Developed 5 10558 -1219215 1330417 -3.403721 -9.007228 Crop 9 10781 -1203414 1319531 12.396961 -19.892887 Crop 9

The model is constructed like so: XData = data.frame(neva.example$hab) #predictor variable Y = as.matrix(neva.c$B_nevadensis) #response variable colnames(Y) = "B_nevadensis" xy.c = as.matrix(cbind(neva.c$X.c, neva.c$Y.c)) #coordinates site = as.factor(neva.c$BLID.x) studyDesign = data.frame(site = site) rownames(xy.c) = (studyDesign[,1]) #assign each site the corresponding coordinates rL.c = HmscRandomLevel(sData = xy.c) # create random spatial effect XFormula = ~ hab # linear regression formula for fixed effects

mFULL = Hmsc(Y=Y, XData = XData, XFormula = ~ hab, distr = "lognormal poisson", studyDesign = studyDesign, ranLevels = list("site"=rL.c)) #full model with fixed and random effects

Hmsc object with 156 sampling units, 1 species, 8 covariates, 1 traits and 1 random levels Posterior MCMC sampling with 4 chains each with 250 samples, thin 100 and transient 12500

I have also attached the summary plot of Alpha. [image]https://user-images.githubusercontent.com/109101526/179803381-11e09557-6acf-476d-b3a4-a03a10d68a4e.png

— Reply to this email directly, view it on GitHubhttps://github.com/hmsc-r/HMSC/issues/146, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEIYMZV6UMZMKEUOAN5YK2TVU3O5TANCNFSM54AW4WBA. You are receiving this because you are subscribed to this thread.Message ID: @.**@.>>

abigailcohen commented 2 years ago

Hi, It is quite common to have the full posterior concentrated to alpha=0 (note that half of the default prior mass is set to alpha=0). So the first guess is that the residual does not show spatial signal, i.e. “hab” explains all of that. You may try a “space only model” with XFormula = ~1 to see if the model picks up a spatial signal from the raw data (rather than from the residual).

When I first ran this model with the more common beetle ssp, I ran a space-only and environment only model as well. The space-only model has the same all-0 Alpha posteriors, as seen below:

mSPACE = Hmsc(Y=Y, XData=XData, XFormula = ~1, distr = "lognormal poisson", studyDesign = studyDesign, ranLevels = list(route=rL))

Hmsc object with 115 sampling units, 1 species, 1 covariates, 1 traits and 1 random levels Posterior MCMC sampling with 2 chains each with 1000 samples, thin 5 and transient 2500

image

ovaskain commented 2 years ago

Hi again,

I would think that either to data do not have spatial signal, or then there is some issue with the lognormal poisson model (which is harder to fit and more prone to issues such as outliers than e.g. probit model). If a good proportion of the samples are zeros, you could also try space-only model for 0/1 data with probit link. If there is spatial signal (which you will see basically by plotting the 0/1 data on a map), the model should pick it up.

Otso

From: abigailcohen @.> Sent: tiistai 19. heinäkuuta 2022 22:56 To: hmsc-r/HMSC @.> Cc: Ovaskainen, Otso T @.>; Comment @.> Subject: Re: [hmsc-r/HMSC] All 0's in Alpha posterior? (Issue #146)

Hi, It is quite common to have the full posterior concentrated to alpha=0 (note that half of the default prior mass is set to alpha=0). So the first guess is that the residual does not show spatial signal, i.e. “hab” explains all of that. You may try a “space only model” with XFormula = ~1 to see if the model picks up a spatial signal from the raw data (rather than from the residual).

When I first ran this model with the more common beetle ssp, I ran a space-only and environment only model as well. The space-only model has the same all-0 Alpha posteriors, as seen below:

mSPACE = Hmsc(Y=Y, XData=XData, XFormula = ~1, distr = "lognormal poisson", studyDesign = studyDesign, ranLevels = list(route=rL))

Hmsc object with 115 sampling units, 1 species, 1 covariates, 1 traits and 1 random levels Posterior MCMC sampling with 2 chains each with 1000 samples, thin 5 and transient 2500

[image]https://user-images.githubusercontent.com/109101526/179836558-4529debd-0c2e-4250-9e98-df3dc613ae57.png

— Reply to this email directly, view it on GitHubhttps://github.com/hmsc-r/HMSC/issues/146#issuecomment-1189493036, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEIYMZXB5EWLS7CWNAY5Z6TVU4B37ANCNFSM54AW4WBA. You are receiving this because you commented.Message ID: @.**@.>>

abigailcohen commented 2 years ago

Hi , thank you for the suggestion.

I tried a full/env/space set of models with a probit model, but got the exact same all-0 Alpha result for both again. However, since I have occurrence in 100/156 sites, and it seems like those 0's occur in each cluster of sites, I don't think it is the case that there is no spatial signal.

As you can see from the plot, while occurrence is mostly 0 in the N/NW area that is mostly Woods habitat, there are also 0's in the central Crop/Developed area, so it seems like habitat should not be entirely correlated with spatial effect. image

ovaskain commented 2 years ago

Hi,

As you said, the absences are mostly in the N/NW part of the study area, and hence there seems to be clear spatial signal in the data. Thus it sounds a bit surprising if the XFormula = ~1 probit model for 0/1 data returns alpha=0, as that model should be the one that most easily reports the spatial signal (there are no confounding factors there).

O2

From: abigailcohen @.> Sent: torstai 21. heinäkuuta 2022 20:35 To: hmsc-r/HMSC @.> Cc: Ovaskainen, Otso T @.>; Comment @.> Subject: Re: [hmsc-r/HMSC] All 0's in Alpha posterior? (Issue #146)

Hi , thank you for the suggestion.

I tried a full/env/space set of models with a probit model, but got the exact same all-0 Alpha result for both again. However, since I have occurrence in 100/156 sites, and it seems like those 0's occur in each cluster of sites, I don't think it is the case that there is no spatial signal.

As you can see from the plot, while occurrence is mostly 0 in the N/NW area that is mostly Woods habitat, there are also 0's in the central Crop/Developed area, so it seems like habitat should not be entirely correlated with spatial effect. [image]https://user-images.githubusercontent.com/109101526/180276591-67dde604-090e-4616-a09b-d7057e8799c2.png

— Reply to this email directly, view it on GitHubhttps://github.com/hmsc-r/HMSC/issues/146#issuecomment-1191757892, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AEIYMZXIZFT3DXRFDYKW5STVVGC43ANCNFSM54AW4WBA. You are receiving this because you commented.Message ID: @.**@.>>

abigailcohen commented 2 years ago

Hi, If you want to use the bee data to try isolate what may be creating this non-reporting of a spatial signal, I have attached the dataset. Considering that I had the same issue with a different set of species data (from the same multi-species database), I don't think it's a data issue but just cannot figure out what in the code could be causing it.

neva.c.csv