Adjusted predictions different between log(Y) ~NO and Y~LOGNO

pntoiv commented 1 month ago

Hey!

I am fitting a log-normal model and plotting its adjusted predictions with ggeffects. When I fit the model with the log-normal distribution from the gamlss package, the predictions included negative values, which are not plausible for my response variable. I then log-transformed my response variable Y manually and fit the model with a normal distribution. The two models returned by gamlss are identical, as expected. But the figures produced by ggeffects now differ drastically in scale: the predictions no longer go negative but start at 1.

I just need confirmation that this happens because ggeffects does not recognise that the model itself has applied a log transformation. If so, this is understandable.

strengejacke commented 1 month ago

Do you have a reproducible example?

pntoiv commented 1 month ago

> Do you have a reproducible example?

No, but I think you can reproduce this with any data by fitting a log-normal model in two ways with gamlss. First fit it with a log-transformed response variable and a normal distribution, then fit it without the transformation but with a log-normal distribution.

I believe this is due to ggeffects not recognising that the response variable has been log-transformed by the model. When I did the transformation myself, ggeffects told me it was back-transforming the predictions, but it did not say so when I let the model do the log transformation via the log-normal distribution. And the model stores the predicted values on the logarithmic scale.
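The point that the fitted values live on the log scale can be checked on the raw mu predictions, without going through ggeffects. The following is a minimal sketch, assuming the gamlss package is installed; the bundled `abdom` data set is used purely for illustration and is not the original poster's data.

```r
library(gamlss)
data(abdom)

# Same log-normal model fitted in two ways (iteration trace silenced)
m1 <- gamlss(y ~ x, family = LOGNO, data = abdom,
             control = gamlss.control(trace = FALSE))
m2 <- gamlss(log(y) ~ x, family = NO, data = abdom,
             control = gamlss.control(trace = FALSE))

# mu uses an identity link in both families, so these predictions
# are on the log scale of y in both cases
mu1 <- as.numeric(predict(m1, what = "mu", type = "response"))
mu2 <- as.numeric(predict(m2, what = "mu", type = "response"))

# The two fits are expected to agree up to numerical tolerance
all.equal(mu1, mu2, tolerance = 1e-4)

# Exponentiating moves the predictions to the original scale of y
head(exp(mu1))
```

If the two sets of mu values agree, any difference in the plots comes from how the predictions are back-transformed afterwards, not from the fitted models.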

strengejacke commented 1 month ago

Usually the correct back transformation is done via the link inverse function. That's why a reproducible example would be helpful to see your model specification, because I don't work with that package.
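As an illustration of back-transformation via the link-inverse function, here is a minimal base-R sketch. It uses a simulated Gamma GLM with a log link, which is an assumption made for the example and not the model from this issue.

```r
set.seed(1)

# Simulated positive response with a log-linear mean (illustrative data)
x <- runif(200, 10, 40)
y <- rgamma(200, shape = 20, rate = 20 / exp(1 + 0.05 * x))

m <- glm(y ~ x, family = Gamma(link = "log"))

# Predictions on the link (here: log) scale
eta <- predict(m, type = "link")

# The family object carries the inverse link; applying it
# back-transforms to the response scale
linkinv <- family(m)$linkinv
back <- linkinv(eta)

# Same result as asking predict() for the response scale directly
all.equal(back, predict(m, type = "response"))

# With an identity link the inverse is a no-op, so values pass through unchanged
gaussian()$linkinv(c(-1, 0, 2))
```

The last line shows why the link inverse alone cannot help when a family models its location parameter with an identity link: any log transformation that is part of the distribution itself is not undone by it.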

pntoiv commented 1 month ago

> Usually the correct back transformation is done via the link inverse function. That's why a reproducible example would be helpful to see your model specification, because I don't work with that package.

Here is the documentation for the log-normal distributions that gamlss uses: https://rdrr.io/cran/gamlss.dist/man/LNO.html

I used the LOGNO() distribution (and not LOGNO2), which is by default defined as LOGNO(mu.link = "identity", sigma.link = "log"). I think this mu.link is the problem here?
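To spell out why the identity link matters here: with it, mu is the mean of log(Y) rather than of Y, so predicted mu values can be negative even though Y is strictly positive. The relations below are the standard log-normal results, stated as background rather than taken from the gamlss documentation:

```latex
\log Y \sim \mathcal{N}(\mu, \sigma^2)
\quad\Longrightarrow\quad
\operatorname{median}(Y) = e^{\mu},
\qquad
\operatorname{E}(Y) = e^{\mu + \sigma^2/2}
```

So exponentiating a predicted mu gives the median of Y on its original scale, which is always positive.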

"The functions LOGNO and LOGNO2 define a gamlss.family distribution to fits the log-Normal distribution. The difference between them is that while LOGNO retains the original parametrization for mu, (identical to the normal distribution NO) and therefore \mu=(-\infty,+\infty), the function LOGNO2 use mu as the median, so \mu=(0,+\infty)."

strengejacke commented 1 month ago

OK, this should be fixed in https://github.com/easystats/insight/pull/899

library(gamlss)
#> Loading required package: splines
#> Loading required package: gamlss.data
#> 
#> Attaching package: 'gamlss.data'
#> The following object is masked from 'package:datasets':
#> 
#>     sleep
#> Loading required package: gamlss.dist
#> Loading required package: nlme
#> Loading required package: parallel
#>  **********   GAMLSS Version 5.4-22  **********
#> For more on GAMLSS look at https://www.gamlss.com/
#> Type gamlssNews() to see new features/changes/bug fixes.

library(ggeffects)
data(abdom)

m1 <- gamlss(y ~ x, family = LOGNO, data = abdom)
#> GAMLSS-RS iteration 1: Global Deviance = 5779.746 
#> GAMLSS-RS iteration 2: Global Deviance = 5779.746

m2 <- gamlss(log(y) ~ x, data = abdom)
#> GAMLSS-RS iteration 1: Global Deviance = -724.4265 
#> GAMLSS-RS iteration 2: Global Deviance = -724.4265


predict_response(m1)
#> $x
#> # Predicted values of y
#> 
#>  x | Predicted |         95% CI
#> -------------------------------
#> 10 |     84.83 |  82.80,  86.90
#> 15 |    109.78 | 107.75, 111.86
#> 20 |    142.09 | 140.11, 144.09
#> 25 |    183.89 | 181.88, 185.93
#> 30 |    238.00 | 235.36, 240.67
#> 35 |    308.03 | 303.62, 312.49
#> 40 |    398.66 | 391.07, 406.39
#> 45 |    515.95 | 503.37, 528.85
#> 
#> 
#> attr(,"class")
#> [1] "ggalleffects" "list"        
#> attr(,"model.name")
#> [1] "m1"

predict_response(m2)
#> Model has log-transformed response. Back-transforming predictions to
#>   original response scale. Standard errors are still on the transformed
#>   scale.
#> $x
#> # Predicted values of y
#> 
#>  x | Predicted |         95% CI
#> -------------------------------
#> 10 |     84.83 |  82.80,  86.90
#> 15 |    109.78 | 107.75, 111.86
#> 20 |    142.09 | 140.11, 144.09
#> 25 |    183.89 | 181.88, 185.93
#> 30 |    238.00 | 235.36, 240.67
#> 35 |    308.03 | 303.62, 312.49
#> 40 |    398.66 | 391.07, 406.39
#> 45 |    515.95 | 503.37, 528.85
#> 
#> 
#> attr(,"class")
#> [1] "ggalleffects" "list"        
#> attr(,"model.name")
#> [1] "m2"

Created on 2024-07-04 with reprex v2.1.0

strengejacke / ggeffects

Adjusted predictions different between log(Y) ~NO and Y~LOGNO #554