VickTaxiarchi commented 4 hours ago

Hi,

I run a simple model:

fit <- svyglm(outc ~ ns(year, df = 4) + cov1 + cov2, family=binomial(link="logit"), design=design_em)

with year ranging from 2010 to 2020.

My aim is to extract 1000 marginal predictions of the outcome within the range year. Ie marginal prediction of outcome when year = 2010, year = 2010.013, year=2010.26, etc. That is to create a plot of a smoother line of the trend of the outcome.

To do that:

I created a small loop to predict each of these values and save them in a data.frame. So I had 1000 predictions of outcome over time.

year_seq <- seq(from = 2010, to = 2020, length.out = 1000)

foreach(value = year_seq) %do% {

Generate the prediction for the current year

prediction <- predict_response(fit , terms = c(paste0("year [", value, "]")))

Assign the prediction to a dynamically named variable

assign(paste0("pred", value), prediction, envir = .GlobalEnv)

Add the current prediction and year to the data frame

predictions_df <- rbind(predictions_df, data.frame(Year = value, Prediction = prediction)) }

To validate the results, I also extracted the predictions of the exact years 2010, 2011, 2012, ... , 2020, as is the original variable in the data.

pred2020<-predict_response(fit, terms=c("year [2010:2020]"))

But the results differed.

Also also extracted up to 2019:

pred2019<-predict_response(fit, terms=c("year [2010:2019]"))

and these again differed to the data frame pred2020.

Therefore my questions are: (i) why the predictions differ (ii) how can I now which are the correct ones (iii) how can I extract the predictions in 1) correctly, since that is my main aim.

Many thanks Vicky

strengejacke commented 3 hours ago

Do you have a reproducible example? It's hard to find out where potential bug are without some more details and a toy example to investigate further.

VickTaxiarchi commented 2 hours ago

Fair enough. Please see attached test data and my code. I simplified it a bit and instead of length.out=1000 changed it to 20 for faster, but essentially should be the same.

Thank you in advance.

Best wishes Vicky

From: Daniel @.> Sent: 08 November 2024 11:18 To: strengejacke/ggeffects @.> Cc: Vicky Taxiarchi @.>; Author @.> Subject: Re: [strengejacke/ggeffects] Predictions extracted with predict_response are not consistent when changing the number of extracts (Issue #601)

Do you have a reproducible example? It's hard to find out where potential bug are without some more details and a toy example to investigate further.

— Reply to this email directly, view it on GitHub [github.com]https://urldefense.com/v3/__https:/github.com/strengejacke/ggeffects/issues/601*issuecomment-2464446786__;Iw!!PDiH4ENfjr2_Jw!F3Ie-529vdNUevVmfCOJg-X-KPI1b2wrhcWxboMRnWr4yqZSnS7f7XRpwxOekZYneUFEY-B3t_RvtZ1aDeaGWPqF53PhoX3yhF_Ijtc$, or unsubscribe [github.com]https://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AZXMCFYXH2EZG4EX7ZI4LDTZ7SMVBAVCNFSM6AAAAABRNE4EU6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINRUGQ2DMNZYGY__;!!PDiH4ENfjr2_Jw!F3Ie-529vdNUevVmfCOJg-X-KPI1b2wrhcWxboMRnWr4yqZSnS7f7XRpwxOekZYneUFEY-B3t_RvtZ1aDeaGWPqF53PhoX3yaCJ8SSw$. You are receiving this because you authored the thread.Message ID: @.**@.>>

strengejacke commented 1 hour ago

I think if you sent attachments via mail as reply to GitHub comments, it's not attached on the GitHub site :-)

Can you attach your files directly via the website?

VickTaxiarchi commented 1 hour ago

No prob. Code in the word file. my_data.csv Code for GitHub.docx

strengejacke commented 56 minutes ago

I see what you would like to achieve, more values for a smoother plot. There must be something wrong in your for-loop, but you can avoid this and add the sequence directly:

library(survey)
library(splines)
library(ggeffects)

my_data <- datawizard::data_read("~/../Downloads/my_data.csv")

design <- svydesign(ids = ~1, data = my_data, weights = ~weights)

fit <- svyglm(
  outc ~ ns(year, df = 4) + cov1 + cov2 + cov3, 
  family = binomial(), 
  design = design
)
#> Warning in eval(family$initialize): non-integer #successes in a binomial glm!

pred_df <- predict_response(fit, terms = "year [2010:2020 by=0.1]")
plot(pred_df)

or you create a vector and add that one to the terms argument:

values <- seq(2010, 2020, by = 0.1)
pred_df <- predict_response(fit, terms = "year [values]")
plot(pred_df)

strengejacke commented 55 minutes ago

To get proper axis breaks/labels, simply add a ggplot-layer:

plot(pred_df) + ggplot2::scale_x_continuous(breaks = seq(2010, 2020, 2))

VickTaxiarchi commented 33 minutes ago

Oh gee thank you. I was trying to find if there was such an option, could see it anywhere. Glad it was so simple.

Many thanks

Best wishes Vicky

From: Daniel @.> Sent: 08 November 2024 13:56 To: strengejacke/ggeffects @.> Cc: Vicky Taxiarchi @.>; Author @.> Subject: Re: [strengejacke/ggeffects] Predictions extracted with predict_response are not consistent when changing the number of extracts (Issue #601)

I see what you would like to achieve, more values for a smoother plot. There must be something wrong in your for-loop, but you can avoid this and add the sequence directly:

library(survey)

library(splines)

library(ggeffects)

my_data <- datawizard::data_read("~/../Downloads/my_data.csv")

design <- svydesign(ids = ~1, data = my_data, weights = ~weights)

fit <- svyglm(

outc ~ ns(year, df = 4) + cov1 + cov2 + cov3,

family = binomial(),

design = design

)

> Warning in eval(family$initialize): non-integer #successes in a binomial glm!

pred_df <- predict_response(fit, terms = "year [2010:2020 by=0.1]")

plot(pred_df)

[Image removed by sender.][camo.githubusercontent.com]https://urldefense.com/v3/__https:/camo.githubusercontent.com/12e0f6fd87634c598736162afd14f15207b19a46f7d3203dc36daa34925b83fd/68747470733a2f2f692e696d6775722e636f6d2f474f644a5647652e706e67__;!!PDiH4ENfjr2_Jw!Ale4UQFnS2_JsyIDPbZip0nu1rcImKZSZPsdvL3YU_7m_mS5YiC7vv6UrjDEqvTLYbZgPC_9wlksmJkqsakwxod3Rl07Y2dPmKObay4$

or you create a vector and add that one to the terms argument:

values <- seq(2010, 2020, by = 0.1)

pred_df <- predict_response(fit, terms = "year [values]")

plot(pred_df)

[Image removed by sender.][camo.githubusercontent.com]https://urldefense.com/v3/__https:/camo.githubusercontent.com/4efcc5da1a362ca4f05c421539c5353913a76fea44455ab73c38e06777e3487f/68747470733a2f2f692e696d6775722e636f6d2f376d534448336d2e706e67__;!!PDiH4ENfjr2_Jw!Ale4UQFnS2_JsyIDPbZip0nu1rcImKZSZPsdvL3YU_7m_mS5YiC7vv6UrjDEqvTLYbZgPC_9wlksmJkqsakwxod3Rl07Y2dPtshg40I$

— Reply to this email directly, view it on GitHub [github.com]https://urldefense.com/v3/__https:/github.com/strengejacke/ggeffects/issues/601*issuecomment-2464827986__;Iw!!PDiH4ENfjr2_Jw!Ale4UQFnS2_JsyIDPbZip0nu1rcImKZSZPsdvL3YU_7m_mS5YiC7vv6UrjDEqvTLYbZgPC_9wlksmJkqsakwxod3Rl07Y2dPO8mzLEc$, or unsubscribe [github.com]https://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AZXMCF6D5AVT4LRLDZ575BLZ7S7HBAVCNFSM6AAAAABRNE4EU6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINRUHAZDOOJYGY__;!!PDiH4ENfjr2_Jw!Ale4UQFnS2_JsyIDPbZip0nu1rcImKZSZPsdvL3YU_7m_mS5YiC7vv6UrjDEqvTLYbZgPC_9wlksmJkqsakwxod3Rl07Y2dP_kourH0$. You are receiving this because you authored the thread.Message ID: @.***>

strengejacke commented 17 minutes ago

Great, good to see this solves your request!

strengejacke / ggeffects

Predictions extracted with predict_response are not consistent when changing the number of extracts #601

Generate the prediction for the current year

Assign the prediction to a dynamically named variable

Add the current prediction and year to the data frame

> Warning in eval(family$initialize): non-integer #successes in a binomial glm!