bgreenwell / investr

Inverse estimation in R

Error inserting new data for multiple linear regression #43

Closed denistanjingyu closed 2 years ago

denistanjingyu commented 3 years ago

Error message:

Error in invest.lm(model, y0 = 92, interval = "Wald", x0.name = "DO_nit1",  : 
  'newdata' must contain a column for each predictor variable used by model (except DO_nit1)

Code for invest function:

invest(model, y0 = 92, 
       interval = "Wald",
       x0.name = "DO_nit1", 
       newdata = data.frame(dehydration_filtrate_input_flow_rate = 1,
                            return_sludge_flow_rate = 4,
                            methanol = 600,
                            salt_iron = 20,
                            ORP_den_1 = -385,
                            ORP_den_2 = -31,
                            pH_den = 7.42,
                            T_den = 35.9,
                            ORP_nit1 = -17,
                            pH_nit1 = 7.2,
                            T_nit1 = 37.3,
                            MLSS_nit2 = 11000,
                            MLVSS_nit2 = 7500,
                            ORP_nit2 = 2,
                            pH_nit2_1 = 6.88,
                            pH_nit2_2 = 6.83,
                            T_nit2 = 37.7,
                            MLSS_llat = 12600,
                            MLVSS_llat = 8600))

For newdata, I tried both slicing (taking the first row of the columns used in the model, except x0) and entering the values manually (as above), but the error message still appears. Please assist, thank you!

bgreenwell commented 3 years ago

Hi @denistanjingyu, thanks for sharing your issue. Do you have a reproducible example I can run on my end? That would make finding the issue much easier.

denistanjingyu commented 3 years ago

Hi Brandon,

Thanks for the prompt reply. I have attached the R script and a CSV file in the email to go with it. Appreciate your help.

Best regards, Denis



bgreenwell commented 3 years ago

Hi @denistanjingyu, the issue is that invest() is checking against the names of all the predictors provided in the formula (which includes those that you try to exclude via the - operator). This should be an easy fix on my end, but not sure when I'd be able to get to it. A simple workaround would be to provide the data via the data argument, but exclude the predictors not used in the model, for example:


cols <- c("DO_nit1", names(newdf))
invest(model, y0 = 92, interval = "Wald", x0.name = "DO_nit1",  newdata = newdf, data = TN_dataset[, cols])
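
The mismatch described above can be reproduced with base R alone. The sketch below is illustrative only: the toy data frame and its column names (y, x1, x2, x3) are hypothetical, and it does not reproduce invest()'s internal check. It only shows that the variable names read off a formula written with `.` and `-` still include the excluded predictor, while the fitted model's term labels do not.

```r
# Hypothetical toy data; the column names are made up for illustration.
d <- data.frame(y  = c(1.1, 2.0, 2.9, 4.2, 5.1, 5.8),
                x1 = c(1, 2, 3, 4, 5, 6),
                x2 = c(0.5, 0.1, 0.9, 0.3, 0.7, 0.2),
                x3 = c(2, 1, 4, 3, 6, 5))

# Fit on all columns except x3, using the "-" operator.
fit <- lm(y ~ . - x3, data = d)

# Every variable named on the right-hand side of the stored formula;
# the excluded x3 is still among them.
rhs_vars <- all.vars(formula(fit)[[3]])

# Only the terms that actually entered the fitted model.
used_terms <- attr(terms(fit), "term.labels")

# Variables a name-based check would ask for, although the model ignores them.
setdiff(rhs_vars, used_terms)
```

A check built on names like `rhs_vars` would demand an x3 column in newdata. Restricting the data to the columns actually used, as in the workaround above, avoids the mismatch.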

denistanjingyu commented 3 years ago

Thank you for the workaround. May I know whether it's possible to bootstrap for multiple linear regression? All my simulation runs failed.

bgreenwell commented 3 years ago

Certainly the methodology could be extended to the multiple-predictor case, but I did not code it that way (bootstrap came first in the package). How did the simulation runs fail? Were they unable to find a point estimate in the search window? Start by turning off confidence interval computation.
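
To make the "point estimate in the search window" question concrete, here is a minimal base-R sketch of inverse estimation as a root-finding problem. It is not investr's implementation: the synthetic data, the helper names `gap` and `find_x0`, and the choice of the observed range of x1 as the search window are all assumptions made for illustration.

```r
# Synthetic data (hypothetical); x1 is the predictor to be estimated.
set.seed(101)
d <- data.frame(x1 = runif(50, 0, 10), x2 = rnorm(50))
d$y <- 2 + 3 * d$x1 - 1.5 * d$x2 + rnorm(50, sd = 0.5)
fit <- lm(y ~ x1 + x2, data = d)

# Difference between the fitted mean response at x1 = x0 and the target y0,
# holding the other predictor fixed at x2 = 0.5.
gap <- function(x0, y0) {
  as.numeric(predict(fit, newdata = data.frame(x1 = x0, x2 = 0.5))) - y0
}

# Search the observed range of x1 for the root of gap(). uniroot() raises an
# error when gap() has the same sign at both ends of the window, which is
# reported here as NA instead of stopping.
find_x0 <- function(y0, window = range(d$x1)) {
  tryCatch(uniroot(gap, interval = window, y0 = y0)$root,
           error = function(e) NA_real_)
}

est     <- find_x0(20)    # target reachable inside the window
no_root <- find_x0(1000)  # target far outside the fitted range
```

In this sketch, a target response that the fitted model cannot reach inside the window gives NA. That is the kind of failure worth ruling out for each simulated data set before looking at the interval computation.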