Open vincentarelbundock opened 3 years ago
Your initial issue should be fixed:
library(insight)
library(sampleSelection)
#> Loading required package: maxLik
#> Loading required package: miscTools
#>
#> Please cite the 'maxLik' package as:
#> Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.
#>
#> If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
#> https://r-forge.r-project.org/projects/maxlik/
data(Mroz87, package = "sampleSelection")
Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)
model <- heckit(
lfp ~ age + I(age^2) + faminc + kids + educ,
wage ~ exper + I(exper^2) + educ + city,
data = Mroz87)
find_variables(model)
#> $response
#> [1] "lfp" "wage"
#>
#> $conditional
#> $conditional$selection
#> [1] "age" "faminc" "kids" "educ"
#>
#> $conditional$outcome
#> [1] "exper" "educ" "city"
Created on 2021-09-14 by the reprex package (v2.0.1)
But what would you suggest according to the response? Return a list of two as well?
library(insight)
library(sampleSelection)
#> Loading required package: maxLik
#> Loading required package: miscTools
#>
#> Please cite the 'maxLik' package as:
#> Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.
#>
#> If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
#> https://r-forge.r-project.org/projects/maxlik/
data(Mroz87, package = "sampleSelection")
Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)
model <- heckit(
lfp ~ age + I(age^2) + faminc + kids + educ,
wage ~ exper + I(exper^2) + educ + city,
data = Mroz87)
# could be a list with "$selection" and "$outomce"
find_response(model)
#> [1] "lfp" "wage"
Created on 2021-09-14 by the reprex package (v2.0.1)
Thanks for the fix! (I don't know why I'm still surprised/impressed by your efficiency, but I am...)
Your question about find_response
is a very good one. The list option would be more informative, but the current documentation says:
Value:
The name(s) of the response variable(s) from ‘x’ as character vector, or ‘NULL’ if response variable could not be found.
The plural already accommodates multiple response variables, so this would require no change, whereas the list might break code that expects a vector. So for backward compatibility reasons, I would lean toward keeping the vector, but both options are probably fine.
I would suggest a named list of 2 for the response here. The documentation as written I think was intended more for something like an mlm
, where there are multiple undifferentiated responses
Sounds good. Just so I know what to expect, you are proposing a list for all models, not just for those with many responses?
Or are you suggesting a string in most cases and a list in cases like sampleSelection
.
The first option is a breaking change but might be better in the future for consistency. The second is non-breaking, but it's not great in terms of consistency because users have to check every time if they get a string or list.
Edit: meant a string not a vector
Actually, I think we have a named vector for find_response()
and could use a list as list-element for find_variables()
(i.e. having $response$selection
and $response$outcome
)
A possible bug: I could not use
find_variables
on a model produced bysampleSelection::heckit
orsampleSelection::selection
. Example: