jmboehm / GLFixedEffectModels.jl

Fast estimation of generalized linear models with high dimensional categorical variables in Julia

Method Error when calling nlreg() on DataFrame object #58

Closed peteratkatchenko closed 2 months ago

peteratkatchenko commented 4 months ago

Hello,

I recently attempted to call the nlreg() function on a simplified version of a larger model, but I encountered a puzzling error message.

The function call:

nlreg(extensive_counts_1, @formula(patents_count ~ binary_own), Poisson(), LogLink())

The error message:

ERROR: MethodError: no method matching (Vector{<:AbstractVector})(::Vector{Any})
Stacktrace:
 [1] convert(::Type{Vector{<:AbstractVector}}, a::Vector{Any})
   @ Base .\array.jl:665

'extensive_counts_1' is a DataFrame, and its columns 'patents_count' and 'binary_own' are of type Int64. My first guess is that the code is having trouble with 'missing' values in the data frame. Please let me know your thoughts!

jmboehm commented 4 months ago

Can you construct a minimal working example to reproduce this error?

peteratkatchenko commented 4 months ago

The following is a self-contained example that reproduces the same error message:

using DataFrames, Distributions, GLM, GLFixedEffectModels

patents_count = rand(Poisson(50), 100)

binary_own = rand([0, 1], 100)

example = DataFrame(patents_count = patents_count, binary_own = binary_own)

nlreg(example, @formula(patents_count ~ binary_own), Poisson(), LogLink())

And I receive the following (truncated) error message:

ERROR: MethodError: no method matching (Vector{<:AbstractVector})(::Vector{Any})
Stacktrace:
 [1] convert(::Type{Vector{<:AbstractVector}}, a::Vector{Any})
   @ Base .\array.jl:665

jmboehm commented 4 months ago

Perhaps this is just because you haven't specified a fixed effect? For models without fixed effects I'd recommend using GLM.jl instead. But a nicer error message would definitely be an improvement here.
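
For anyone landing here with the same error, here is a minimal sketch of the no-fixed-effects route suggested above. It assumes GLM.jl, DataFrames.jl and Distributions.jl are installed; the random seed and the variable name `model` are illustrative choices, not taken from the thread.

```julia
using DataFrames, Distributions, GLM, Random

# Fixed seed so the simulated data are reproducible (illustrative choice).
Random.seed!(1)

# Same simulated data as in the example above.
example = DataFrame(
    patents_count = rand(Poisson(50), 100),
    binary_own = rand([0, 1], 100),
)

# Poisson regression with a log link and no fixed effects, fitted with GLM.jl.
# Note the argument order: formula first, then the data frame.
model = glm(@formula(patents_count ~ binary_own), example, Poisson(), LogLink())

# Print the coefficient table (intercept and binary_own).
println(coeftable(model))
```

If the model does include a fixed effect, the corresponding nlreg() call takes an fe(...) term in the formula, for example patents_count ~ binary_own + fe(firm_id), where firm_id stands for a hypothetical categorical column that is not part of the example data.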

peteratkatchenko commented 4 months ago

I tried running the model with fixed effects specified and the error mentioned above did indeed go away. But now I receive an error saying "IWLS Weights are not finite. Possible reason is separation." This error persists independent of the variable that I specify as the fixed effect. Do you have any advice for dealing with this?

jmboehm commented 4 months ago

The likely explanation is statistical separation, see e.g. this link. If you understand where the problem is coming from, it's usually possible to get useful estimates of the parameters of interest using the separation option (and related ones). But this is very specific to your data and problem. The truth is that estimating nonlinear FE models is often not a case of "fire and forget" like with linear models.
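
As a rough first diagnostic, one common source of separation in Poisson models with fixed effects is a fixed-effect level whose outcomes are all zero, since the corresponding fixed effect then has no finite estimate. The sketch below uses plain Julia to list such levels; the function name `all_zero_groups` and the toy data are hypothetical, and the check covers only this simple case, not separation in general. It assumes the outcome has no missing values.

```julia
# Return the fixed-effect levels whose outcomes are all zero (sorted).
# `y` is the count outcome, `group` the fixed-effect identifier per observation.
function all_zero_groups(y::AbstractVector{<:Real}, group::AbstractVector)
    length(y) == length(group) ||
        throw(ArgumentError("y and group must have equal length"))
    # For each level, record whether any strictly positive outcome was seen.
    has_positive = Dict{eltype(group), Bool}()
    for (yi, gi) in zip(y, group)
        has_positive[gi] = get(has_positive, gi, false) || yi > 0
    end
    return sort!([g for (g, pos) in has_positive if !pos])
end

# Toy data: levels "a" and "c" only ever have zero counts.
y     = [0, 0, 3, 1, 0, 0, 2]
group = ["a", "a", "b", "b", "c", "c", "b"]

println(all_zero_groups(y, group))
```

Whether to drop or otherwise handle the levels flagged this way depends on the data and the question being asked, as noted above.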

peteratkatchenko commented 4 months ago

Thank you!

jmboehm commented 2 months ago

#60 implements an error message for the case where no fixed effect is specified.