const-ae / glmGamPoi

Fit Gamma-Poisson Generalized Linear Models Reliably

Design with `NA` values causes nondescriptive error message #26

Closed lysogeny closed 3 years ago

lysogeny commented 3 years ago

Hey Constantin,

I think I found a small bug in your package.

Using column data that contains NA values in a design throws an error with the message: "Number of rows in col_data does not match number of columns of data"

As an example:

library("tidyverse")
library("glmGamPoi")
d <- matrix(rpois(5*6, 10), ncol=5)
d.col <- data.frame(var=c(1,2,3,4,NA))
glm_gp(design=~var, col_data=d.col, data=d)

Throws

Error in handle_design_parameter(design, data, col_data, reference_level, : Number of rows in col_data does not match number of columns of data
Traceback:

1. glm_gp(design = ~var, col_data = d.col, data = d)
2. handle_design_parameter(design, data, col_data, reference_level, 
 .     offset)
3. stop("Number of rows in col_data does not match number of columns of data")

Obviously this should not happen as we have 5 rows in the data.frame and 5 columns in the matrix. This can be fixed by removing the NA entries:

glm_gp(design=~var, col_data=d.col[!is.na(d.col$var),,drop=FALSE], data=d[,!is.na(d.col$var)])

Returns

glmGamPoiFit object:
The data had 6 rows and 4 columns.
A model with 2 coefficient was fitted.
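
A possible explanation for the misleading message, sketched in base R only: with the default `na.action` of a standard R session (`na.omit`), `model.matrix()` silently drops rows that contain `NA`, so the design matrix ends up with fewer rows than the data has columns. Whether `glm_gp()` builds its design matrix exactly this way is an assumption, not something confirmed in this thread. The names `keep`, `d_clean` and `d.col_clean` are made up for illustration.

```r
# Base R only; glmGamPoi is not called here.
d <- matrix(rpois(5 * 6, 10), ncol = 5)
d.col <- data.frame(var = c(1, 2, 3, 4, NA))

# With the default na.action (na.omit), the row with NA is dropped.
mm <- model.matrix(~ var, data = d.col)
nrow(mm)  # fewer rows than ncol(d)

# Workaround: keep only samples with complete covariates,
# and subset the count matrix with the same index.
keep <- complete.cases(d.col)
d.col_clean <- d.col[keep, , drop = FALSE]
d_clean <- d[, keep, drop = FALSE]
```

Using one logical index for both objects keeps the rows of `col_data` and the columns of `data` aligned, which is the same thing the two `!is.na(d.col$var)` subsets above do.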

const-ae commented 3 years ago

Hi Jooa,

thanks for the report and sorry for the delay, I was on holiday for the last two weeks.

You are right, the original error message was rather confusing. I have fixed the issue, and the function now returns a more helpful error message:

library("glmGamPoi")
d <- matrix(rpois(5*6, 10), ncol=5)
d.col <- data.frame(var=c(1,2,3,4,NA))
glm_gp(design=~var, col_data=d.col, data=d)
#> Error in handle_design_parameter(design, data, col_data, reference_level): The design matrix contains 'NA's for sample 5. Please remove them before you call 'glm_gp()'.

Created on 2021-08-06 by the reprex package (v2.0.0)

Best, Constantin
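
For anyone who wants to find the affected samples before calling `glm_gp()`, a minimal sketch of a pre-check follows. `samples_with_na()` is a hypothetical helper written for this example and is not part of glmGamPoi. It assumes that the design is a formula and that its variables are columns of `col_data`.

```r
# Hypothetical helper, not part of glmGamPoi.
# Returns the indices of samples with NA in any variable used by the design.
samples_with_na <- function(design, col_data) {
  vars <- intersect(all.vars(design), colnames(col_data))
  if (length(vars) == 0) {
    return(integer(0))  # e.g. an intercept-only design such as ~ 1
  }
  which(!complete.cases(col_data[, vars, drop = FALSE]))
}

d.col <- data.frame(var = c(1, 2, 3, 4, NA))
bad <- samples_with_na(~ var, d.col)
if (length(bad) > 0) {
  message("Samples with NA in the design: ", paste(bad, collapse = ", "))
}
```

Only the columns named in the formula are checked, so `NA` values in unrelated columns of `col_data` do not flag a sample.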