`lm()` produces NA coefficients when the `xselect` matrix has more cells (features) than genes (samples). This is probably unlikely for real datasets, but for simulated datasets of size 200x2000 the rank-deficient OLS fit leads to excess NAs in the imputed count matrix. In such cases I now use the LASSO fit to make predictions, which fixes the NA issue.
In addition, `predict()` is less error-prone than adding `1` as a new column, extracting the coefficients, and then using matrix multiplication, so I used `predict()` for OLS as well.
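A minimal base-R sketch of both points, assuming made-up dimensions (20 samples, 50 features) and random data rather than the package's actual `xselect` matrix; all variable names here are illustrative only:

```r
set.seed(1)
n <- 20   # samples (genes, in the terms used above)
p <- 50   # features (cells); p > n makes the OLS fit rank deficient

# Hypothetical training data: columns V1..V50 plus a response y
train <- as.data.frame(matrix(rnorm(n * p), nrow = n))
train$y <- rnorm(n)

fit <- lm(y ~ ., data = train)

# Coefficients that cannot be estimated are reported as NA
n_na <- sum(is.na(coef(fit)))

# Hypothetical new observations with the same column names
new_x <- as.data.frame(matrix(rnorm(5 * p), nrow = 5))

# Manual route: prepend a column of 1s and multiply by the coefficients.
# Every product involves NA coefficients, so every prediction is NA.
manual <- cbind(1, as.matrix(new_x)) %*% coef(fit)

# predict() uses only the estimable coefficients; it warns about the
# rank-deficient fit, so the warning is suppressed here for brevity.
pred <- suppressWarnings(predict(fit, newdata = new_x))

print(n_na)
print(anyNA(manual))
print(anyNA(pred))
```

`predict()` avoids the NAs here, but predictions from a rank-deficient fit can still be unreliable, which is why the LASSO fit is used for that case.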
Hello, thanks for your suggestion! Since the package has been updated, it no longer relies on LASSO. However, we do use the `predict()` function in the new release.