projectglow / glow

An open-source toolkit for large-scale genomic analysis
https://projectglow.io
Apache License 2.0
Move alpha into grouping expression for ridge regression fit #289

Open karenfeng opened 3 years ago

karenfeng commented 3 years ago

During RidgeRegression.fit(), we group on ['sample_block', 'label'] but not on 'alpha'. Leaving 'alpha' out of the grouping expression makes each group larger than it needs to be, and this limits our scalability because PyArrow constrains the number of 8-byte float values in a vector to 132,152,839; see the limits described in https://github.com/projectglow/glow/pull/282.
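
To illustrate the effect of the grouping keys on group size, here is a minimal pure-Python sketch. It is not Glow's code: the toy dimensions, the `rows` layout (one `alpha` value per row), and the helper `max_group_size` are all hypothetical, chosen only to show that adding 'alpha' to the keys shrinks the largest group by a factor equal to the number of alphas.

```python
from collections import Counter
from itertools import product

# Hypothetical toy dimensions; real workloads are far larger.
sample_blocks = ["1", "2"]
labels = ["trait_a", "trait_b"]
alphas = ["alpha_0", "alpha_1", "alpha_2"]
rows_per_combination = 4  # assumed rows per (sample_block, label, alpha)

# Assumed layout: every row carries its own sample_block, label, and alpha.
rows = [
    {"sample_block": b, "label": l, "alpha": a, "row": i}
    for b, l, a in product(sample_blocks, labels, alphas)
    for i in range(rows_per_combination)
]


def max_group_size(rows, keys):
    """Return the row count of the largest group formed by the given keys."""
    counts = Counter(tuple(r[k] for k in keys) for r in rows)
    return max(counts.values())


# Grouping as described in this issue: 'alpha' is not a key.
current = max_group_size(rows, ["sample_block", "label"])
# Proposed grouping: 'alpha' is part of the key.
proposed = max_group_size(rows, ["sample_block", "label", "alpha"])

print(current, proposed)  # prints: 12 4
```

Under these assumptions, each group handed to a single worker is three times smaller with 'alpha' in the key, which is the headroom this issue is asking for relative to the PyArrow vector limit.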

henrydavidge commented 3 years ago

@karenfeng did we end up doing this one?

karenfeng commented 3 years ago

No, I believe this is unchanged: https://github.com/projectglow/glow/blob/a63306ee307e9209bce0ea3f7f6d74d60bb8ab32/python/glow/wgr/linear_model/ridge_model.py#L217