Closed chinandrew closed 3 years ago
Hm, I always intended n_success
and n_trial
to be numpy vectors and am not even sure if the Gibbs sampler would run correctly if they are scipy sparse matrices. (I document their types at a higher-level in RegressionModel
but was lazy to document at lower levels.)
Anyways, I totally agree that the use of len()
is an obstacle in trying out cupy
and is a plain bad style. I find using .shape[0]
using a one-dimensional array a bit confusing, so maybe .size
?
Hm, I always intended
n_success
andn_trial
to be numpy vectors and am not even sure if the Gibbs sampler would run correctly if they are scipy sparse matrices. (I document their types at a higher-level inRegressionModel
but was lazy to document at lower levels.)Anyways, I totally agree that the use of
len()
is an obstacle in trying outcupy
and is a plain bad style. I find using.shape[0]
using a one-dimensional array a bit confusing, so maybe.size
?
Oh I see what I did. So I didn't actually hit this when running cupy tests, but rather when trying to use the gi_bleed data on the master branch, since y, X = dm.read_ohdsi_data()
makes y into a sparse matrix, and len(y)
fails. What I missed was the line y = np.squeeze(y.toarray())
that came right after that in the notebook that converts it to a numpy array. No wonder I was hitting so many issues :facepalm: . I'll close this for now; we can make changes for gpu support as needed.
y, X = dm.read_ohdsi_data() makes y into a sparse matrix, and len(y) fails. What I missed was the line y = np.squeeze(y.toarray()) that came right after that in the notebook that converts it to a numpy array
Ah, sorry you had to suffer from my crappy code. I can't remember, but I don't see any good reason why dm.read_ohdsi_data()
should return a sparse matrix.
len()
doesn't work with scipy sparse matrices (it raisesTypeError: sparse matrix length is ambiguous; use getnnz() or shape[0]
), so changing to use shape[0].This just affects the
logistic_model
class for now. I think some of the other models may hit this issue as well, but I don't have a great sense of where they're potentially using sparse matrices or not. I can also just convert everylen()
in themodel/
folder if that's safe to do.