rogershijin closed 1 year ago
Hey @rogershijin, thanks for the feedback. No particular reason, but that raises an interesting question: should it be n_mod or n_obs as the length?
I would expect len(object) to match object.shape[0], and currently the shape of MuData is (n_obs, n_var).
Thanks for getting back to me! Yeah, I think for me it is also more intuitive for this to be n_obs.
@gtca I'm not sure it makes sense for something to have __len__ without __iter__. Also, defining __len__ implicitly defines __bool__.
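To make the __bool__ point concrete: when a class defines __len__ but not __bool__, Python uses the length for truth testing, so an object of length 0 is falsy. A minimal sketch, using a hypothetical OnlyLen class rather than MuData itself:

```python
class OnlyLen:
    """Hypothetical class defining __len__ but neither __bool__ nor __iter__."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n


empty = OnlyLen(0)
full = OnlyLen(5)

# With no __bool__ defined, truthiness falls back to __len__.
empty_is_truthy = bool(empty)  # False, because len(empty) == 0
full_is_truthy = bool(full)    # True, because len(full) > 0

# Defining __len__ alone does not make the object iterable.
try:
    iter(full)
    is_iterable = True
except TypeError:
    is_iterable = False
```

One consequence is that a check such as `if obj:` silently changes meaning once __len__ is added, which is worth keeping in mind for any container that can hold zero observations.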
@ivirshup maybe, but see anndata.
We can add __iter__ here as well. Is there anything special with the way anndata does it? I don't see __iter__ there per se.
I'm not sure anndata should have len either 😅
I think it needs some consideration of why you want the length, and what it should be consistent with. I do think that if it were defined, n_obs makes the most sense, but accessing n_obs or shape seems like probably the better approach.
I think it's ok to expect objects that have .shape to have their length defined as .shape[0].
Practically, for workflows in the scanpy/muon world, one should rather use .n_obs, as explicit is better than implicit.
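The convention described here can be sketched as follows. ShapedContainer is a hypothetical stand-in with an (n_obs, n_var) shape; it is an illustration of the idea, not the MuData implementation:

```python
class ShapedContainer:
    """Hypothetical object with an (n_obs, n_var) shape; not the real MuData class."""

    def __init__(self, n_obs, n_var):
        self.n_obs = n_obs
        self.n_var = n_var

    @property
    def shape(self):
        return (self.n_obs, self.n_var)

    def __len__(self):
        # Length is defined as the first dimension of the shape.
        return self.shape[0]


data = ShapedContainer(n_obs=100, n_var=2000)

implicit = len(data)   # relies on the len == shape[0] convention
explicit = data.n_obs  # states directly which dimension is meant
```

Both expressions give the same number; the explicit attribute just leaves no doubt about which axis is being counted.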
Thanks for adding __len__! I should've mentioned this in the issue description, but on the practical side I was originally motivated to post this because I couldn't use a PyTorch dataloader to batch load a MuData object without __len__ being defined.
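For context, PyTorch's map-style datasets are objects that implement __getitem__ and __len__, which is why a missing __len__ gets in the way of batching. The sketch below shows that protocol without importing PyTorch. ObsDataset and ToyData are hypothetical names, and the wrapper assumes the wrapped object exposes .n_obs and supports integer indexing along observations, which may not match how MuData actually behaves:

```python
class ObsDataset:
    """Hypothetical map-style wrapper exposing __len__ and __getitem__."""

    def __init__(self, data):
        # Assumption: `data` has .n_obs and can be indexed by an integer.
        self.data = data

    def __len__(self):
        return self.data.n_obs

    def __getitem__(self, idx):
        return self.data[idx]


class ToyData:
    """Toy stand-in used only to make this sketch runnable."""

    def __init__(self, rows):
        self.rows = rows
        self.n_obs = len(rows)

    def __getitem__(self, idx):
        return self.rows[idx]


dataset = ObsDataset(ToyData([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))

# Manual batching that relies only on __len__ and __getitem__.
batch_size = 2
batches = [
    [dataset[i] for i in range(start, min(start + batch_size, len(dataset)))]
    for start in range(0, len(dataset), batch_size)
]
```

A wrapper like this could in principle be handed to a data loader in place of the raw object, though how the samples should be collated into tensors is a separate question that this sketch does not address.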
Is there a reason MuData doesn't support len? And if not, would you consider adding some subclass that does? For example,
Thanks a lot!