Closed MarcAntoineSchmidtQC closed 7 months ago
Yes, I strongly approve of having a clearly documented API that is consistent between the various MatrixBase classes! Thanks for pointing this out, Marc.
Part of the task here seems like a question of where to enumerate this. In the docstrings of the MatrixBase class? Somewhere else?
Should we define the more basic
__mul__
and__rmul__
?
Yes, please! With a consistent API across subtypes. At the moment, *
performs element-wise multiplication for dense matrices (NumPy style), matrix multiplication for sparse matrices (SciPy) and failed multiplication for categorical matrices (my own).
Marc and I talked a couple days ago about some indexing issues where the capabilities of DenseMatrix greatly exceeded that of other matrix types because the underlying numpy arrays have a lot of features. It seems like this will be the case for many operations. Would it make sense to explicitly forbid a lot of DenseMatrix operations? With an error message something like "That operation is not supported by quantcore.matrix. Please access the underlying numpy array if you want to perform advanced numpy operations." That would dramatically reduce our API surface area in some places where I'd prefer not to have to implement tons of stuff for e.g. CategoricalMatrix and SplitMatrix.
To my mind, it's okay if one subtype has a fuller API than the others. I only mentioned __mul__
(and indexing elsewhere) because it seems pretty elementary to me. It comes up naturally if you want to, say, create interactions between columns of a SplitMatrix
.
I think we should distinguish between a fuller API and an inconsistent API. Fuller is fine but inconsistent is not. Does that seem reasonable? And like the title of this issue is saying, we should also identify the minimal API.
Just wanted to summarize. I don't think this is disagreeing with any of the above posts. =)
Going through old issues and found this one. This issue will be solved by #286.
A cleanup would probably help us identify low-hanging fruits.
For instance, should we define
__matmul__
for all the classes? Should we define the more basic__mul__
and__rmul__
?I would also like to standardize all our
__repr__
methods so that a user knows quickly that these objects are coming from the same package.