data-apis / array-api

RFC document, tooling and other content related to the array API standard
https://data-apis.github.io/array-api/latest/
MIT License
214 stars 44 forks source link

linalg.trace should just support square inputs #359

Closed lezcano closed 2 years ago

lezcano commented 2 years ago

While reviewing the implementation of torch.linalg.trace https://github.com/pytorch/pytorch/pull/62714#discussion_r778833069, we realised that linalg.trace supports non-square matrices.

This should not be the case. For example, the whole wikipedia page for trace always assumes that the matrix is square.

For the abstract reasons for why don't we want to support rectangular matrices I'd say that, even in its most abstract definition on symmetric monoidal categories, the trace is just defined for endomorphisms, that is, square matrices when C = Finite dimensional vector spaces.

In summary, even though we could support this, I don't think we should.

kgryte commented 2 years ago

The current behavior of linalg.trace is consistent with linalg.diagonal, and, personally, I can sympathize with (non-mathematically inclined) expectations that the APIs and behavior for these two functions match.

We already expand on the definition of the trace by supporting an offset kwarg and thus computing a sum along diagonals other than the main diagonal.

Accordingly, while generally I favor aligning with mathematical definitions, given the widespread support for NumPy's API, I'm in favor of leaving the spec as is.

lezcano commented 2 years ago

Since this behaviour may be achieved with linalg.diagonal + sum, I think it should be left to that and keep linalg.trace to be defined as it is in mathematics so that libraries are allowed to throw an error message for non-square matrices if they choose to do so.

Note that this does not disallow different libraries to provide extensions that support the non-square behaviour. Now, not forcing them to implement it would be great.

As a side note, I'd be interested in knowing of an example where the offset kw is used. Otherwise, in my opinion, it'd be best to keep it out of the API as well.

kgryte commented 2 years ago

This was discussed during the most recent consortium meeting (1-6-2021).

Consensus there was that the existing API should remain as is given that restricting the API to square matrices and removing support for the offset keyword argument may result in backward incompatible changes for those libraries currently supporting such behavior (essentially all libraries except PyTorch allow non-square matrices and all except PyTorch and TF support offset).

While NumPy could introduce such changes in a dedicated array_api namespace, other libraries may be forced to make changes in their main namespace, especially if they are aiming to exactly mirror NumPy's array API namespace in their main namespace.

In general, the specification tries to avoid introducing backward incompatible changes unless there is a very compelling reason to do so, and, in this case, mathematical purity was not a sufficiently compelling reason.

Will close out...

lezcano commented 2 years ago

Specifying the function just for square inputs doesn't make the libraries do anything (let that be implement support or throw an error), right? In that case, I don't see how this proposal would incur in BC-breaking errors.

rgommers commented 2 years ago

In that case, I don't see how this proposal would incur in BC-breaking errors.

The point was that there are already two libraries with shipped releases containing trace. E.g., https://github.com/numpy/numpy/blob/main/numpy/array_api/linalg.py#L363.

Since this is a minor issue either way, the decision to not change it anymore makes sense. If this was brought up 4 months ago, it'd be a story purely about pros and cons of a new design, now it is not. You actually reviewed trace 6 months ago (gh-215) and didn't bring up this issue then, so it's not that obvious it looks like.