invenia / Nabla.jl

A operator overloading, tape-based, reverse-mode AD
Other
68 stars 5 forks source link

Sensitivities for / and \ are incorrect for rectangular matrices #175

Open ararslan opened 5 years ago

ararslan commented 5 years ago

The sensitivities for / and \ seem to match the definitions for matrix inverse products in the Giles paper, and they work as tested on square matrices. However, the formulas in the paper assume that the matrix is invertible, which also implies that it's square. Julia itself does not impose such a restriction; / and \ are defined for rectangular matrices. The sensitivities for the matrix inverse product no longer apply in that case though:

julia> n, m = 3, 4;

julia> A = randn(m, n); B = randn(m, n); Ȳ = randn(size(A / B));

julia> check_errs(/, Ȳ, (A, B), (randn(size(A)), randn(size(B))))
ERROR: "</> allocated": large deviation from reference:
  relative error:  8.757e-01
    tolerance:     1.000e-07
  absolute error:  1.153e+02
    tolerance:     1.000e-10

which means we'll need to reformulate our sensitivities for / and \.

willtebbutt commented 5 years ago

I vote that we:

  1. Leave the Nabla rules for now, or maybe just throw an error if we ever encounter a non-square matrix.
  2. Implement this properly in ChainRules, by coping over the PR I just made to Zygote.
  3. Let the sensitivity in Nabla be fixed automatically when we upgrade to ChainRules.
ararslan commented 5 years ago

Sure, that seems fine to me, assuming we aren't hitting any important rectangular cases internally.