pzivich / Delicatessen

Delicatessen: the Python one-stop sandwich (variance) shop 🥪
https://deli.readthedocs.io/en/latest/index.html
MIT License

Automatic Differentiation #24

Closed pzivich closed 1 year ago

pzivich commented 1 year ago

This branch proposes an implementation of automatic differentiation to compute the bread, as mentioned in #23. While there are libraries, like JAX, that implement this, I do not want to add dependencies. Therefore, I have implemented a version of automatic differentiation.
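For illustration only, here is a minimal, self-contained sketch of forward-mode automatic differentiation via operator overloading (dual numbers). This is not the delicatessen implementation; the class and function names below are purely illustrative.

```python
class Dual:
    """Pair of a value (primal) and its derivative (tangent)."""
    def __init__(self, primal, tangent):
        self.primal = primal      # f(x)
        self.tangent = tangent    # f'(x)

    def _wrap(self, other):
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = self._wrap(other)
        return Dual(self.primal + other.primal, self.tangent + other.tangent)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._wrap(other)
        # Product rule: (fg)' = f'g + fg'
        return Dual(self.primal * other.primal,
                    self.tangent * other.primal + self.primal * other.tangent)

    __rmul__ = __mul__


def scalar_derivative(f, x):
    # Seed the tangent with 1.0 so the output carries df/dx
    return f(Dual(x, 1.0)).tangent


# d/dx of 3*x*x + 2*x at x = 4 is 6*4 + 2 = 26
print(scalar_derivative(lambda x: 3 * x * x + 2 * x, 4.0))  # 26.0
```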

Note that this branch is still under active development. I have not fully tested my implementation of computing the bread. Therefore, only use this branch for preliminary testing.

Note: I am planning to keep the numerical derivative approximation of the bread as the default. The exact derivatives will be an alternative option. Ultimately, any errors in the exact path should fall back to the approximation.
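A sketch of that intended fallback behavior, under assumed option and function names (the actual delicatessen API may differ; sign and scaling conventions for the bread are omitted):

```python
import numpy as np
from scipy.optimize import approx_fprime
# auto_differentiation as implemented in this branch (import path may differ)
from delicatessen.derivative import auto_differentiation

def bread_derivative(stacked_equations, theta, deriv_method="approx"):
    # Derivative of the summed estimating equations with respect to theta
    def internal_sum(t):
        return np.sum(stacked_equations(t), axis=1)

    if deriv_method == "exact":
        try:
            return auto_differentiation(theta, internal_sum)
        except Exception:
            pass  # any error in the exact path falls back to the approximation
    return approx_fprime(theta, internal_sum, epsilon=1e-9)
```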

pzivich commented 1 year ago

The addition of ee_glm might make the AD implementation a little more difficult. Specifically, the AD will need to handle the polygamma and digamma functions and be able to compute their derivatives.
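The derivative rules involved are standard: the derivative of digamma(x) is polygamma(1, x), and more generally the derivative of polygamma(n, x) is polygamma(n + 1, x). A quick numerical check with SciPy:

```python
import numpy as np
from scipy.special import digamma, polygamma

x, h = 2.5, 1e-6
# digamma'(x) == polygamma(1, x)
print(np.isclose((digamma(x + h) - digamma(x - h)) / (2 * h), polygamma(1, x)))          # True
# d/dx polygamma(2, x) == polygamma(3, x)
print(np.isclose((polygamma(2, x + h) - polygamma(2, x - h)) / (2 * h), polygamma(3, x)))  # True
```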

pzivich commented 1 year ago

Okay, here is how I think I will handle digamma and polygamma.

  1. Write an internal polygamma function. This function checks the input type. If it is a PrimalTangentPairs object, it calls the internal method on that object. Otherwise, it passes the input to the SciPy version of the polygamma function. Repeat for digamma.
  2. Add an internal polygamma method to PrimalTangentPairs that correctly creates each pair, following the syntax of SciPy's polygamma. Same for digamma.

This setup should handle the new ee_glm equations by routing the polygamma pieces to the correct places; a sketch of the setup is below.
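A rough sketch of what that dispatcher and pair method could look like (simplified stand-in code, not the actual delicatessen internals; attribute names are guesses):

```python
from scipy.special import polygamma as sp_polygamma

def polygamma(n, x):
    # Internal dispatcher: PrimalTangentPairs objects use their own polygamma
    # rule; everything else is passed to SciPy. Digamma can reuse this, since
    # digamma(x) == polygamma(0, x).
    if isinstance(x, PrimalTangentPairs):
        return x.polygamma(n)
    return sp_polygamma(n, x)

class PrimalTangentPairs:
    # Simplified stand-in for the pair class used by the AD
    def __init__(self, primal, tangent):
        self.primal = primal
        self.tangent = tangent

    def polygamma(self, n):
        # d/dx polygamma(n, x) = polygamma(n + 1, x), so the tangent follows
        # the chain rule: polygamma(n + 1, primal) * tangent
        return PrimalTangentPairs(sp_polygamma(n, self.primal),
                                  sp_polygamma(n + 1, self.primal) * self.tangent)
```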

The downside to this is that if users wanted to use the polygamma function themselves, they would need to use my internal version when pairing it with AD. Thankfully, this is all avoided when using the numerical approximation.

pzivich commented 1 year ago

Okay, I added polygamma, but in that process I discovered a bug (or at least some unexpected behavior). As indicated below, the automatic differentiation does not work if * is used. However, it does work with np.dot. This appears to be tied to the use of .T to transpose the resulting 1-by-n matrix. Something funky happens there that I have not figured out yet.

```python
import numpy as np
import pandas as pd
from scipy.optimize import approx_fprime
# auto_differentiation as implemented in this branch (import path may differ)
from delicatessen.derivative import auto_differentiation

d = pd.DataFrame()
d['X'] = np.log([5, 10, 15, 20, 25])
d['Y'] = [118, 58, 5, 30, 58]
d['I'] = 1

def psi(theta):
    y = np.asarray(d['Y'])[:, None]
    X = np.asarray(d[['I', 'X']])
    beta, alpha = theta[:-1], np.exp(theta[-1])
    beta = np.asarray(beta)[:, None]

    pred_y = np.exp(np.dot(X, beta))
    ee_beta = ((y - pred_y) * X).T

    # This is the simplest example of the issue.
    # Under auto_differentiation, this version gives:
    #   ValueError: setting an array element with a sequence.
    # ee_alpha = ((1 - y / pred_y) + np.log(alpha * y/pred_y)).T
    # This version works
    ee_alpha = ((1 - y / pred_y) + np.log(np.dot(alpha, y/pred_y))).T
    return np.vstack([ee_beta, ee_alpha])

def internal_sum(theta):
    return np.sum(psi(theta), axis=1)

dx_exact = auto_differentiation([5.503, -0.6019, 0.0166], internal_sum)
dx_approx = approx_fprime([5.503, -0.6019, 0.0166], internal_sum, epsilon=1e-9)
```

pzivich commented 1 year ago

Okay, I believe I have fixed my NumPy woes. Now I need to create autodiff tests for all the available estimating equations (to be sure everything works with what I provide to users).
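For context, a test of that kind would likely compare the exact AD derivative against the numerical approximation for each set of estimating equations. A sketch of the pattern, using a simple mean/variance example rather than the actual test suite:

```python
import numpy as np
from scipy.optimize import approx_fprime
# auto_differentiation as implemented in this branch (import path may differ)
from delicatessen.derivative import auto_differentiation

y = np.array([1.0, 4.0, 2.0, 5.0, 3.0])

def psi_mean_var(theta):
    # Stacked estimating equations for the mean and variance of y
    return np.vstack([(y - theta[0]),
                      (y - theta[0]) ** 2 - theta[1]])

def internal_sum(theta):
    return np.sum(psi_mean_var(theta), axis=1)

theta = [3.0, 2.0]
dx_exact = auto_differentiation(theta, internal_sum)
dx_approx = approx_fprime(theta, internal_sum, epsilon=1e-9)
assert np.allclose(dx_exact, dx_approx, atol=1e-4)
```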