kjunelee / MetaOptNet

Meta-Learning with Differentiable Convex Optimization (CVPR 2019 Oral)
Apache License 2.0

Meta gradient Computation #46

Closed ars22 closed 4 years ago

ars22 commented 4 years ago

Hi,

Thanks for making your implementation public! Really loved your work :) I wasn't able to find the exact function that returns the meta-gradient (computed using the implicit function theorem). Specifically, I was looking for the Jacobian computation (involved in the inverse in Thm. 1 of the paper). Could you please point me to the correct function for this?

Thanks, Amrith

kjunelee commented 4 years ago

During project development, we discovered that there exists a prior PyTorch implementation of the implicit function theorem (https://github.com/locuslab/qpth; an implementation of the ICML 2017 paper OptNet: Differentiable Optimization as a Layer in Neural Networks). This package supports implicit differentiation through a quadratic programming (QP) solver. Since SVM and ridge regression can be formulated as QPs, we rely on their package to compute the meta-gradient.
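
For reference, here is a minimal sketch (not the repository's actual code) of how `qpth.qp.QPFunction` is typically used: it solves a QP of the form `min_z 1/2 zᵀQz + pᵀz s.t. Gz ≤ h, Az = b`, and its backward pass differentiates the KKT conditions via the implicit function theorem, so gradients reach the QP parameters without unrolling the solver. All sizes and the toy objective below are illustrative placeholders.

```python
import torch
from qpth.qp import QPFunction

n = 5  # number of QP variables (illustrative size)

# Toy QP parameters that require gradients.
L = torch.randn(n, n)
Q = (L @ L.t() + 1e-3 * torch.eye(n)).requires_grad_(True)  # positive definite
p = torch.randn(n, requires_grad=True)
G = -torch.eye(n)              # inequality constraint: z >= 0
h = torch.zeros(n)
A = torch.ones(1, n)           # equality constraint: sum(z) = 1
b = torch.ones(1)

# Forward pass solves the QP; backward pass applies implicit differentiation.
z_star = QPFunction(verbose=False)(Q, p, G, h, A, b)   # shape (1, n)
loss = z_star.pow(2).sum()     # stand-in for a downstream (e.g. query-set) loss
loss.backward()                # gradients flow to Q and p through the argmin
print(p.grad)
```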

Recently, a general implementation of implicit differentiation for convex programs was released: https://github.com/cvxgrp/cvxpylayers (Differentiable Convex Optimization Layers, NeurIPS 2019). You might want to check this out. It should be easy to use this package on top of our meta-learning codebase.
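
As an illustration (not part of this repository), a differentiable ridge-regression layer, the kind of convex base learner used in the paper, could be written with cvxpylayers roughly as follows. The problem sizes and the `lam` penalty are arbitrary placeholders, and the loss is just a stand-in for a query-set loss.

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n_support, n_features = 25, 16   # illustrative sizes, not the paper's settings
lam = 1.0                        # ridge penalty (placeholder value)

# Define the convex program once, with X and y as parameters.
w = cp.Variable(n_features)
X = cp.Parameter((n_support, n_features))
y = cp.Parameter(n_support)
objective = cp.Minimize(cp.sum_squares(X @ w - y) + lam * cp.sum_squares(w))
layer = CvxpyLayer(cp.Problem(objective), parameters=[X, y], variables=[w])

# Solve with torch tensors; gradients flow back through the argmin.
X_t = torch.randn(n_support, n_features, requires_grad=True)  # e.g. embedded support set
y_t = torch.randn(n_support)
w_star, = layer(X_t, y_t)
loss = w_star.pow(2).sum()
loss.backward()                  # meta-gradient reaches X_t via implicit differentiation
print(X_t.grad.shape)
```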

ars22 commented 4 years ago

Thank you for the quick response. It was very helpful! Also, I wanted to confirm a few details of your miniImageNet 5-way 1-shot setup. In the paper (Table 3), for the 4-layer conv model, the reported accuracy is 52.87. How many shots and ways were used at meta-train time for this experiment? Also, was the feature extractor pretrained on the 64 training classes?

kjunelee commented 4 years ago

1-shot was used for the 4-layer convolutional model. We did not use any pretraining scheme in this paper. All experiments are meta-trained on the 64 training categories (unless marked -trainval in the table).