PatrickZH / DeepCore

Code for coreset selection methods
MIT License
201 stars 38 forks

Question about the application of submodular functions. #3

Closed HoodieSJ closed 1 year ago

HoodieSJ commented 2 years ago

Hi, I am really grateful for your GitHub repository.

I have a question about the detailed application of the submodular functions (e.g., Facility Location, Graph Cut). It seems that the inputs used for submodular selection are the per-sample gradients over the training dataset.

How are the objective functions for submodular selection defined when the inputs are gradients? The paper "Submodular combinatorial information measures with applications in machine learning" gives a detailed procedure, but it is hard to understand how the gradients are used to compose the objective function.

Chengcheng-Guo commented 1 year ago

Hi, we really appreciate that.

The objective function for Graph Cut is $f(X, V) = \lambda\sum_{v \in V} \sum_{x \in X} s(x, v) - \sum_{x, y \in X} s(x, y)$, and for Facility Location it is $f(X, V) = \sum\limits_{v \in V} \max_{x \in X} s(x, v)$, where $X$ is the selected subset, $V$ is the ground set, and $s$ is a similarity function. Therefore, while maximizing them we do not need to input gradients explicitly; we simply input a similarity matrix among samples, as is done in DeepCore. As for why last-layer gradients, rather than other per-sample information, are used to construct the similarity matrix: we took reference from the paper Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds, which gives an empirical analysis of using gradients to compute similarity.
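To make the answer above concrete, here is a minimal, unofficial sketch (not DeepCore's actual implementation) of the pipeline it describes: build a cosine-similarity matrix from per-sample gradients, then greedily maximize the two objectives using their marginal gains. The function names, the choice of cosine similarity, and the `lam` value are illustrative assumptions.

```python
import numpy as np

def similarity_from_gradients(grads):
    """Cosine-similarity matrix from per-sample (last-layer) gradients.

    grads: (n, d) array, one flattened gradient per sample (assumed precomputed).
    """
    g = grads / np.maximum(np.linalg.norm(grads, axis=1, keepdims=True), 1e-12)
    return g @ g.T  # (n, n), S[i, j] = s(x_i, x_j)

def greedy_facility_location(S, k):
    """Greedy maximization of f(X, V) = sum_v max_{x in X} s(x, v)."""
    n = S.shape[0]
    selected = []
    current_max = np.zeros(n)  # best similarity of each point v to the chosen set X
    for _ in range(k):
        # Marginal gain of each candidate: total improvement in the per-point maxima.
        gains = np.maximum(S - current_max, 0).sum(axis=1)
        gains[selected] = -np.inf  # never re-pick an element
        best = int(np.argmax(gains))
        selected.append(best)
        current_max = np.maximum(current_max, S[best])
    return selected

def greedy_graph_cut(S, k, lam=2.0):
    """Greedy maximization of
    f(X, V) = lam * sum_{v in V} sum_{x in X} s(x, v) - sum_{x, y in X} s(x, y).
    """
    n = S.shape[0]
    selected = []
    coverage = S.sum(axis=1)  # sum_v s(x, v) for each candidate x
    penalty = np.zeros(n)     # sum_{y in X} s(x, y) for each candidate x
    for _ in range(k):
        # Marginal gain: lam * coverage - 2 * penalty (the symmetric pair sum counts
        # s(x, y) and s(y, x)); the self-term s(x, x) is constant under cosine
        # similarity, so it does not change the argmax and is dropped.
        gains = lam * coverage - 2 * penalty
        gains[selected] = -np.inf
        best = int(np.argmax(gains))
        selected.append(best)
        penalty += S[best]
    return selected
```

Both loops use the standard lazy-free greedy scheme: since the objectives are (monotone) submodular, picking the element with the largest marginal gain at each step gives the usual approximation guarantee, and only the similarity matrix `S` is ever needed, not the gradients themselves.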

HoodieSJ commented 1 year ago

I really appreciate your quick response. I completely understand the modeling now. Thanks!