locuslab / torchdeq

Modern Fixed Point Systems using PyTorch
MIT License

Unexpected behaviour of indexing? #6

Open · BurgerAndreas opened this issue 3 months ago

BurgerAndreas commented 3 months ago

Hi,

Thanks again for this library!

  1. Am I right that `n_states` / `indexing` can be used to implement the sparse fixed-point correction from the DEQ Optical Flow paper?

  2. If yes, I am confused about the output in this example:

```python
from torchdeq import get_deq

# Settings from `DEQ Optical Flow` paper
args = {
    "n_states": 2,
    "f_max_iter": 24,
}

deq = get_deq(args)

print('deq.indexing: ', deq.indexing)
```

Output: `deq.indexing:  [12, 12]`
Expected output: `[8, 16]` (uniformly sampled between 0 and 24)

Am I misinterpreting?
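For concreteness, this is the arithmetic behind the expectation above (a sketch with my own variable names, not torchdeq code):

```python
# Expectation: n_states indices sampled uniformly in the open interval (0, f_max_iter).
n_states, f_max_iter = 2, 24

expected = [f_max_iter * (i + 1) // (n_states + 1) for i in range(n_states)]
print(expected)  # [8, 16]
```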
BurgerAndreas commented 3 months ago

I realised that setting `core="indexing"` yields behaviour closer to what I expected: `deq.indexing: [12, 24]`

Full example:

```python
from torchdeq import get_deq

# Settings from `DEQ Optical Flow` paper
args = {
    "core": "indexing",
    "n_states": 2,
    "f_max_iter": 24,
}

deq = get_deq(args)

print('deq.indexing: ', deq.indexing)
```
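One plausible reading of the two outputs, sketched with plain arithmetic (this is my guess at the convention, not torchdeq source): the default core appears to report per-slice iteration budgets, while `core="indexing"` reports cumulative indices into a single solve.

```python
n_states, f_max_iter = 2, 24

# Default core: f_max_iter split evenly into per-slice budgets.
per_slice = [f_max_iter // n_states] * n_states
print(per_slice)   # [12, 12]

# core="indexing": cumulative indices into one full-length solve.
cumulative = [f_max_iter // n_states * (i + 1) for i in range(n_states)]
print(cumulative)  # [12, 24]
```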

Question

Can you explain what the use cases (pros and cons) of `core="indexing"` and `core="sliced"` are? From the documentation:

> DEQIndexing and DEQSliced build different computational graphs in training but keep the same graph at test time.

> For DEQIndexing, it defines a computational graph with tracked gradients by indexing the internal solver states and applying the gradient function to the sampled states. This is equivalent to attaching the gradient function alongside the full solver computational graph. The maximum number of DEQ function calls is defined by `args.f_max_iter`.
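A rough schematic of the DEQIndexing behaviour described above (not torchdeq code; `f`, `grad_fn`, and `deq_indexing_sketch` are stand-ins for the library's solver and gradient machinery):

```python
def deq_indexing_sketch(f, z0, f_max_iter, indexing, grad_fn):
    """One full solve; the states at `indexing` get the gradient
    function attached alongside the solver trajectory."""
    z, tracked = z0, []
    for step in range(1, f_max_iter + 1):
        z = f(z)                        # solver step
        if step in indexing:
            tracked.append(grad_fn(z))  # gradient function on the sampled state
    return tracked                      # at most f_max_iter calls to f
```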

> For DEQSliced, it slices the full solver steps into several smaller graphs (without gradients). The gradient function is applied to the returned state of each subgraph, and a new fixed-point solve resumes from the output of the gradient function. This is equivalent to inserting the gradient function into the full solver computational graph. The maximum number of DEQ function calls is then, for example, `args.f_max_iter + args.n_states * args.grad`.
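And the corresponding schematic for DEQSliced, under the same caveats (`slice_iters` would hold the per-slice budgets, e.g. `[12, 12]` from the first example):

```python
import torch

def deq_sliced_sketch(f, z0, slice_iters, grad_fn):
    """Alternate no-grad solver slices with the gradient function;
    each new slice resumes from the gradient function's output."""
    z, tracked = z0, []
    for budget in slice_iters:
        with torch.no_grad():
            for _ in range(budget):  # subgraph without gradients
                z = f(z)
        z = grad_fn(z)               # gradient function inserted into the solve
        tracked.append(z)
    # Total f calls: sum(slice_iters) plus whatever grad_fn uses internally,
    # matching f_max_iter + n_states * args.grad from the docs.
    return tracked
```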