noahgolmant / pytorch-hessian-eigenthings

Efficient PyTorch Hessian eigendecomposition tools!
MIT License

Add option to use multiple batches for a single step of power iteration #4

Closed noahgolmant closed 5 years ago

noahgolmant commented 5 years ago

Currently, batch sizes of ~128 can still exhibit high variance in the eigenvalue estimates. It would be useful to add the ability to accumulate over multiple batches within a single power-iteration step; in the limit of using the whole dataset this approaches vanilla power iteration, which should reduce the variance.
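
For concreteness, here is a minimal sketch (not this library's API) of what a multi-batch power-iteration step could look like; `hvp` is a hypothetical callable that returns a per-batch Hessian-vector product on a flattened parameter vector:

    import torch

    def power_iteration_step(hvp, batches, v):
        """One power-iteration step with the Hessian averaged over `batches`.

        `hvp(batch, v)` is assumed to return H_batch @ v as a flat tensor.
        """
        v = v / v.norm()
        hv = torch.zeros_like(v)
        for batch in batches:
            hv += hvp(batch, v)          # accumulate per-batch H @ v
        hv /= len(batches)               # averaging lowers estimator variance
        eig_estimate = torch.dot(v, hv)  # Rayleigh quotient v^T H v
        return hv / hv.norm(), eig_estimate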

themightyoarfish commented 5 years ago

This is a noob question, but doesn't the DataLoader being passed in determine the batch size? That is, if you wanted to compute the estimate on more data, why not simply pass in a loader with a larger batch size?

noahgolmant commented 5 years ago

It's true that you can modify the dataloader batch size, but then you're limited to the max batch size you can fit on your GPUs. If you want to do vanilla power iteration, for example, you need to feed in the whole dataset each time.
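For illustration, a hedged sketch of emulating a full-dataset Hessian-vector product by accumulating over an ordinary DataLoader, so the GPU only ever holds one mini-batch; `hvp_on_batch` is a hypothetical per-batch routine, not part of this repo, and the loader is assumed to yield `(inputs, targets)` pairs:

    import torch

    def full_dataset_hvp(hvp_on_batch, loader, v):
        """Accumulate per-batch H @ v, weighted by batch size, so the
        result matches the whole-dataset HVP for an average loss."""
        total, n = torch.zeros_like(v), 0
        for inputs, targets in loader:
            b = inputs.size(0)
            total += b * hvp_on_batch(inputs, targets, v)
            n += b
        return total / n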

themightyoarfish commented 5 years ago

The implementation doesn't currently seem to work; when the DataLoader uses a batch size > 512 (the default max_size), this part crashes:

    if grad_vec:
        grad_vec += torch.cat([g.contiguous().view(-1) for g in grad_dict])
    else:
        grad_vec = torch.cat([g.contiguous().view(-1) for g in grad_dict])

because grad_vec is no longer a boolean in the second iteration, and truth-testing a tensor with more than one element raises a RuntimeError in PyTorch. If there is only one chunk, the loop runs only once and the bug never surfaces.
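
The crash is easy to reproduce in isolation, since PyTorch refuses to truth-test a multi-element tensor:

    import torch

    grad_vec = torch.ones(3)
    try:
        if grad_vec:  # ambiguous truth value for a multi-element tensor
            pass
    except RuntimeError as err:
        print(err)  # "Boolean value of Tensor with more than one element is ambiguous"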

themightyoarfish commented 5 years ago

Maybe use grad_vec = None and then later test if grad_vec is not None?
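
A sketch of what that fix could look like (the loop structure is paraphrased from the snippet above, and `grad_chunks` is a hypothetical name for the chunk iterable):

    import torch

    def accumulate_grad_vec(grad_chunks):
        """Sum flattened gradients across chunks; `grad_chunks` yields
        per-chunk lists of gradient tensors."""
        grad_vec = None
        for grad_dict in grad_chunks:
            flat = torch.cat([g.contiguous().view(-1) for g in grad_dict])
            grad_vec = flat if grad_vec is None else grad_vec + flat
        return grad_vec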