This is a noob question, but doesn't the DataLoader being passed in determine the batch size? I.e., if you wanted to compute the estimate on more data, why not simply pass in a loader with a larger batch size?
It's true that you can modify the dataloader batch size, but then you're limited to the max batch size you can fit on your GPUs. If you want to do vanilla power iteration, for example, you need to feed in the whole dataset each time.
The implementation doesn't seem to work currently; when the DataLoader uses a batch size > 512 (which is the default max_size), this part crashes:
```python
if grad_vec:
    grad_vec += torch.cat([g.contiguous().view(-1) for g in grad_dict])
else:
    grad_vec = torch.cat([g.contiguous().view(-1) for g in grad_dict])
```
because `grad_vec` is no longer a boolean on the second iteration: it is a multi-element tensor, and evaluating one in a boolean context raises a RuntimeError. With a single chunk the loop runs only once, so the bug never triggers. Maybe initialize with `grad_vec = None` and then test `if grad_vec is not None` instead?
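A minimal sketch of that fix, assuming the surrounding loop accumulates flattened gradients over data chunks (`chunks`, `loss_fn`, and `params` are hypothetical names, not the repo's actual API):

```python
import torch

def accumulate_grad_vec(chunks, loss_fn, params):
    """Sum flattened per-chunk gradients into a single vector (sketch of the fix)."""
    grad_vec = None
    for chunk in chunks:
        grad_dict = torch.autograd.grad(loss_fn(chunk), params, create_graph=True)
        flat = torch.cat([g.contiguous().view(-1) for g in grad_dict])
        # testing `is not None` avoids evaluating a multi-element tensor as a bool
        grad_vec = flat if grad_vec is None else grad_vec + flat
    return grad_vec
```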
Currently, batch sizes of ~128 can still exhibit high variance in the eigenvalue estimates. It would be useful to add the ability to take a single power-iteration step over multiple batches, approaching vanilla power iteration and reducing that variance.
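Roughly, such a step could average Hessian-vector products over several batches before renormalizing, something like this sketch (`hvp` is a hypothetical per-batch HVP callable, not the repo's API):

```python
import torch

def multi_batch_power_step(v, batches, hvp):
    """One power-iteration step that averages HVPs over multiple batches."""
    acc = torch.zeros_like(v)
    count = 0
    for batch in batches:
        acc += hvp(batch, v)  # per-batch Hessian-vector product (assumed callable)
        count += 1
    acc /= count              # averaging over batches reduces estimate variance
    return acc / acc.norm()   # renormalize for the next iteration
```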