Closed ArnNag closed 5 months ago
Is this PR ready for review?
One question I have is how to integrate this into the PyTorch Lightning DataLoader API. I have currently implemented the maximum edge dataloader in a separate function here.
There are currently two issues with overriding the existing train_dataloader function:
1. ... batch_sampler argument that behaves differently from the behavior when using the batch_size argument to DataLoader. The dataset generation for test cases depends on these expectations.
2. (... float('inf') when we want to use all edges, but we would have to reimplement downstream tests to work with the shapes that this outputs.)
Can we hold off with this one for now and concentrate on implementing SAKE?
Description
The intermediate layers' edgewise features usually account for most of the GPU memory usage of a batch, based on my previous experience trying to train SAKE. This PR implements a dynamically batching dataloader that computes the number of edges in the upcoming conformer and adds conformers to the batch until a maximum number of edges is reached.
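The batching rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name MaxEdgeBatchSampler, the helper num_edges, and the assumption that each conformer is a fully connected graph with n * (n - 1) directed edges are all hypothetical choices made for the example.

```python
from typing import Iterator, List, Sequence


def num_edges(num_atoms: int) -> int:
    """Edge count for one conformer.

    Assumption for this sketch: a fully connected directed graph
    without self-loops, i.e. n * (n - 1) edges.
    """
    return num_atoms * (num_atoms - 1)


class MaxEdgeBatchSampler:
    """Yield lists of dataset indices whose summed edge count stays within max_edges.

    Hypothetical helper: atom_counts[i] is the number of atoms in conformer i.
    """

    def __init__(self, atom_counts: Sequence[int], max_edges: float) -> None:
        self.atom_counts = atom_counts
        self.max_edges = max_edges

    def __iter__(self) -> Iterator[List[int]]:
        batch: List[int] = []
        edges_in_batch = 0
        for idx, n_atoms in enumerate(self.atom_counts):
            edges = num_edges(n_atoms)
            # Close the current batch if the upcoming conformer would push it over the cap.
            if batch and edges_in_batch + edges > self.max_edges:
                yield batch
                batch, edges_in_batch = [], 0
            # A conformer that exceeds the cap on its own still gets a batch to itself.
            batch.append(idx)
            edges_in_batch += edges
        if batch:
            yield batch


if __name__ == "__main__":
    atom_counts = [3, 4, 5, 2]  # edge counts: 6, 12, 20, 2
    print(list(MaxEdgeBatchSampler(atom_counts, max_edges=20)))
    print(list(MaxEdgeBatchSampler(atom_counts, max_edges=float("inf"))))
```

An object like this yields lists of indices, which is the shape that the batch_sampler argument of torch.utils.data.DataLoader expects, so it could plausibly be passed there. Batches then vary in size, which is one reason tests written around a fixed batch_size would see different shapes. With max_edges=float('inf') every conformer lands in a single batch.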
Todos
Questions
Status