Varying-Size Hierarchical Graph Batches

bknyaz / bmvc_2019

PyTorch code for our BMVC 2019 paper "Image Classification with Hierarchical Multigraph Networks"

https://arxiv.org/abs/1907.09000

Varying-Size Hierarchical Graph Batches #2

TwiggyDaniels closed this issue 4 years ago

TwiggyDaniels commented 4 years ago

Hey, I was wondering whether (and how) you implemented or used a DataLoader for batches of hierarchical graphs?

torch.from_numpy(np.concatenate((np.concatenate(coord_multiscale),
        np.concatenate(avg_values_multiscale)), axis=1)).unsqueeze(0).float()

and

torch.from_numpy(np.stack((A_spatial_multiscale, A_hier), axis=2)).unsqueeze(0).float()

produce tensors whose sizes vary from image to image, so PyTorch's DataLoader with its default collation can't stack them into a batch, and I'm not quite sure of an efficient way to implement my own.

I could hack my way around it with multiple calls to DataLoader with a batch size of 1, but that would be hilariously inefficient. I considered padding, but this would defeat the point of handling arbitrary-sized inputs.

bknyaz commented 4 years ago

Hi, there are (at least) two ways to handle this:

1. Zero-padding, as you suggested (for both node features and adjacency matrices). The efficient way to do this is a custom collate function that you pass to the PyTorch DataLoader. I have an example of such a function in another repo of mine: https://github.com/bknyaz/graph_nn/blob/master/graph_unet.py#L706. Within a batch, graph sizes vary on average much less than across the whole dataset. I also don't think the main point of graph networks is to handle arbitrary-sized inputs; it's more that the order of the input nodes is not defined in graph networks. So batch-wise padding is fine in most cases.
2. Use an efficient graph toolbox such as PyTorch Geometric. It treats a batch of graphs as a single big graph (without edges between the individual graphs) and represents edges as a list of non-zero edges instead of adjacency matrices.

Which option to use depends on your data. The second option is more generic and will work well in all cases, but it takes some effort to learn the toolbox. The first option is fine when your graphs do not vary much in size, in which case you will not notice a big difference between the two options.
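For illustration, batch-wise zero-padding with a custom collate function might look like the sketch below. It is not the function from the linked repo. It assumes each sample is a hypothetical tuple (x, A, label) with node features x of shape (N, F) and an adjacency tensor A of shape (N, N, R), i.e. without the leading unsqueeze(0) dimension; the names collate_padded and toy_graphs, and the feature size of 5, are made up for the example.

```python
import torch
from torch.utils.data import DataLoader


def collate_padded(samples):
    """Zero-pad a list of (x, A, label) graphs to the largest graph in the batch.

    x: (N_i, F) node features; A: (N_i, N_i, R) adjacency with R relation types.
    Returns padded features, padded adjacency, a boolean node mask, and labels.
    """
    n_max = max(x.shape[0] for x, _, _ in samples)  # largest graph in this batch only
    batch_size = len(samples)
    n_feat = samples[0][0].shape[1]
    n_rel = samples[0][1].shape[2]

    X = torch.zeros(batch_size, n_max, n_feat)
    A = torch.zeros(batch_size, n_max, n_max, n_rel)
    mask = torch.zeros(batch_size, n_max, dtype=torch.bool)  # True for real nodes

    for i, (x, a, _) in enumerate(samples):
        n = x.shape[0]
        X[i, :n] = x
        A[i, :n, :n] = a
        mask[i, :n] = True

    labels = torch.tensor([y for _, _, y in samples])
    return X, A, mask, labels


# Toy data standing in for the real dataset: graphs with 3, 7, 4 and 6 nodes.
torch.manual_seed(0)
toy_graphs = [(torch.rand(n, 5), torch.rand(n, n, 2), n % 2) for n in (3, 7, 4, 6)]

loader = DataLoader(toy_graphs, batch_size=2, collate_fn=collate_padded)
batches = list(loader)
```

Each batch is padded only to its own largest graph, and the mask lets the model ignore padded nodes, for example when pooling over nodes.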

TwiggyDaniels commented 4 years ago

I ended up using PyTorch Geometric like you recommended. I extended PyGeo's dataset class to load MNIST, apply your transformations to it, and convert each sample to PyGeo's Data object. That way I didn't have to change your models; I just converted the edges back to adjacency matrices before calling your model.
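As a rough sketch of what converting edges back to adjacency matrices can involve (not the code actually used here), the function below takes a PyG-style batch, meaning an edge list with global node indices plus a per-node graph id, and rebuilds padded dense adjacency matrices in plain PyTorch. PyTorch Geometric provides torch_geometric.utils.to_dense_adj for this purpose; the hypothetical edges_to_dense below only avoids the dependency and assumes nodes of the same graph are stored contiguously.

```python
import torch


def edges_to_dense(edge_index, batch, edge_weight=None):
    """Convert a batched edge list to padded dense adjacency matrices.

    edge_index: (2, E) long tensor of global (source, target) node indices.
    batch: (N,) long tensor with the graph id of every node, sorted by graph.
    Returns A of shape (B, n_max, n_max) and a boolean node mask (B, n_max).
    """
    num_graphs = int(batch.max()) + 1
    counts = torch.bincount(batch, minlength=num_graphs)  # nodes per graph
    n_max = int(counts.max())
    offsets = torch.cumsum(counts, dim=0) - counts  # first node index of each graph

    src, dst = edge_index
    graph_id = batch[src]  # graph each edge belongs to
    if edge_weight is None:
        edge_weight = torch.ones(src.shape[0])

    A = torch.zeros(num_graphs, n_max, n_max)
    # Shift global node indices to per-graph local indices before writing.
    A[graph_id, src - offsets[graph_id], dst - offsets[graph_id]] = edge_weight

    mask = torch.arange(n_max).unsqueeze(0) < counts.unsqueeze(1)
    return A, mask


# Two graphs stored as one big graph: a 3-node path (nodes 0-2) and a 2-node pair (nodes 3-4).
edge_index = torch.tensor([[0, 1, 1, 2, 3, 4],
                           [1, 0, 2, 1, 4, 3]])
batch = torch.tensor([0, 0, 0, 1, 1])
A, mask = edges_to_dense(edge_index, batch)
```

The second graph is padded to three nodes, and the mask marks its third row and column as padding.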

Thanks for your help and the advice on the lack of order for nodes being more significant than an arbitrary number of nodes. I had gotten stuck thinking it was the other way around and that was a bad train of thought for my current project.