microsoft / tf2-gnn

TensorFlow 2 library implementing Graph Neural Networks
MIT License
371 stars 73 forks source link

Unify processing of adjacency lists across datasets #18

Closed kmaziarz closed 4 years ago

kmaziarz commented 4 years ago

I unified how adjacency lists are processed (adding self loops, backward edges, computing type_to_num_incoming_edges) across all datasets (JsonLGraphDataset, QM9Dataset, PPIDataset). This refactoring almost doesn't impact the behavior (the only difference is that self loops are now consistently added with edge type 0, which was not the case previously for PPIDataset).

If we wish to further refactor this, it will now be very easy to extract the handling of self loops and backward edges out of datasets, and into either the model or to GraphDataset.

mmjb commented 4 years ago

Thanks for this. There's a bunch of minor stylistic things, but apart from that, this looks great!

mmjb commented 4 years ago

The reason we factored this out for JsonLDataset is that we wanted to use the functionality in a different downstream dataset class (for the partial graphs in the JsonLTraceDataset); that was occasioned by the lack of an utility function like the one you created now. So my instinct here would be not to unify by creating more such functions in PPI/QM9, but inline the function calls there, and mark the existing _process_raw_adjacency_lists in JsonLDataset as deprecated (but not remove it, to not break existing subclasses).

On Sat, Apr 18, 2020 at 6:08 PM Krzysztof Maziarz notifications@github.com wrote:

@kmaziarz commented on this pull request.

In tf2_gnn/data/ppi_dataset.py https://github.com/microsoft/tf2-gnn/pull/18#discussion_r410723879:

             )

)

     return final_graphs
  • def _process_raw_adjacency_lists(

Yes, I intentionally extracted _process_raw_adjacency_lists to be the same in all three datasets, so that it's more unified and easier to refactor further. I think it's used only once in QM9Dataset as well. Would you recommend to inline it here and leave the other two?

— You are receiving this because your review was requested. Reply to this email directly, view it on GitHub https://github.com/microsoft/tf2-gnn/pull/18#discussion_r410723879, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKT62FPI64SYDCZY3HQFULRNHNALANCNFSM4MK5MBRA .