RecLusIve-F / BGRL-dgl

DGL Implementation of BGRL

Review 0501 #1

Open mufeili opened 2 years ago

mufeili commented 2 years ago

README file

  1. I don't think you need "s" after "Implementation" in the title.
  2. "the GNN model proposed in the paper" -> "the GNN experiment proposed in the paper"
  3. 'Dataset' -> 'dataset' for "The WikiCS Dataset is built from here and others are DGL's built-in Dataset"
  4. If WikiCS is not available in DGL, it would be great to open a PR contributing it as a built-in dataset.
  5. If 4 is achieved, then likely we will no longer need the dataset_dir argument.
  6. The default value of dataset should be amazon_photos rather than Amazon Photos.
  7. "Convolutional layer sizes" -> "Convolutional layer hidden sizes"?
  8. "Warmup period for learning rate" -> "Warmup period for learning rate scheduling"
  9. "weights_dir" not used in main.py. The code block was commented out.
  10. "Augmentations options" -> "Augmentation options"
  11. "Accuracy Official code" -> "Accuracy Official Code"
  12. "1 random dataset splits and model initializations" -> "1 random dataset split and model initialization"
  13. "1 random model initializations" -> "1 random model initialization"

transforms.py

  1. It would be nice to have the two transforms as DGL built-in transforms (the callable-class pattern they would follow is sketched below).
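
For context, a minimal sketch of the callable-class pattern that DGL transforms follow; the DropEdge name and the drop probability here are illustrative, not the eventual built-in API:

```python
import dgl
import torch


class DropEdge:
    """Illustrative transform: randomly drop a fraction of edges.

    Follows the callable-class convention of DGL transforms:
    takes a DGLGraph and returns a transformed DGLGraph.
    """

    def __init__(self, p=0.2):
        self.p = p  # probability of dropping each edge

    def __call__(self, g):
        # Keep each edge with probability 1 - p, then take the edge subgraph.
        keep = torch.rand(g.num_edges()) >= self.p
        eids = keep.nonzero(as_tuple=True)[0]
        return dgl.edge_subgraph(g, eids, relabel_nodes=False)
```

An instance can then be applied like any other transform, e.g. `g_aug = DropEdge(p=0.2)(g)`.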

data.py

  1. Why is there an "s" after train_mask and val_mask, but not test_mask?
  2. Why did you re-create PPIDataset at L45?

main.py

  1. Perhaps it's better to rename data to g for clarity at L86.

model.py

  1. I think batch is never None for PPI + GraphSAGE_GCN, right? If so, perhaps there's no need to handle the case where batch is None.
RecLusIve-F commented 2 years ago

  1. The re-created PPIDataset at L45 is used for the evaluation procedure. In the training procedure, the train set and validation set are concatenated, so in order to separate the three sets I have to re-create them.
  2. I have also started working on PRs for WikiCS and the transforms.
  3. By removing dataset_dir, do you mean downloading the dataset to the default directory?

mufeili commented 2 years ago

  1. Got it. Thanks. (See the sketch after this list.)
  2. Sounds great.
  3. Yes, particularly if you add WikiCS as a built-in DGL dataset.
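
For reference, a minimal sketch of keeping the three PPI splits separate with DGL's built-in PPIDataset; variable names here are illustrative:

```python
from dgl.data import PPIDataset

# Each mode loads its own list of graphs, so the three splits stay
# separate without slicing a concatenated train+val dataset.
train_dataset = PPIDataset(mode='train')   # 20 graphs
val_dataset = PPIDataset(mode='valid')     # 2 graphs
test_dataset = PPIDataset(mode='test')     # 2 graphs
```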

RecLusIve-F commented 2 years ago
  1. For NormalizeFeatures, should I add a parameter node_feat_names to indicate which features the transform is applied to? In the PyG implementation, the transform is applied to all features.
  2. For NodeFeaturesMasking, should I handle the case where ndata['feat'].shape is (num_nodes,)?
mufeili commented 2 years ago

For NormalizeFeatures, should I add a parameter node_feat_names to indicate which features the transform is applied to? In the PyG implementation, the transform is applied to all features.

You can have two arguments that separately specify the ndata and edata feature names to normalize. If None, all applicable features will be normalized.
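
A minimal sketch of that two-argument design, assuming a homogeneous graph; the argument names are placeholders, not a final API:

```python
import torch


class NormalizeFeatures:
    """Sketch: row-normalize the chosen node/edge features to sum to 1.

    If a name list is None, all applicable (2-D floating-point)
    features of that kind are normalized.
    """

    def __init__(self, node_feat_names=None, edge_feat_names=None):
        self.node_feat_names = node_feat_names
        self.edge_feat_names = edge_feat_names

    @staticmethod
    def _normalize(feat):
        # Clamp avoids division by zero for all-zero rows.
        return feat / feat.sum(dim=-1, keepdim=True).clamp(min=1e-12)

    def __call__(self, g):
        for name in self.node_feat_names or list(g.ndata.keys()):
            feat = g.ndata[name]
            if feat.dtype.is_floating_point and feat.dim() == 2:
                g.ndata[name] = self._normalize(feat)
        for name in self.edge_feat_names or list(g.edata.keys()):
            feat = g.edata[name]
            if feat.dtype.is_floating_point and feat.dim() == 2:
                g.edata[name] = self._normalize(feat)
        return g
```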

For NodeFeaturesMasking, should I handle the case where ndata['feat'].shape is (num_nodes,)?

I think you can assume the node features to be 2-dimensional for now.
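
Under that assumption, a sketch of the masking transform; the mask probability argument is a placeholder:

```python
import torch


class NodeFeaturesMasking:
    """Sketch: zero out each feature column with probability p,
    assuming ndata['feat'] has shape (num_nodes, feat_dim)."""

    def __init__(self, p=0.1):
        self.p = p  # probability of masking each feature dimension

    def __call__(self, g):
        feat = g.ndata['feat']
        # One Bernoulli draw per feature column, broadcast over all nodes.
        keep = torch.bernoulli(
            torch.full((feat.shape[1],), 1.0 - self.p, device=feat.device)
        )
        g.ndata['feat'] = feat * keep
        return g
```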