twitter-research / cwn

Message Passing Neural Networks for Simplicial and Cell Complexes
MIT License

Format of TUDataset txt file #104

Closed icecream126 closed 2 years ago

icecream126 commented 2 years ago

Hi, I'm currently trying to run the TUDatasets experiments (REDDITBINARY & NCI109) with the following commands:

sh ./exp/scripts/cwn-nci109.sh
sh ./exp/scripts/mpsn-redditb.sh

I checked REDDITBINARY & NCI109 from here and noticed that each dataset comes with multiple types of txt files, such as:

[ REDDITBINARY ]

[ NCI109_A ]

However, I found that load_data(path, dataset, degree_as_tag) in data/tu_utils.py only reads a single txt file:

## data/tu_utils.py
def load_data(path, dataset, degree_as_tag):
    """
        dataset: name of dataset
        test_proportion: ratio of test train split
        seed: random seed for random splitting of dataset
    """

    print('loading data')
    g_list = []
    label_dict = {}
    feat_dict = {}

    with open('%s/%s.txt' % (path, dataset), 'r') as f: ## <- only takes one txt file..
        n_g = int(f.readline().strip())
        for i in range(n_g):
            row = f.readline().strip().split()
            n, l = [int(w) for w in row]
            if l not in label_dict:
                mapped = len(label_dict)
                label_dict[l] = mapped
            g = nx.Graph()
            node_tags = []
            node_features = []
            n_edges = 0
            for j in range(n):
                g.add_node(j)
                row = f.readline().strip().split()
                tmp = int(row[1]) + 2
                if tmp == len(row):
                    # no node attributes
                    row = [int(w) for w in row]
                    attr = None
...

So I would like to ask which txt file this code expects. Thank you :)

crisbodnar commented 2 years ago

As explained in the README, the TUDatasets should be downloaded from https://www.dropbox.com/s/2ekun30wxyxpcr7/datasets.zip?dl=0

Let us know if you have further questions.
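
For reference, the visible part of load_data above suggests a single-file layout: the first line holds the number of graphs, each graph starts with a line "n l" (node count and graph label), and each of the following n lines describes one node, with the second field giving the number of entries that follow when there are no node attributes. The sketch below writes a toy file in that layout and reads it back. It is only an illustration: the file name TOY.txt and the helper read_single_file are hypothetical, and reading the first field as a node tag and the trailing fields as neighbour indices is an assumption, since that part of the function is elided in the snippet.

```python
import os
import tempfile

# Hypothetical toy file: 1 graph with 3 nodes forming the path 0-1-2, graph label 0.
# Assumed node line layout: <tag> <number of neighbours> <neighbour indices...>
SAMPLE = """1
3 0
0 1 1
0 2 0 2
0 1 1
"""


def read_single_file(path):
    """Parse the single-file layout sketched above (not the repository's own loader)."""
    graphs = []
    with open(path, "r") as f:
        n_g = int(f.readline().strip())  # first line: number of graphs
        for _ in range(n_g):
            n, label = [int(w) for w in f.readline().split()]  # "n l"
            tags, edges = [], set()
            for j in range(n):
                row = f.readline().split()
                n_neigh = int(row[1])
                tags.append(int(row[0]))  # assumption: first field is a node tag
                # Assumption: the next n_neigh fields are neighbour indices;
                # anything after them would be node attributes and is ignored here.
                for k in row[2:2 + n_neigh]:
                    edges.add((min(j, int(k)), max(j, int(k))))
            graphs.append({"label": label, "tags": tags, "edges": sorted(edges)})
    return graphs


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "TOY.txt")
    with open(path, "w") as f:
        f.write(SAMPLE)
    graphs = read_single_file(path)

print(graphs)
```

The multi-file TU downloads (for example the NCI109_A file mentioned above) do not follow this layout, which would explain why load_data cannot read them directly.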