entslscheia / GGNN_Reasoning

PyTorch implementation for Graph Gated Neural Network (for Knowledge Graphs)

raise RuntimeError('each element in list of batch should be of equal size') RuntimeError: each element in list of batch should be of equal size #2

Open alice-cool opened 3 years ago

alice-cool commented 3 years ago

I think the irregular (ragged) list A built in `__getitem__()` in dataset.py may be what causes this error.
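As a rough illustration of why a ragged list would trigger this message: PyTorch's default collate step checks that the sequences in a batch all have the same length. The snippet below is a pure-Python stand-in for that check, not PyTorch's actual code, and `check_equal_size` and the sample data are hypothetical.

```python
def check_equal_size(batch):
    """Raise, like the collate step does, when samples differ in length."""
    sizes = {len(sample) for sample in batch}
    if len(sizes) > 1:
        raise RuntimeError("each element in list of batch should be of equal size")
    return len(batch[0])


# Two hypothetical adjacency rows with different numbers of edges.
ragged_batch = [
    [(0, 1), (0, 2)],  # node with two outgoing edges
    [(1, 2)],          # node with one outgoing edge
]

try:
    check_equal_size(ragged_batch)
    message = None
except RuntimeError as err:
    message = str(err)

print(message)
```

If every row had the same number of entries (for example after padding), the check would pass.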

alice-cool commented 3 years ago

In dataset.py:

A = [[] for k in range(self.n_node)]
for triple in data[i]["graph"]:
    A[triple[0]].append((triple[1], triple[2]))

This is confusing. The first line says the number of elements in A equals self.n_node (the maximum number of nodes in one graph). But the last two lines say that the len of A will be modified by the id of node, such as triple[0], so it would follow the maximum node id instead.

alice-cool commented 3 years ago

    @staticmethod
    def find_max_syz(data, num):
        max_syz = 0
        for i in range(len(data)):
            listnum = [0 for k in range(num)]
            for j in range(len(data[i]['graph'])):
                listnum[data[i]['graph'][j][0]] = listnum[data[i]['graph'][j][0]]+1
            if max_syz < max(listnum):
                max_syz = max(listnum)

        return max_syz

    @staticmethod
    def find_max_node_id(data):
        max_num_id = 0
        for i in range(len(data)):
            for triple in data[i]["graph"]:
                if triple[0] > max_num_id:
                    max_num_id = triple[0]
                if triple[2] > max_num_id:
                    max_num_id = triple[2]
        return max_num_id

self.n_node_types = self.find_max_node_id(data) + 1  # +1 so that A[max_id] is a valid index

            A = [[] for k in range(self.n_node_types)]

            for triple in data[i]["graph"]:
                A[triple[0]].append((triple[1], triple[2]))

            print("syz:", self.syz_num)
            # Pad every adjacency row with (0, 0) up to syz_num entries.
            # Iterate with `row` rather than `i`, so the sample index `i`
            # used by data_idx.append(i) below is not overwritten.
            for row in A:
                row.extend([(0, 0)] * (self.syz_num - len(row)))

            A_list.append(A)
            data_idx.append(i)

entslscheia commented 3 years ago

In dataset.py:

A = [[] for k in range(self.n_node)]
for triple in data[i]["graph"]:
    A[triple[0]].append((triple[1], triple[2]))

This is confusing. The first line says the number of elements in A equals self.n_node (the maximum number of nodes in one graph). But the last two lines say that the len of A will be modified by the id of node, such as triple[0], so it would follow the maximum node id instead.

Hi! Sorry about the trouble. I have not maintained this repo for about two years, so I don't remember debugging-level details like this anymore. I would suggest trying some of the toy data under the data/ directory to see how it works. As for the code you pasted here, I don't think the claim "last two lines say that the len of A will be modified by the id of node" makes sense. The last two lines only update the content of A, without changing the size of A (i.e., len(A)). Hope it helps!
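A small sketch of the point above, using made-up triples and a hypothetical `n_node` of 3: appending to `A[k]` grows only the inner list, and an index outside the existing range raises an error instead of resizing `A`.

```python
n_node = 3  # hypothetical: A is created with three slots
A = [[] for _ in range(n_node)]

# Hypothetical (source, edge_type, target) triples with ids inside 0..n_node-1.
for src, edge_type, dst in [(0, 1, 2), (0, 1, 1), (2, 1, 0)]:
    A[src].append((edge_type, dst))

outer_len = len(A)                    # still 3: append only grows the inner lists
inner_lens = [len(row) for row in A]  # the rows now have different lengths

# A source id outside 0..n_node-1 does not resize A; it raises IndexError.
try:
    A[100].append((1, 0))
    grew = True
except IndexError:
    grew = False

print(outer_len, inner_lens, grew)
```

So `len(A)` stays fixed, but the inner lists can end up with different lengths, which is the kind of raggedness a batching step may reject.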

alice-cool commented 3 years ago

Thanks for your help. I will try.

alice-cool commented 3 years ago

I ran my modified code and it threw a memory error, so my approach produces a big, sparse A list. Here is why I said len(A) will be modified. Suppose every graph has at most 3 edges. With your code, the A list is initialized with length 3. If the set of graph samples includes 100 different nodes, the node ids go up to 100. So in the loop, because of A[triple[0]], triple[0] can be 100, and len(A) would be replaced by 100. It is just my opinion. Thanks for your help.

for triple in data[i]["graph"]:
    A[triple[0]].append((triple[1], triple[2]))
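One possible way to avoid both the out-of-range index and the large sparse list is to remap the node ids of each graph to local indices before filling A, then pad the rows to a common width. This is only a sketch under assumptions, not the repository's method: `build_padded_adjacency` and the sample triples are hypothetical, and the `(0, 0)` padding value simply follows the code posted earlier in this thread.

```python
def build_padded_adjacency(triples, pad=(0, 0)):
    """Return (A, id_map) with one equally sized row per node of this graph."""
    # Map each global node id to a local index 0..k-1, in order of first appearance.
    id_map = {}
    for src, _, dst in triples:
        for node in (src, dst):
            id_map.setdefault(node, len(id_map))

    # One row per node that actually occurs, so A stays small.
    A = [[] for _ in range(len(id_map))]
    for src, edge_type, dst in triples:
        A[id_map[src]].append((edge_type, id_map[dst]))

    # Pad every row to the length of the longest row.
    width = max((len(row) for row in A), default=0)
    for row in A:
        row.extend([pad] * (width - len(row)))
    return A, id_map


# Hypothetical graph whose global node ids are large and non-contiguous.
triples = [(100, 1, 7), (100, 2, 42), (7, 1, 42)]
A, id_map = build_padded_adjacency(triples)
print(A)
print(id_map)
```

Two caveats: `(0, 0)` can collide with a real edge of type 0 to local node 0, so a distinct padding value may be safer, and rows are padded per graph here, so batching several graphs together would still need a shared number of rows and a shared width.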