pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

How do I make my own data set to build GCN network #515

Closed xiaobai12345 closed 2 years ago

xiaobai12345 commented 5 years ago

I want to use a GCN to build my own network, but I don't know how to make my own dataset. For example, I can run pytorch_geometric-master/examples/gcn.py with the 'Cora' dataset, but I don't know how to turn my own data into a dataset in the format the GCN expects. This is troubling me a lot. If anyone can answer my question, I would be glad to discuss it. Thank you.

xiaobai12345 commented 5 years ago

What I want to ask is: is anyone here working on something related to GCN? Could you show how you create your own dataset for GCN training? Thank you. You can contact me.

rusty1s commented 5 years ago

In PyG, we represent a graph as a Data object, which can hold any information about the graph in its attributes as torch.Tensor objects, e.g., sparse edge connectivity in edge_index, the node feature matrix in x, and so on. I think that the Introduction and the Dataset Section should answer most of your questions.

xiaobai12345 commented 5 years ago

First, thank you for the reply. Second, do you have code that builds a dataset and shows how to train and test a GCN? Third, I really do not know how to convert my raw data (a few hundred samples) into the training data format my GCN needs. Thanks.

rusty1s commented 5 years ago

Well, the examples/gcn.py file is your best friend then :) For it to work on your data, you just need to swap out the data object. So for example:

import torch
from torch_geometric.data import Data

# Placeholder sizes and random tensors; substitute your own graph here.
num_nodes, num_edges, num_features = 300, 1000, 16
edge_index = torch.randint(0, num_nodes, (2, num_edges), dtype=torch.long)  # shape [2, num_edges]
x = torch.randn(num_nodes, num_features)  # node feature matrix, shape [num_nodes, num_features]

# Select train, val and test nodes.
perm = torch.randperm(num_nodes)
train_mask = torch.zeros(num_nodes, dtype=torch.bool)
train_mask[perm[0:100]] = True
val_mask = torch.zeros(num_nodes, dtype=torch.bool)
val_mask[perm[100:200]] = True
test_mask = torch.zeros(num_nodes, dtype=torch.bool)
test_mask[perm[200:]] = True  # [200:] rather than [200:-1], so the last node is not dropped

data = Data(x=x, edge_index=edge_index, train_mask=train_mask, val_mask=val_mask, test_mask=test_mask)

# Now follow with the gcn.py code...
xiaobai12345 commented 5 years ago

Thanks, but my question is different from gcn.py. Let me state it.

I have a dataset that I made myself, and I want to use a graph network to classify it (it has 5 classes). The dataset consists of pictures, and every picture contains two human heads. I need to classify the orientation of each head. My idea is to build a graph from the two heads in each picture, construct a graph network, and classify the data. But I don't know how to combine the labels and the head data into a dataset that a graph network can train and test on. Could you help me or give some suggestions?

xiaobai12345 commented 5 years ago

There are two heads in each picture. Each head has its own label, and there are five head categories. The data and labels of the two heads in each picture are made into a dataset. That is to say, each graph has two nodes, and each node has its own label. The model should be trained on this data and then predict the category of each person's head in the test set. This is the general task. My question is how to build the training dataset.

rusty1s commented 5 years ago

So to summarize, you have a dataset of images, each image contains two heads, and you want to predict the orientation of each head? I am not sure if deep graph learning is the way to go here. To me, it seems much more natural to find the objects in the image and to classify them separately afterwards. Relational learning only helps if the orientations of the heads relate to each other.

If you want to proceed with your idea, I suggest creating the graph structure on the fly. Given that your model has detected bounding boxes, you can then create a fully-connected graph and pass messages along those edges using graph neural networks like NNConv. In the end, your dataset should just contain the images, plus the bounding boxes and downstream labels for each head.

xiaobai12345 commented 5 years ago

"So to summarize, you have a dataset of images, each image contains two heads and you want to predict the orientation of each head?" Yes, that's right, although there may be three, four, or five heads in an image. Could you give me a demo for this task, such as how to make the dataset, construct the network, and run training and testing? I searched the internet and found no course or material covering this task, which leaves me stuck. In any case, thank you for everything!

xiaobai12345 commented 2 years ago

I have received your email and will read it carefully.

xzs

VijayIG commented 1 year ago

"In PyG, we present a graph in a Data object, which can hold any information of the graph in its attributes as a torch.tensor, e.g., sparse edge connectivity in edge_index, node feature matrix in x, and so on. I think that the Introduction and the Dataset Section should answer most of your questions."

Hey rusty1s, I am new to graph neural networks and need some guidance. I need to detect the table structure in a document. Can you guide me on how to do that? I need to know where to start, and I need to build a custom dataset.

Can you reply with your email address so that we can discuss this in detail? This is mine: vijayselvaraj13921@gmail.com

xiaobai12345 commented 1 year ago

I have received your email and will read it carefully.

xzs