Hi, interesting approach, but it is quite hard for me to evaluate your model. Personally, I would start with a simpler baseline and only add more sophisticated models if it later turns out to improve performance. If your model does not learn anything, it is likely that
Hmmm.
corrupted? I have extracted features from images like momentum, mean, max, etc. for every segment in the image. By visualizing it, do you mean common data visualization techniques? By evaluation, do you mean I should give these extracted features to an MLP and then look at the result? For the edge connectivity, i.e. to get the edge_index of each segmented image, I have used the following code with connectivity 8:

```python
import numpy as np
import numpy_groupies as npg
import scipy.sparse as sp


def segmentation_adjacency(segmentation, connectivity=4):
    """Generate an adjacency matrix out of a given segmentation."""
    assert connectivity == 4 or connectivity == 8

    # Mean (y, x) coordinate of every segment.
    idx = np.indices(segmentation.shape)
    ys = npg.aggregate(segmentation.flatten(), idx[0].flatten(), func='mean')
    xs = npg.aggregate(segmentation.flatten(), idx[1].flatten(), func='mean')
    ys = np.reshape(ys, (-1, 1))
    xs = np.reshape(xs, (-1, 1))
    points = np.concatenate((ys, xs), axis=1)

    nums, mass = np.unique(segmentation, return_counts=True)
    n = nums.shape[0]

    tmp = np.zeros((n, n), np.bool)

    # Vertically and horizontally adjacent segments.
    a, b = segmentation[:-1, :], segmentation[1:, :]
    tmp[a[a != b], b[a != b]] = True
    a, b = segmentation[:, :-1], segmentation[:, 1:]
    tmp[a[a != b], b[a != b]] = True

    if connectivity == 8:
        # Diagonally adjacent segments.
        a, b = segmentation[:-1, :-1], segmentation[1:, 1:]
        tmp[a[a != b], b[a != b]] = True
        a, b = segmentation[:-1, 1:], segmentation[1:, :-1]
        tmp[a[a != b], b[a != b]] = True

    # Symmetrize and convert to a sparse COO adjacency matrix.
    result = tmp | tmp.T
    result = result.astype(np.uint8)
    adj = sp.coo_matrix(result)

    return adj
```
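From this scipy matrix, the edge_index can then be obtained, for example along these lines (a sketch, assuming a PyTorch Geometric version that provides from_scipy_sparse_matrix):

```python
import torch
from torch_geometric.utils import from_scipy_sparse_matrix

adj = segmentation_adjacency(segmentation, connectivity=8)
edge_index, edge_weight = from_scipy_sparse_matrix(adj)  # edge_index has shape [2, num_edges]
```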
2) This might be the case, but I have used a higher-order GCN, which I think is a kind of general model for graph inputs, although I do not know exactly. By spatial info of a node, do you mean the edge_index, which I have calculated with the above code, or the edge_attr? If you mean edge_attr, then I guess it was not needed by this algorithm, so I did not calculate it; I also guess it is not mandatory to give edge_attr info to the model.
3) What do you mean by facing numerical instabilities? Is it compulsory to handle them?
Below I am attaching some results over the epochs:
Thank You! :)
OK, thanks! Since data.pos is also needed by the NNConv model, I am having difficulty understanding data.pos for a graph. The documentation says data.pos is a node position matrix with shape [num_nodes, num_dimensions]. But what does the second dimension ("num_dimensions") mean here? As far as I understood, the dimensions of a graph are the number of node features, but here it seems to be something different. Could you please enlighten me on this? Also, how can I generate these for the graph? Is there a predefined function, like the Cartesian transform?
Below I am attaching the result of what you suggested, drawing the graph on the image by taking the mean of the segments. To me it seems fine. What do you say?
Note: the image below is a retina of an eye.
data.pos denotes the position of nodes in Euclidean space, e.g., for processing point clouds. The position differs from the input features since you generally do not want to input absolute coordinates into your model. In your case, data.pos should be a [N, 2] tensor which holds for each node the mean coordinate of its segment. You already do this in the picture above.
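For example, a minimal sketch of filling data.pos with segment centroids (assuming the segment ids in `segmentation` are contiguous integers starting at 0):

```python
import numpy as np
import torch

ys, xs = np.indices(segmentation.shape)
seg = segmentation.ravel()
counts = np.bincount(seg)
pos = np.stack([np.bincount(seg, weights=ys.ravel()) / counts,
                np.bincount(seg, weights=xs.ravel()) / counts], axis=1)

data.pos = torch.from_numpy(pos).float()  # shape [N, 2], mean (y, x) per segment
```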
Ahh, I get it. Thank you for the quick responses. I will analyze all the points you made and will try to use the NNConv algorithm on my input data. I am keeping this issue open for further questions. Thank you for your help!
Hi,
I made the changes you suggested, like adding spatial information and so on. Now the loss and accuracy are changing, but the model gives very bad results, e.g. a per-class accuracy of tensor([0.3412, 0.6336]) (binary class problem (0, 1)). The loss comes down to 0.70, but the test accuracy still remains bad (49%).
Could it be the case that my features are noisy or not relevant for differentiating the two classes of my medical image dataset? I extracted these features using your Master's thesis code, in which the form_features_extraction.py file extracts 41 features from each segment in the image.
Could you please give me some reference on how to extract features from images efficiently, so that the graph neural network (NNConv) can find meaningful patterns in these features?
Thanks!
2 and 3: I cannot say that. You are free to try any other features which may help your model. Those are just the ones I tried in my thesis, although for MNIST the color feature is already sufficient.
Have you tried processing your data using traditional CNNs like ResNet? How do those models perform? In addition, this paper may be of interest to you.
Hmm, OK. No, I haven't tried using ResNet on my dataset. How will this help me? Also, I guess ResNet is not suitable for medical image classification. Thanks for the paper, I will read it. Is there any reference implementation of this paper?
Well, the superpixel approach is just another technique to process images, so I do not see why you could not apply traditional CNNs to your task. In the end, their accuracy should be equal or even better. I am curious why you think this is not suitable for medical images?
I do not think there is an open-source reference implementation, but you can contact the author and ask for it.
OK, I was thinking with respect to fine-tuning a ResNet model for medical images, since it is trained on ImageNet. But we can also use it without fine-tuning.
@rusty1s Thanks for pointing to my paper. I cannot release full code due to restrictions where I did this work, but I'll be happy to clarify details. @sachinsharma9780 I also have different pieces in my github: extraction of superpixels here and an example of learning multigraphs here. From these pieces and from the formulas in our paper it should be possible to build a similar model for your task.
But, Matthias' thesis is so awesome, so I would just use his code. You can then try to add hierarchical and other relationships from my paper to improve results.
Matthias, you should publish your thesis in English as a journal paper or at least a blog post. I should have cited it in my paper, but I wasn't aware of your work at that moment. Sorry about that.
I agree with Matthias that @sachinsharma9780 should first try CNN. People use ImageNet pretrained models for medical imaging even though ImageNet is very different, and it usually works great for mysterious reasons.
Sorry for my off topic comments :) If you need further help, please contact me.
Thanks for the reply. I'll have a look at the information you provided and will contact you if I have any doubts.
Thank you for your reply @bknyaz. It was a pleasure to read your paper. Actually, my master thesis was the methodical foundation of our SplineCNN paper, and we already applied it on superpixels in our MNIST experiment.
However, I do believe that a classical CNN is much more suited for those tasks (and is sadly way faster although it has a lot more data to consume), especially with the power of pre-training on ImageNet. Although ImageNet is quite different as you said, using the pre-trained model weights as initialization is, IMO, understandably more powerful than just using random initialization (even when applied to very different tasks), because the model has already learned so much about general vision. It can then proceed to reuse its knowledge for more specific tasks.
One thing I forgot to mention is that my dataset is really small, it has only 580 images, so I am performing my experiments on that. Could that also be a reason why I am not getting good results?
Impossible to tell, you need baselines like MLPs and CNNs to judge the performance of your model.
OK, cool. I will create a baseline and then compare the performance of both models, CNN and graph NNConv.
Hi, I implemented NNConv on my binary dataset. Now the loss is decreasing, but the training accuracy fluctuates around 50% and does not increase even after hundreds of epochs; it stays around 50%. I do not know what is happening. I changed the learning rate dynamically and increased the number of neurons in the fully connected layers, but I still get the same result. Below is the modified NNConv network I am using:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_geometric.transforms as T
from torch_geometric.nn import (NNConv, global_mean_pool, graclus, max_pool,
                                max_pool_x)
from torch_geometric.utils import normalized_cut


def normalized_cut_2d(edge_index, pos):
    row, col = edge_index
    edge_attr = torch.norm(pos[row] - pos[col], p=2, dim=1)
    return normalized_cut(edge_index, edge_attr, num_nodes=pos.size(0))


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # nn1/nn2 map edge features of shape [-1, num_edge_features] to
        # shape [-1, in_channels * out_channels], as required by NNConv.
        nn1 = nn.Sequential(nn.Linear(2, 10), nn.ReLU(),
                            nn.Linear(10, d.num_features * 128))
        self.conv1 = NNConv(d.num_features, 128, nn1, aggr='mean')

        nn2 = nn.Sequential(nn.Linear(2, 10), nn.ReLU(),
                            nn.Linear(10, 128 * 256))
        self.conv2 = NNConv(128, 256, nn2, aggr='mean')

        self.fc1 = torch.nn.Linear(256, 512)
        self.fc2 = torch.nn.Linear(512, d.num_classes)

    def forward(self, data):
        data.x = F.elu(self.conv1(data.x, data.edge_index, data.edge_attr))
        weight = normalized_cut_2d(data.edge_index, data.pos)
        cluster = graclus(data.edge_index, weight, data.x.size(0))
        data = max_pool(cluster, data, transform=T.Cartesian(cat=False))

        data.x = F.elu(self.conv2(data.x, data.edge_index, data.edge_attr))
        weight = normalized_cut_2d(data.edge_index, data.pos)
        cluster = graclus(data.edge_index, weight, data.x.size(0))
        x, batch = max_pool_x(cluster, data.x, data.batch)

        x = global_mean_pool(x, batch)
        x = F.elu(self.fc1(x))
        return F.log_softmax(self.fc2(x), dim=1)


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
crit = torch.nn.CrossEntropyLoss()
confusion_matrix = torch.zeros(d.num_classes, d.num_classes)  # accumulated over test() calls


def train(epoch):
    model.train()
    loss_all = 0.0
    train_correct = 0

    # Manual learning rate (and momentum) schedule.
    if epoch == 25:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.0001
    if epoch == 100:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.00001
            param_group['momentum'] = 0.6
    if epoch == 150:
        for param_group in optimizer.param_groups:
            param_group['lr'] = 0.000001
            param_group['momentum'] = 0.7

    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        data.y = torch.tensor(data.y, dtype=torch.long).to(device)
        output = model(data)
        loss = crit(output, data.y)
        loss.backward()
        loss_all += data.num_graphs * loss.item()  # loss.item() holds the scalar loss value
        optimizer.step()
        train_correct += output.max(1)[1].eq(data.y).sum().item()

    return loss_all / len(train_data_list), train_correct / len(train_data_list)


def test():
    model.eval()
    correct = 0
    with torch.no_grad():
        for data in test_loader:
            data = data.to(device)
            pred = model(data).max(1)[1]
            for t, p in zip(data.y.view(-1), pred.view(-1)):
                confusion_matrix[t.long(), p.long()] += 1
            data.y = torch.tensor(data.y, dtype=torch.long).to(device)
            correct += pred.eq(data.y).sum().item()
    return correct / len(test_data_list)


loss_plot, epoch_plot, test_acc_plot, train_acc_plot = [], [], [], []
for epoch in range(0, 400):
    loss, train_acc = train(epoch)
    train_acc_plot.append(train_acc)
    loss_plot.append(loss)
    epoch_plot.append(epoch)
    test_acc = test()
    test_acc_plot.append(test_acc)
    print('Epoch: {:02d}, Train_Loss: {:.5f}, Train_acc: {:.4f}, Test_acc: {:.4f}'.format(
        epoch, loss, train_acc, test_acc))
```
Is there any problem with my network? The preprocessing pipeline also seems fine to me. In the image you can see my training accuracy with respect to the epochs.
It would be easier to read your code if it were formatted correctly. Your code looks mostly correct to me, but for binary classification you should use a one-dimensional output and the BCEWithLogitsLoss. You can also comment out the graclus pooling calls and see if this improves the model. How do the baselines perform?
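A minimal sketch of that change (assuming the last layer is changed to torch.nn.Linear(512, 1) and forward() returns the raw logits; names follow the code above):

```python
import torch

# One-dimensional output, no sigmoid/softmax inside forward().
crit = torch.nn.BCEWithLogitsLoss()

logits = model(data).view(-1)        # raw scores of shape [num_graphs]
loss = crit(logits, data.y.float())  # targets as floats in {0, 1}
pred = (logits > 0).long()           # predicted class: threshold the logits at 0
```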
I am sorry for the unclean code, I was just trying different things. Initially, I used a sigmoid at the last layer with the BCE loss, but the model was giving constant results. Then I switched to softmax, but I think it doesn't make much of a difference. I have tested my dataset with a ResNet-50 architecture; there I also get around 56% test accuracy, but the training accuracy increases steadily, going close to 70% within 10 epochs, unlike with NNConv, where the training accuracy fluctuates so much, as you can see from the graph above.
I have also looked at the images of the two classes in my dataset and they look more or less the same (do you think this could be the reason for the bad results?). But along with each image I also have metadata for each patient (like age, sex, ethnicity, etc.), and I haven't incorporated this data into the network yet; I am only training on the images. Maybe this metadata can make some difference, I don't know yet. What's your take?
Hi, any suggestions on the above comment?
Sorry, I missed your previous comment. Adding metadata to your model sounds reasonable. You can do this by concatenating the features to your CNN output.
If your images look the same though, and you cannot even distinguish them as an expert, I guess it is quite hard for a CNN to distinguish them as well. Think about what may indicate the separation of your classes, and how a model may be able to learn to recognize those to improve your model.
Thanks for the reply. Could you point me to any reference paper or code for this idea of adding metadata to the model by concatenating the features to the CNN output? That would be really helpful.
No specific paper in mind. I guess it is just the standard approach to add global information to your CNN.
Hi, sorry for the late reply, I was busy with exams! Regarding your comment above on concatenating features to the CNN output: where and how should I concatenate the metadata with the feature embeddings learned by the CNN?
My idea is: I have two types of data, image data and metadata. First, the image data is passed into a CNN, from which feature embeddings are learned, and I store these feature embeddings. Then I represent these feature embeddings as node features in a graph, with the metadata represented as edge weights between nodes, which turns the problem into a kind of semi-supervised learning using a simple GCN network. Do you think this approach seems reasonable?
Thanks!
Hi, I do not see how you can create a graph out of your metadata (e.g. age or sex). What would your graph look like? I believe you can simply learn the CNN embeddings end-to-end, but before making the final predictions via an MLP, you concatenate the metadata to your embeddings:
```
CNN -------+
           |
           +---> MLP ---> output
           |
Metadata --+
```
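A minimal sketch of this layout (the names and dimensions are illustrative; `cnn_backbone` is assumed to output a flat [batch, embed_dim] embedding):

```python
import torch
import torch.nn as nn


class CNNWithMetadata(nn.Module):
    def __init__(self, cnn_backbone, embed_dim=512, meta_dim=3, num_classes=2):
        super().__init__()
        self.backbone = cnn_backbone  # e.g. a ResNet with its final fc layer removed
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim + meta_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, image, metadata):
        emb = self.backbone(image)             # [batch, embed_dim]
        x = torch.cat([emb, metadata], dim=1)  # [batch, embed_dim + meta_dim]
        return self.mlp(x)
```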
I am trying to do something like in the image, but there the features extracted from the image are handcrafted specifically for that problem. The edge weights are basically calculated from the similarity between two nodes using the metadata, and then a simple GCN is applied, which makes the problem semi-supervised.
My idea is: instead of handcrafting features from the image, what if we learn feature embeddings with a CNN and represent those feature embeddings as nodes, with the rest staying the same as above?
```
CNN ---- feature embeddings ---- represented as nodes in the graph ----+
                                                                       |
                                                                       +---> GCN ---> output
                                                                       |
Metadata ---- edge weights between nodes via a similarity measure -----+
```
Reference: https://arxiv.org/pdf/1806.01738.pdf
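A rough sketch of this construction (illustrative names: `embeddings` of shape [num_patients, d] from a pre-trained CNN, `metadata` of shape [num_patients, meta_dim]):

```python
import torch
from torch_geometric.data import Data

num_patients = embeddings.size(0)
row, col, weights = [], [], []
for i in range(num_patients):
    for j in range(num_patients):
        if i != j:
            row.append(i)
            col.append(j)
            # Edge weight: similarity of the two patients' metadata vectors.
            weights.append(torch.cosine_similarity(metadata[i], metadata[j], dim=0))

edge_index = torch.tensor([row, col], dtype=torch.long)
edge_weight = torch.stack(weights)

# Node features are the CNN embeddings; labels y would only be known for some nodes.
data = Data(x=embeddings, edge_index=edge_index, edge_weight=edge_weight)
```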
Interesting idea. I guess you can do this. While you technically can train this network end-to-end, it might be easier to just use the output of a pre-trained CNN.
Hmm, yeah, I was also thinking of using a pre-trained CNN. Thanks, I will try this approach and let you know about the results.
If I am using a ResNet-50 architecture pre-trained on ImageNet (which is giving me good results), then from which layer should I extract the features? Usually the parameters of the initial layers are frozen and we only train the fully connected layers. So do I need to extract the feature embedding from the first fully connected layer?
I suggest using the feature embeddings obtained just before the final fully connected prediction takes place. But I am no expert on this, so I suggest you also consult the relevant literature.
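For example, a minimal sketch with torchvision (assuming a standard ResNet-50; `images` is a hypothetical batch tensor):

```python
import torch
import torchvision.models as models

# Drop the final fully connected layer so the model returns the 2048-dim
# embedding produced by global average pooling.
resnet = models.resnet50(pretrained=True)
feature_extractor = torch.nn.Sequential(*list(resnet.children())[:-1])
feature_extractor.eval()

with torch.no_grad():
    emb = feature_extractor(images)  # [batch, 2048, 1, 1]
    emb = emb.flatten(1)             # [batch, 2048]
```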
OK, thanks for the help!
Hi, one small doubt regarding the data handling of graphs. I am confused about how, when we add edge attributes in data.edge_attr, data.edge_attr knows which edge attribute belongs to which pair of nodes. For example, let's say we have a complete graph with 3 nodes and 3 edges, and I want to find the cosine similarity between the feature vectors of each pair of nodes. How does data.edge_attr encode that an edge feature belongs to those two particular nodes, given that data.edge_attr only takes edge features as input with shape [num_edges, num_edge_features]?
Thank You!
edge_attr follows the ordering defined in edge_index. For example, the first entry in edge_attr corresponds to the edge defined in edge_index[:, 0].
Hmm, OK. So if this is the case, then for the above example of a complete graph with 3 nodes (undirected, 3 edges), do we need to provide edge attribute info for every entry (which will basically be 6 entries), e.g. for these 6 pairs: source = [0, 0, 1, 1, 2, 2], target = [1, 2, 0, 2, 0, 1], creating an edge_attr of size [6, 1], even though we only have 3 edges in the graph?
Yes :)
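For instance, for this 3-node graph the tensors could look like this (a small sketch with made-up attribute values):

```python
import torch

# Every undirected edge is stored once per direction; edge_attr follows the
# same column order as edge_index.
edge_index = torch.tensor([[0, 0, 1, 1, 2, 2],
                           [1, 2, 0, 2, 0, 1]])
edge_attr = torch.tensor([[0.9],   # edge 0 -> 1
                          [0.4],   # edge 0 -> 2
                          [0.9],   # edge 1 -> 0 (same attribute as 0 -> 1)
                          [0.7],   # edge 1 -> 2
                          [0.4],   # edge 2 -> 0
                          [0.7]])  # edge 2 -> 1
```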
OK! Thanks!
Hi,
while running the GCN algorithm for my problem I got the following error:

RuntimeError: Subtraction, the `-` operator, with a bool tensor is not supported. If you are trying to invert a mask, use the `~` or `bitwise_not()` operator instead.

I followed the instructions given in this issue: https://github.com/rusty1s/pytorch_geometric/issues/616 but still cannot resolve it by updating torch_cluster. Is there any other way besides cloning the repository?
mask.sum().item() / mask.size(0) yields the split percentage.

OK, so in the second point we still need to provide features, a label and edge attributes for every node, and what the masking does is basically disable a node so that we can test our performance on that masked node (so it just randomly masks nodes for the train, val and test sets)?
I am not sure what you mean with "disable that node". The masking tensors just define on which nodes we want to train our parameters (train_mask), validate them (val_mask) and test them (test_mask).
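A minimal sketch of how such masks can be built (assuming a toy graph with 8 nodes):

```python
import torch

num_nodes = 8
perm = torch.randperm(num_nodes)

train_mask = torch.zeros(num_nodes, dtype=torch.bool)
val_mask = torch.zeros(num_nodes, dtype=torch.bool)
test_mask = torch.zeros(num_nodes, dtype=torch.bool)
train_mask[perm[:5]] = True  # 5 nodes contribute to the training loss
val_mask[perm[5:6]] = True   # 1 node for validation
test_mask[perm[6:]] = True   # 2 nodes for testing

# During training, the loss is only computed on the training nodes, e.g.:
# loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
```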
Hi, I created a graph data structure similar to the one mentioned in your documentation as input to the GCN algorithm. It works fine when I do not include edge_weights/edge_attr, but when I try to use edge weights it gives me the error below. I tried to debug it but was not successful. Any help? The following is the error:
```
edge_index shape: torch.Size([2, 56])
edge_attr shape: torch.Size([56, 1])
label shape torch.Size([8])
graph data Data(edge_attr=[56, 1], edge_index=[2, 56], test_mask=[8], train_mask=[8], val_mask=[8], x=[8, 512], y=[8])
train_massk tensor([ True,  True,  True,  True,  True, False, False, False])
test_mask tensor([ True,  True, False, False, False, False, False, False])
val_mask tensor([ True,  True, False, False, False, False, False, False])
ndoe features 512
train_mask sum 5
val_mask sum 2
test_mask sum 2
Traceback (most recent call last):
  File "create_graph_data_structure.py", line 141, in
```
Try out edge_weight.flatten().
Thank you very much, it is working now!
Hi, I want to know what data.pos should be for a graph like the one above. If I am making a graph on my own with node features and edge features, do I need to compute data.pos? In the case of an image we basically use the centroid of a segment, but what about a graph like the one above? Should it be data.pos = [0, 1, 2]^T?
Thanks
The data.pos attribute is optional and should only be used if nodes have a position in Euclidean space (like point clouds or superpixel centroids). You can simply omit it in the example above.
OK. Actually, I was using NNConv for graph classification, and there the normalized cut function asks for data.pos as input, which is why I asked. But I guess in my case I have one-dimensional edge attributes and I can just pass the number of nodes in the num_nodes argument, e.g. normalized_cut(edge_index, edge_attr, num_nodes=64).
One more question: which graph algorithm is good for a graph classification problem? My graphs have node features and 1-D edge attributes. The one I know is NNConv (edge-conditioned convolution). Can you suggest some others?
❓ Questions & Help
Hi, I am using the enzymes_topk_pool (ETP) algorithm for medical image classification. I have created features out of the images and converted them into the data format accepted by the PyTorch Geometric data loader. But when I feed these features to the ETP algorithm, the model is not able to learn anything: the training and test loss do not change from the first epoch until the end, everything remains constant. More info: it is a binary classification problem. Below I am attaching a small script so that you get an idea.
```python
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 41 = number of features
        ...

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, verbose=True)
crit = torch.nn.BCELoss()

def train(epoch):
    model.train()
    ...

from sklearn.metrics import roc_auc_score

def evaluate(loader):
    model.eval()
    ...

for epoch in range(1, 201):
    loss = train(epoch)
    train_auc = evaluate(train_loader)
    test_auc = evaluate(test_loader)
    train_acc = test(train_loader)
```
Note: For feature extraction from the images I used your Master's thesis code. I only used the form_features_extraction.py file and the adjacency.py file, but not the feature_selection and coarsening files. Are they also needed to create features? Currently, I have 41 features for every node in the image.
Thanks in advance!