Open jerrywu65 opened 5 years ago
@jerrywu65
Hi, thanks for the comments.
The current model is transductive, which means that test documents (without labels) are also included in the graph and in the training process.
Some inductive GCN models, such as FastGCN, can make predictions without rebuilding the graph and retraining. Please refer to https://github.com/yao8839836/text_gcn/issues/19; the code at the bottom should work, but its accuracy is lower than that of the transductive version.
A simple modification to the current code may also work, but I have not found one.
Thank you for your reply. I have already added my test documents (without labels) to the graph.
The graph contains nodes for the train documents (labeled), the vocabulary, and the test documents (unlabeled).
But I don't know how to get the labels of the test documents after I finish training the model, and that has bothered me for a long time.
I really look forward to your reply :)
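To make the graph layout described above concrete, here is a minimal sketch of how boolean masks could single out the labeled train documents and the unlabeled test documents. The node order (train documents, then vocabulary, then test documents), the function name `build_masks`, and the sizes are assumptions for illustration, not something confirmed in this thread.

```python
def build_masks(train_size, vocab_size, test_size):
    """Return (train_mask, test_mask) over all graph nodes.

    Assumed node order: train documents, vocabulary words, test documents.
    """
    total = train_size + vocab_size + test_size
    # True only for the labeled train document nodes.
    train_mask = [i < train_size for i in range(total)]
    # True only for the unlabeled test document nodes at the end.
    test_mask = [i >= train_size + vocab_size for i in range(total)]
    return train_mask, test_mask


# Hypothetical sizes: 3 train documents, 4 vocabulary words, 2 test documents.
train_mask, test_mask = build_masks(train_size=3, vocab_size=4, test_size=2)
print(sum(train_mask), sum(test_mask))  # 3 2
```

With masks like these, the loss would only be computed on the train nodes, while the test nodes still take part in message passing and receive output scores.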
@jerrywu65
In train.py, test_labels (see lines 145 and 150) should be the test labels.
Line 139: test_cost, test_acc, pred, labels, test_duration = evaluate(features, support, y_test, test_mask, placeholders)
I thought y_test would be the labels of the test documents, which I don't have. What should I use as y_test?
@jerrywu65
In build_graph.py, lines 299-309:
    ty = []
    for i in range(test_size):
        doc_meta = shuffle_doc_name_list[i + train_size]
        temp = doc_meta.split('\t')
        label = temp[2]
        one_hot = [0 for l in range(len(label_list))]
        label_index = label_list.index(label)
        one_hot[label_index] = 1
        ty.append(one_hot)
    ty = np.array(ty)
    print(ty)
one_hot can be an all-zero vector, because you don't know the value of "label = temp[2]"; y_test will then be an all-zero matrix. This will not affect the prediction results "test_pred" in train.py.
You can also fill "label = temp[2]" with a default label, such as the first one (e.g., '0' for the MR dataset).
Either way, "test_pred" still holds the prediction results, but the evaluation (lines 152-157 in train.py) will not be meaningful, since "test_labels" in train.py are default labels or all zeros.
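As a rough illustration of the suggestion above, the sketch below builds stand-in labels for unlabeled test documents and then reads predicted categories from per-node output scores. It uses plain Python lists instead of NumPy so it runs on its own; `placeholder_labels`, `predicted_labels`, and the sample scores are hypothetical and are not part of train.py or build_graph.py.

```python
def placeholder_labels(num_test_docs, num_classes, default_index=None):
    """Stand-in one-hot rows for test documents whose labels are unknown.

    default_index=None gives all-zero rows; an integer marks that class
    for every row (for example 0 for the first label in label_list).
    """
    rows = []
    for _ in range(num_test_docs):
        one_hot = [0] * num_classes
        if default_index is not None:
            one_hot[default_index] = 1
        rows.append(one_hot)
    return rows


def predicted_labels(scores, test_mask, label_list):
    """Map each test node's highest-scoring class index back to its label."""
    predictions = []
    for row, is_test in zip(scores, test_mask):
        if is_test:
            best = max(range(len(row)), key=row.__getitem__)  # argmax
            predictions.append(label_list[best])
    return predictions


label_list = ["0", "1"]
ty = placeholder_labels(num_test_docs=2, num_classes=len(label_list))

# Hypothetical model outputs for 4 nodes; only the last 2 are test documents.
scores = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8], [0.7, 0.3]]
test_mask = [False, False, True, True]
print(predicted_labels(scores, test_mask, label_list))  # ['1', '0']
```

The placeholder rows only keep the array shapes consistent. The predictions come from the model's scores alone, which is why any accuracy computed against the placeholders carries no information.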
Thank you very much. Your help is very important to me.
Hi Yao,
I applied your model to my own data, and it performs very well. This is a great model.
If I use this model to predict the categories of unlabeled documents, which part of the code should I change?
Many thanks in advance.