1049451037 / GCN-Align

Code of the paper: Cross-lingual Knowledge Graph Alignment via Graph Convolutional Networks.

Relation matrix and training set split #3

Closed iceshzc closed 5 years ago

iceshzc commented 5 years ago
  1. Why does the relation matrix use a square root? Also, it seems that a_ij and a_ji are not necessarily equal, i.e., the matrix is asymmetric?
  2. Is the training set split the same as in the ISWC17 paper? From the code, it looks like the data is first shuffled with a random seed, and then the first X% is used for training and the remaining (100-X)% for testing. With the seed set to 12306, does this guarantee the same split as in ISWC17?
1049451037 commented 5 years ago

Hi, thank you for your concern. I will try to clarify my idea, but if you still have questions, feel free to comment.

  1. About the square root operation.

The idea of func and ifunc is to distinguish different relations by their statistical properties. For example, a one-to-one relation has func equal to 1.0, while a one-to-two relation has func equal to 0.5. To avoid func being too small, we clip it with a max function, which was the operation used in the original paper.

As for the square root operation, I just thought it would have the same effect as the max... I have now updated the code, since many researchers were puzzled by this operation (a small sketch of the weighting is given after this list).

  2. About the symmetry of A.

Yeah, it does not need to be symmetric.

  3. About the training set.

It is (with high probability) not the same as JAPE's. But I think that if the ISWC17 work chose its seeds randomly, the results should be statistically equivalent.
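
To make the weighting idea concrete, here is a minimal sketch of how func/ifunc and the max clipping could be computed from a plain triple list; the function names, triple format, and the 0.3 floor are assumptions for illustration only, not necessarily what this repo does:

```python
from collections import defaultdict

def relation_stats(triples):
    """func(r) = #distinct heads / #triples of r; ifunc(r) = #distinct tails / #triples of r."""
    heads, tails, count = defaultdict(set), defaultdict(set), defaultdict(int)
    for h, r, t in triples:
        heads[r].add(h)
        tails[r].add(t)
        count[r] += 1
    func = {r: len(heads[r]) / count[r] for r in count}
    ifunc = {r: len(tails[r]) / count[r] for r in count}
    return func, ifunc

def adjacency_weights(triples, func, ifunc, floor=0.3):
    """Accumulate edge weights, clipping each contribution with max(..., floor).

    a[(i, j)] and a[(j, i)] are built from different statistics (ifunc vs. func),
    so the resulting matrix is in general not symmetric.
    """
    a = defaultdict(float)
    for h, r, t in triples:
        a[(h, t)] += max(ifunc[r], floor)
        a[(t, h)] += max(func[r], floor)
    return a
```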

iceshzc commented 5 years ago

OK, thank you for your response. For Q.1 and Q.2, I think it's not a big issue. For Q.3, in fact, I also reviewed the source code of JAPE. The concrete datasets for the different training-size ratios are given inside each dataset, e.g., ../dbp15k/zh-en/0_3 means a 30% portion of training data for zh-en. As a result, I moved 'sup_ent_ids' and 'ref_ent_ids' from 0_3 to replace the train_data and test_data in your code.
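
For what it's worth, that swap could look roughly like this, assuming the JAPE split files contain one whitespace-separated id pair per line (the loader name is mine):

```python
def load_pairs(path):
    """Read one aligned entity pair (two integer ids) per line."""
    with open(path, encoding='utf-8') as f:
        return [tuple(map(int, line.split())) for line in f if line.strip()]

# Replace the shuffle-based split with JAPE's fixed 30% split (paths as in the comment above).
train_data = load_pairs('../dbp15k/zh-en/0_3/sup_ent_ids')  # 30% seed alignments for training
test_data = load_pairs('../dbp15k/zh-en/0_3/ref_ent_ids')   # remaining 70% for testing
```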

Unfortunately, when I set dim_se = 1000 and dim_ae = 100 for zh-en/0_3, your method cannot reproduce the promising results, whereas JAPE reproduces results similar to those reported in its paper. So I want to know how to tune the parameters for GCN-Align.

Thank you, and I look forward to your response.

1049451037 commented 5 years ago

Yes, that's a good question. We split the training set into two parts: training and evaluation. After the parameters are tuned, we use them to retrain on the full training set (since training seeds have a great influence in this scenario).

More recently, we also found that the loss value is a good criterion for alignment, which removes the need for evaluation seeds. This idea is similar to MUSE. I am doing more research on it, which may be presented in my bachelor's thesis later.
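
A rough sketch of that tune-then-retrain procedure; the 80/20 split, the seed, and the function names are assumptions for illustration only:

```python
import random

def tune_then_retrain(train_pairs, hyperparam_grid, train_fn, eval_fn,
                      dev_ratio=0.2, seed=12306):
    """Hold out part of the training seeds for tuning, then retrain on all of them."""
    pairs = list(train_pairs)
    random.Random(seed).shuffle(pairs)
    n_dev = int(len(pairs) * dev_ratio)
    dev, sub_train = pairs[:n_dev], pairs[n_dev:]

    # Pick hyperparameters by the score (e.g. Hits@1) on the held-out seeds.
    best = max(hyperparam_grid, key=lambda hp: eval_fn(train_fn(sub_train, hp), dev))

    # Retrain on the full seed set, since every training seed matters here.
    return train_fn(pairs, best), best
```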

iceshzc commented 5 years ago

Hi, I have tried my best to reproduce your results. Unfortunately, I cannot reproduce the promising numbers. In fact, I just found that "zh_en/ref_ent_ids" in your paper is not equal to "ref_ent_ids" + "sup_ent_ids" from "zh_en/0_3" in JAPE. I wonder how you constructed the whole labeled data, where the first 10500 lines are the same as "zh_en/0_3/ref_ent_ids" but the last 4500 lines are not the same as "zh_en/0_3/sup_ent_ids". On the other hand, I also used your dataset and changed the random_seed, and it seems to reproduce approximately the same results. One more thing: I don't know why you set the learning rate larger than 1 (e.g. lr=20 in your code). Thank you for your attention, and I look forward to your response.

1049451037 commented 5 years ago

Actually, you just need to run python train.py to reproduce the results. As time has gone by, we have found some tricks to get similar results with fewer parameters (e.g. a lower dimension for SE), and the code has been modified a little. If you insist on getting the promising results with the parameters in the paper, the structure of the model should be modified a little as well. For example, the weight matrices were initialized from a normal distribution in each layer, which caused some dead neurons; that is why SE needed 1000 dimensions to reach the promising results.

For the sup_ent_ids, the reason we reorder the ids is that the entity pairs are merged in JAPE's code. But in our code, we don't need to merge them in the data files. Therefore, we just split the merged entity pairs and give a new id to the extra entity. If you have doubts about the dataset, you can construct one from the original dbp15k, or wait a few days if you don't mind, because once my bachelor's thesis is finished, I will make a similar repo public with a dbp15k that is exactly the same as JAPE's.
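
To illustrate the reordering being described, here is a hypothetical sketch: each JAPE-style merged id (one id standing for an aligned pair) is split into the original id plus a freshly assigned id for the extra entity. The names are mine, not the repo's actual preprocessing.

```python
def split_merged_pairs(merged_ids, next_free_id):
    """Turn each merged id into a (kg1_id, new_kg2_id) pair, assigning a fresh id to the extra entity."""
    pairs = []
    for kg1_id in merged_ids:
        pairs.append((kg1_id, next_free_id))  # the second entity of the pair gets a new id
        next_free_id += 1
    return pairs, next_free_id
```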

For the learning rate, I think any value can be fine; it is just determined by tuning. For example, if the best lr for loss = a + b is 1, then the best lr for loss = (a + b)/2 will be 2.
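
In other words, scaling the loss by a constant only rescales the gradients, so plain SGD takes identical steps when the learning rate is scaled inversely; a tiny check:

```python
import numpy as np

grad = np.array([0.5, 0.3])           # gradient of loss = a + b at the current weights

step_full = 1.0 * grad                # lr = 1 on loss = a + b
step_half = 2.0 * (grad / 2.0)        # lr = 2 on loss = (a + b) / 2

assert np.allclose(step_full, step_half)  # the two SGD updates are identical
```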