Closed maqy1995 closed 4 years ago
When I set randomize to False, the shape of rows stays the same for a given batch_size. But the shape changes again when I set Config.batch_size to another value. The code is below (main.py line 114):
train_batcher = StreamBatcher(Config.dataset, 'train', Config.batch_size, randomize=True, keys=input_keys)
I guess the reason may be that StreamBatcher drops the last batch.
Thanks for your question. Yes, I agree with you: StreamBatcher might drop the last batch. If that is the reason, we can manually add the lost rows into the sparse graph.
Any idea how to manually add the lost rows into the sparse graph? Or how to fix StreamBatcher so it uses the full training set? The code in StreamBatcher is a bit hard for me to read... T T
I tried to modify the line below: https://github.com/JD-AI-Research-Silicon-Valley/SACN/blob/6f9831fdd02dec6b116e27a661aee483255c5f59/src/spodernet/spodernet/preprocessing/batching.py#L221 to:
self.num_batches = int(math.ceil(np.sum(config['counts']) / batch_size))
This works across different batch sizes when randomize=False, but it fails again when randomize=True.
Thanks for your reply. You may consider the remainder of np.sum(config['counts']) / batch_size. If the remainder is 0, it is fine. If not, you can add 1 to num_batches. This package comes from https://github.com/TimDettmers/spodernet, for your reference.
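The remainder check suggested above can be sketched like this (a minimal, standalone illustration rather than the actual spodernet code; `total` and `batch_size` are made-up values):

```python
import math

total = 1000       # hypothetical number of training triples
batch_size = 128   # hypothetical Config.batch_size

# Floor division drops the partial last batch (the original behavior).
num_batches = total // batch_size
if total % batch_size != 0:
    num_batches += 1  # keep the partial batch instead of dropping it

# Equivalent to the math.ceil one-liner proposed above.
assert num_batches == math.ceil(total / batch_size)
print(num_batches)  # → 8
```

Either form ensures every triple ends up in some batch, so no rows go missing from the graph.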
It seems ConvE also has the same problem, see:
https://github.com/TimDettmers/ConvE/issues/2
https://github.com/TimDettmers/ConvE/issues/25
and the "Quirks" section in the ConvE README: https://github.com/TimDettmers/ConvE
It looks like it's not easy to fix...
Another question: why do we need to do this: https://github.com/JD-AI-Research-Silicon-Valley/SACN/blob/6f9831fdd02dec6b116e27a661aee483255c5f59/models.py#L167 It seems to convert the adjacency matrix A into a symmetric matrix, but in my understanding the training set already uses reverse relations, so the adjacency matrix should already be symmetric when we first build it in main: https://github.com/JD-AI-Research-Silicon-Valley/SACN/blob/6f9831fdd02dec6b116e27a661aee483255c5f59/main.py#L152 Please correct me if I have misunderstood.
Hi. Thanks for finding the links, which I hadn't noticed before; they are helpful for answering this question. For the new question, you can check lines 134 to 144 in main.py. If I remember correctly, the graph created there is a nonsymmetric matrix, so we need this step.
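As a minimal sketch of what that symmetrization step does (using a toy dense NumPy matrix rather than the actual sparse tensors in models.py):

```python
import numpy as np

# Toy directed adjacency: a single edge 0 -> 1 (hypothetical graph).
A = np.zeros((3, 3))
A[0, 1] = 1.0

# If reverse relations were already present, A would equal A.T and this
# would be a no-op; otherwise it adds the missing reverse edges.
A_sym = np.maximum(A, A.T)

assert (A_sym == A_sym.T).all()   # now symmetric: edge 1 -> 0 exists too
```

So the step is harmless if the matrix is already symmetric, and necessary if (as the reply above says) the graph built in main.py is nonsymmetric.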
Excuse me again...
When I use FB15k-237, I get results similar to the paper, but on WN18RR my Hits@1 is much lower than the value in the paper. I use the recommended hyperparameters:
for the WN18RR dataset, set dropout to 0.2, number of kernels to 300, learning rate to 0.003, and embedding size to 200 for SACN.
After 1000 epochs I get Hits@1 = 0.34 (the paper reports 0.43), Hits@3 = 0.46, Hits@10 = 0.52, and MRR = 0.41.
What is the possible reason?
Hi. Thanks for your question. You should be able to reproduce similar results. You can tune the hyperparameters using the recommended values in https://github.com/JD-AI-Research-Silicon-Valley/SACN/blob/master/src/spodernet/spodernet/utils/global_config.py. If you get better results, I look forward to your sharing the hyperparameters.
Excuse me, could you please tell me the hyperparameters you used for FB15k-237? I just ran the code the author released, but got at most Hits@10 = 0.518 and MRR = 0.340 over 1000 epochs. In other words, I can't reproduce the reported results (0.54 and 0.35).
Sorry, I did not save the hyperparameters I used. From memory, I used the default hyperparameters (I'm not sure...), but I changed the batch size to 4096. I got the best test-set MRR around epoch 2296; in that epoch the full result was: TOP1: 0.2598, TOP3: 0.383964, TOP10: 0.5357, MRR: 0.35063.
Hi. Thanks for sharing. It is great to get your feedback. : )
Thanks for your reply! I will have a try.
Recently, I tried to remove spodernet and implement WGCN with DGL, but I can't reproduce the paper's results either (T T). This is my implementation and results: https://github.com/maqy1995/sacn_dgl
I just got similar results to yours... at most 0.526, never higher than 0.53.
Hi, thanks for sharing. DGL is a great and convenient tool. I'd like to recommend your implementation to other researchers.
In main.py lines 134-148, the shape of rows or columns changes when we run the process twice. Does that mean the training set is mutable?