djz233 / D-DGCN

source code of Orders Are Unwanted: Dynamic Deep Graph Convolutional Network for Personality Detection (AAAI2023)

Why doesn't the baseline contain CGTN? #3

Closed: hhhhhy60 closed this 11 months ago

hhhhhy60 commented 1 year ago

It seems CGTN has higher performance on the Kaggle dataset. https://www.ijcai.org/proceedings/2022/0633.pdf

warrior-yyyan commented 1 year ago

> It seems CGTN has higher performance on the Kaggle dataset. https://www.ijcai.org/proceedings/2022/0633.pdf

In CGTN's comparison tables, the Transformer-MD numbers contain obvious errors, and the author refuses to disclose the code and does not even reply to emails. I strongly doubt the authenticity of this work!

hhhhhy60 commented 1 year ago

> It seems CGTN has higher performance on the Kaggle dataset. https://www.ijcai.org/proceedings/2022/0633.pdf
>
> In CGTN's comparison tables, the Transformer-MD numbers contain obvious errors, and the author refuses to disclose the code and does not even reply to emails. I strongly doubt the authenticity of this work!

@warrior-yyyan I have the same suspicion, but mainly because of its vague description of the model hyper-parameters and implementation details. Of course, I did not get a reply either. Do you mean Transformer-MD's result on the Essays dataset has problems in CGTN's paper? The Kaggle result seems to be copied directly from Transformer-MD's paper. Could it be that this paper's authors (D-DGCN) had the same suspicion, so they didn't list CGTN as a baseline? I wonder how the authors explained this to the reviewers, or whether the reviewers simply didn't know about CGTN.

hhhhhy60 commented 1 year ago

@warrior-yyyan By the way, have you replicated the results of this paper yet? I could only achieve a score of around 68 on the test set using the best checkpoint provided by the authors. Maybe there is a problem in my process.

warrior-yyyan commented 1 year ago

> It seems CGTN has higher performance on the Kaggle dataset. https://www.ijcai.org/proceedings/2022/0633.pdf
>
> In CGTN's comparison tables, the Transformer-MD numbers contain obvious errors, and the author refuses to disclose the code and does not even reply to emails. I strongly doubt the authenticity of this work!
>
> @warrior-yyyan I have the same suspicion, but mainly because of its vague description of the model hyper-parameters and implementation details. Of course, I did not get a reply either. Do you mean Transformer-MD's result on the Essays dataset has problems in CGTN's paper? The Kaggle result seems to be copied directly from Transformer-MD's paper. Could it be that this paper's authors (D-DGCN) had the same suspicion, so they didn't list CGTN as a baseline? I wonder how the authors explained this to the reviewers, or whether the reviewers simply didn't know about CGTN.

In CGTN, the Transformer-MD numbers on Kaggle are quoted directly, while the Essays numbers may have been reproduced by them, but the average score is wrong: the paper reports 69.51, yet averaging the per-dimension scores works out to 70.51. I don't know whether it was a calculation error or a deliberate understatement, which is very confusing.

warrior-yyyan commented 1 year ago

> @warrior-yyyan By the way, have you replicated the results of this paper yet? I could only achieve a score of around 68 on the test set using the best checkpoint provided by the authors. Maybe there is a problem in my process.

Since I doubt its authenticity and I don't know much about contrastive learning, I chose not to reproduce it. By the way, you mean you reproduced it but the result is not good, right? Can you share your reproduction code so we can work on the problem together?

hhhhhy60 commented 1 year ago

> @warrior-yyyan By the way, have you replicated the results of this paper yet? I could only achieve a score of around 68 on the test set using the best checkpoint provided by the authors. Maybe there is a problem in my process.
>
> Since I doubt its authenticity and I don't know much about contrastive learning, I chose not to reproduce it. By the way, you mean you reproduced it but the result is not good, right? Can you share your reproduction code so we can work on the problem together?

Sorry for the confusion, I meant D-DGCN. As for CGTN, I think it would be impossible to reproduce it solely from the information provided in its paper. What I meant is that I only got about 68 when evaluating on the test set with the checkpoint provided by D-DGCN. I also tried running the training process again, but the best result was only around 69. I'm wondering whether you have tried to reproduce D-DGCN and encountered the same issue, or whether I have some wrong setting. And if you want to discuss CGTN further, you can add me on WeChat through my homepage; I have run some experiments.

djz233 commented 1 year ago

Hi, we have uploaded a new checkpoint for Kaggle (https://drive.google.com/file/d/1lUSZUUVExszKqkhc2pvA-TOBROICzh3n/view?usp=sharing). You can test it with seed==321.
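For anyone testing the checkpoint with seed 321: a minimal, generic seeding helper looks like the sketch below. This is an assumption about the usual PyTorch setup, not the repo's actual seeding code, which may differ.

```python
import random

def set_seed(seed: int) -> None:
    # Seed every RNG a typical PyTorch run touches. Generic pattern only;
    # the D-DGCN scripts may seed differently.
    random.seed(seed)
    try:
        import numpy as np
        import torch
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # numpy/torch unavailable; the stdlib RNG is still seeded

set_seed(321)
a = random.random()
set_seed(321)
assert a == random.random()  # same seed -> same draw
```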

hhhhhy60 commented 1 year ago

> Hi, we have uploaded a new checkpoint for Kaggle (https://drive.google.com/file/d/1lUSZUUVExszKqkhc2pvA-TOBROICzh3n/view?usp=sharing). You can test it with seed==321.

Thank you for your reply! May I ask about the first question: what was the reason for leaving CGTN out? @djz233

TaoYang225 commented 1 year ago

> Thank you for your reply! May I ask about the first question: what was the reason for leaving CGTN out? @djz233

Hello! We tried to retest CGTN for comparison, but unfortunately the authors did not release their code. Besides, given the large performance difference, we are not sure whether CGTN used the same data split as ours.

hhhhhy60 commented 1 year ago

> Hi, we have uploaded a new checkpoint for Kaggle (https://drive.google.com/file/d/1lUSZUUVExszKqkhc2pvA-TOBROICzh3n/view?usp=sharing). You can test it with seed==321.

Hello, it seems the two checkpoints are exactly the same. @djz233 My script:

```python
import logging

import torch

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)

    # Load both checkpoints and compare every tensor in the state dicts.
    model_state_dict_1 = torch.load('best_f1_dggcn_kaggle.pth')
    model_state_dict_2 = torch.load('best_f1_dggcn_kaggle_321.pth')

    result = []
    for name in model_state_dict_1:
        is_equal = torch.equal(model_state_dict_1[name], model_state_dict_2[name])
        result.append(is_equal)

    is_same = all(result)
    logging.info(is_same)
```

Output: `INFO:root:True`
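A quicker sanity check that needs no PyTorch is to hash the downloaded files: byte-identical checkpoints produce identical digests. A minimal sketch, using throwaway stand-in files rather than the real `.pth` checkpoints:

```python
import hashlib

def file_md5(path: str) -> str:
    # Hash the file in chunks so a large checkpoint never loads into memory.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    return h.hexdigest()

# Demo with two throwaway files standing in for the two .pth checkpoints.
with open('ckpt_a.bin', 'wb') as f:
    f.write(b'fake weights')
with open('ckpt_b.bin', 'wb') as f:
    f.write(b'fake weights')

print(file_md5('ckpt_a.bin') == file_md5('ckpt_b.bin'))  # True: byte-identical files
```

Note that equal hashes prove the files are the same download, while the `state_dict` comparison above would also catch the subtler case of distinct files holding numerically identical weights.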

TaoYang225 commented 1 year ago

@hhhhhy60 https://github.com/djz233/D-DGCN/issues/4#issuecomment-1624946160