Closed hhhhhy60 closed 11 months ago
It seems CGTN has higher performance on the Kaggle dataset: https://www.ijcai.org/proceedings/2022/0633.pdf
In CGTN's comparison tables, the numbers for Transformer-MD contain obvious errors, and the authors refuse to release their code and do not even reply to emails. I strongly doubt the authenticity of that work!
@warrior-yyyan I have the same suspicion, but mainly because of the paper's vague description of the model hyper-parameters and implementation details. Of course, I did not get a reply either. Do you mean Transformer-MD's result on the Essays dataset is wrong in CGTN's paper? The Kaggle result seems to be copied directly from the Transformer-MD paper. Could it be that this paper's authors (D-DGCN) had the same suspicion, and that is why they did not list CGTN as a baseline? I wonder how the authors explained that to the reviewers, or perhaps the reviewers simply did not know about CGTN.
@warrior-yyyan By the way, have you replicated the results of this paper yet? I could only reach a score of around 68 on the test set using the best checkpoint provided by the authors. Maybe there is a problem in my setup.
In CGTN, the Transformer-MD numbers on Kaggle are quoted directly, while the Essays numbers may have been reproduced by them, but the reported average is wrong: the paper gives 69.51, whereas averaging the per-dimension scores actually yields 70.51. I don't know whether it was a calculation error or a deliberate reduction, which is very confusing.
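The mismatch is easy to verify by recomputing the mean yourself. A minimal sketch (the scores below are placeholders, NOT CGTN's actual per-dimension numbers):

```python
def check_reported_average(per_dim_scores, reported_avg, tol=0.05):
    """Recompute the average of per-dimension scores and compare it
    with the average reported in the paper (within tolerance tol)."""
    actual = sum(per_dim_scores) / len(per_dim_scores)
    return round(actual, 2), abs(actual - reported_avg) <= tol

# Placeholder scores for the five dimensions -- NOT CGTN's actual numbers.
scores = [70.0, 71.0, 70.5, 71.0, 70.0]
actual, matches = check_reported_average(scores, 69.51)
print(actual, matches)  # 70.5 False
```

With CGTN's real per-dimension Essays scores plugged in, the same check would show whether 69.51 or 70.51 is the correct mean.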
Since I doubt the authenticity and I don't know much about contrastive learning, I chose not to reproduce it. By the way, you mean you reproduced it but the result was not good, right? Can you share your reproduction code so we can work on the problem together?
Sorry for the confusion, I meant D-DGCN. As for CGTN, I think it would be impossible to reproduce it based solely on the information in its paper. What I want to say is that I only got a result of about 68 when evaluating on the test set with the checkpoint provided by D-DGCN. I also tried running the training process again but only reached 69 at best. I'm wondering whether you have tried to reproduce D-DGCN and hit the same issue, or whether I have some wrong setting. And if you want to discuss CGTN further, you can add me on WeChat through my homepage; I have run some experiments.
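One possible source of a 1-2 point gap is the metric definition itself. A common convention in personality detection is to take, per trait, the average of the F1 scores of the two classes, and then average over traits; this sketch is that convention and not necessarily D-DGCN's exact metric code, so it is worth checking against the repo:

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for one class, treating `positive` as the positive label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Macro-F1 for one trait: mean of the F1 of both classes."""
    return (binary_f1(y_true, y_pred, 1) + binary_f1(y_true, y_pred, 0)) / 2

# Toy labels for a single trait, just to show the computation.
print(round(macro_f1([1, 0, 1, 1], [1, 0, 0, 1]), 4))  # 0.7333
```

If the checkpoint is scored with plain (positive-class-only) F1 instead of this two-class macro-F1, the number can easily shift by a point or two.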
Hi, we uploaded a new checkpoint for Kaggle (https://drive.google.com/file/d/1lUSZUUVExszKqkhc2pvA-TOBROICzh3n/view?usp=sharing). You can test it with seed==321.
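For the test to be reproducible, every random number generator has to be seeded before evaluation. A generic sketch (these are the standard seeding calls; the exact set D-DGCN uses may differ, so check its training script):

```python
import random

def set_seed(seed: int) -> None:
    """Seed Python's RNG, and numpy/torch if they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # no-op without CUDA
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass

set_seed(321)  # the seed suggested for this checkpoint
```

Note that cuDNN determinism flags can still leave some GPU ops nondeterministic, so small run-to-run differences are possible even with a fixed seed.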
Thank you for your reply! May I also ask about the first question, the one concerning CGTN? @djz233
Hello! We tried to retest CGTN for comparison, but unfortunately its authors did not release their code. Besides, given the large performance gap, we are not sure whether CGTN used the same data split as we did.
Hello, it seems the two checkpoints are exactly the same. @djz233 My script:

```python
import torch
import logging

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Compare the two released checkpoints tensor by tensor.
    model_state_dict_1 = torch.load('best_f1_dggcn_kaggle.pth')
    model_state_dict_2 = torch.load('best_f1_dggcn_kaggle_321.pth')
    result = []
    for name in model_state_dict_1:
        is_equal = torch.equal(model_state_dict_1[name], model_state_dict_2[name])
        result.append(is_equal)
    is_same = all(result)
    logging.info(is_same)
```

Output: `INFO:root:True`