Open mengweiwang opened 1 year ago
Hi,
Please see the updated README.md for the labels from Chen et al. https://github.com/aehrc/cvt2distilgpt2
I will look into the discrepancy with the results.
@anicolson
Yes, I am using this dataset, and precision has reached the level reported in the paper. However, recall is low and does not reach the reported level.
I have also checked and tested the updated source code. The CE metric results did not change much, and the latest source code raises an error when I run the `cvt_21_to_distilgpt2` task.
Line 281 of `transmodal.model.py` reads `if not getattr(self, metric).compute_on_step:`, and the error indicates that the `compute_on_step` attribute does not exist.
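A possible local workaround is sketched below. It assumes the error is a plain `AttributeError` because the metric object in the installed environment does not define `compute_on_step`; that cause has not been confirmed here. `LegacyMetric`, `CurrentMetric`, and `computes_on_step` are hypothetical names used only for illustration, and the fallback value is an assumption that should be checked against what the surrounding code expects.

```python
class LegacyMetric:
    """Hypothetical stand-in for a metric object that still has the attribute."""
    compute_on_step = False


class CurrentMetric:
    """Hypothetical stand-in for a metric object without the attribute."""


def computes_on_step(metric, default=True):
    # getattr with a third argument returns `default` instead of raising
    # AttributeError when the attribute is missing.
    # The default of True is an assumption, not the repository's behaviour.
    return getattr(metric, "compute_on_step", default)


for metric in (LegacyMetric(), CurrentMetric()):
    if not computes_on_step(metric):
        print(type(metric).__name__, "-> value is computed outside the step")
    else:
        print(type(metric).__name__, "-> value is computed on each step")
```

Applied to the line quoted above, the check would become `if not getattr(getattr(self, metric), "compute_on_step", True):`, again only as a stopgap until the maintainers fix it.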
Hi, there are some errors in the preprint; the correct results are reported in the updated repository.
I used the same data format as R2Gen and R2GenCMN (Chen et al.), i.e. the findings section only, as followed in this article, but I was unable to obtain the CE metric results reported in the paper.
I used the provided `epoch=8-val_chen_cider=0.425092.ckpt` model for the `cvt_21_to_distilgpt2` task and also tested the `epoch=0-val_chen_cider=0.410965.ckpt` model for the `cvt_21_to_distilgpt2_scst` task, but neither of them achieved the CE metric results reported in the paper.
For the CE metrics, `precision_macro` reaches the result reported in the paper, but `recall_macro` and `f1_macro` do not, and the difference is significant.
When calculating the CE metrics here, only the text of the findings section is considered; do I need to perform any other processing?
The results obtained from the `cvt_21_to_distilgpt2` task are as follows:
{'test_ce_f1_example': 0.36598095297813416, 'test_ce_f1_macro': 0.2593880891799927, 'test_ce_f1_micro': 0.4408090114593506, 'test_ce_num_examples': 3858.0, 'test_ce_precision_example': 0.4171517491340637, 'test_ce_precision_macro': 0.3600466549396515, 'test_ce_precision_micro': 0.4919118881225586, 'test_ce_recall_example': 0.3665845990180969, 'test_ce_recall_macro': 0.25423887372016907, 'test_ce_recall_micro': 0.3993246555328369, 'test_chen_bleu_1': 0.39292487502098083, 'test_chen_bleu_2': 0.24805393815040588, 'test_chen_bleu_3': 0.17164887487888336, 'test_chen_bleu_4': 0.1269991397857666, 'test_chen_cider': 0.3902686834335327, 'test_chen_meteor': 0.15456412732601166, 'test_chen_num_examples': 3858.0, 'test_chen_rouge': 0.286588191986084}
The results in the paper are as follows: precision_macro: 0.3597, recall_macro: 0.4122, f1_macro: 0.3842
The results obtained from the `cvt_21_to_distilgpt2_scst` task are as follows:
{'test_ce_f1_example': 0.36484676599502563, 'test_ce_f1_macro': 0.26361414790153503, 'test_ce_f1_micro': 0.4410783648490906, 'test_ce_num_examples': 3858.0, 'test_ce_precision_example': 0.4175392985343933, 'test_ce_precision_macro': 0.3873042166233063, 'test_ce_precision_micro': 0.49624764919281006, 'test_ce_recall_example': 0.3643813729286194, 'test_ce_recall_macro': 0.2558453679084778, 'test_ce_recall_micro': 0.3969484865665436, 'test_chen_bleu_1': 0.39466917514801025, 'test_chen_bleu_2': 0.248764768242836, 'test_chen_bleu_3': 0.1718045324087143, 'test_chen_bleu_4': 0.1269892156124115, 'test_chen_cider': 0.37993040680885315, 'test_chen_meteor': 0.15499255061149597, 'test_chen_num_examples': 3858.0, 'test_chen_rouge': 0.28760746121406555}
To reproduce the above, I only modified the task parameters in `task/mimic_cxr_jpg_chen/jobs.yaml`.
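For anyone comparing the macro numbers above, here is a minimal sketch of how macro and micro precision, recall, and F1 are usually derived from binary multi-label matrices. It is not the repository's metric code; `prf`, `ce_scores`, `harmonic_mean`, and the toy label matrices are made up for illustration. It also shows that the mean of per-label F1 scores generally differs from the harmonic mean of macro precision and macro recall, which matters when checking which definition a reported `f1_macro` follows.

```python
def prf(tp, fp, fn):
    """Precision, recall, F1 from counts; 0.0 when a denominator is zero."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def harmonic_mean(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0


def ce_scores(y_true, y_pred):
    """y_true, y_pred: lists of rows, one 0/1 entry per label."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))

    # Macro: score each label separately, then average over labels.
    per_label = [prf(*c) for c in counts]
    macro = tuple(sum(s[k] for s in per_label) / n_labels for k in range(3))

    # Micro: pool the counts over all labels, then score once.
    micro = prf(*(sum(c[k] for c in counts) for k in range(3)))
    return {"macro": macro, "micro": micro}


# Toy data: 4 reports, 2 labels (values invented for this sketch).
y_true = [[1, 0], [1, 1], [1, 0], [0, 1]]
y_pred = [[1, 1], [1, 0], [0, 1], [0, 1]]

scores = ce_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = scores["macro"]
print("macro P/R/F1:", scores["macro"])
print("micro P/R/F1:", scores["micro"])
print("harmonic mean of macro P and R:", harmonic_mean(macro_p, macro_r))
```

As plain arithmetic on the numbers quoted in this thread: `harmonic_mean(0.3597, 0.4122)` rounds to 0.3842, the paper's `f1_macro`, whereas the harmonic mean of the reproduced `precision_macro` and `recall_macro` for `cvt_21_to_distilgpt2` is about 0.298, not the reported 0.2594. So the two F1 values may follow different definitions. This does not by itself explain the gap in `recall_macro`.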