The CE metrics computed with the CheXpert labeler are as follows: Precision: 0.324, Recall: 0.199, F1: 0.196
I guess I might have gone wrong at one step or another.
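For reference, this is roughly how I exported the reports and ran the labeler (a minimal sketch under my own assumptions: `test_outputs.csv` is a hypothetical dump of the checkpoint's test outputs with `pred`/`gt` columns, and the `label.py` flags are from the stanfordmlgroup/chexpert-labeler README as I recall them, so please correct me if they differ):

```python
import pandas as pd

# Hypothetical dump of the released checkpoint's test-set outputs:
# one row per study, with a generated report ("pred") and a reference ("gt").
df = pd.read_csv("test_outputs.csv")

# The CheXpert labeler expects a headerless CSV with one report per row.
df["pred"].to_csv("reports_pred.csv", index=False, header=False)
df["gt"].to_csv("reports_gt.csv", index=False, header=False)

# Then, per the labeler's README (run once per file):
#   python label.py --reports_path reports_pred.csv --output_path labeled_reports_vlci.csv
#   python label.py --reports_path reports_gt.csv --output_path labeled_reports_vlci_gt.csv
```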
We adopt the CheXpert labeler for the CE metrics, and I will check the model and results soon.
Thanks for your reply. I just used the given checkpoint for inference; the attached file contains the predicted reports. Perhaps you could review it. I suspect I may have made an error in calculating the CE metrics.
Hi, have you finished checking? I'm eager to know where I went wrong.
Hi, Sorry to bother you again.
labeled_reports_vlci.csv labeled_reports_vlci_gt.csv
The attached files above are the predicted and ground-truth reports produced with the given code and checkpoint (together with the labels extracted by the CheXpert labeler). The CE metrics are as follows:
'F1_MACRO': 0.1964923716414105, 'F1_MICRO': 0.3627544833748166, 'PRECISION_MACRO': 0.3236966163372005, 'PRECISION_MICRO': 0.45363128491620114, 'RECALL_MACRO': 0.19916758254963002, 'RECALL_MICRO': 0.30221182475542324
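For transparency, the numbers above were computed roughly like this (a minimal sketch using pandas and scikit-learn; the CSV layout and the mapping of uncertain/-1.0 and blank labels to 0 are my assumptions and may not match your evaluation script):

```python
import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score

def to_binary(csv_path):
    """Map CheXpert labeler output to a binary label matrix.

    Assumptions: the first column holds the report text and the remaining 14
    columns hold the observations; positive (1.0) -> 1, everything else
    (negative 0.0, uncertain -1.0, blank) -> 0. A different uncertainty
    policy changes precision/recall noticeably.
    """
    df = pd.read_csv(csv_path)
    return (df.iloc[:, 1:].fillna(0) == 1).astype(int).to_numpy()

y_true = to_binary("labeled_reports_vlci_gt.csv")
y_pred = to_binary("labeled_reports_vlci.csv")

results = {}
for avg in ("macro", "micro"):
    results[f"PRECISION_{avg.upper()}"] = precision_score(y_true, y_pred, average=avg, zero_division=0)
    results[f"RECALL_{avg.upper()}"] = recall_score(y_true, y_pred, average=avg, zero_division=0)
    results[f"F1_{avg.upper()}"] = f1_score(y_true, y_pred, average=avg, zero_division=0)
print(results)
```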
I want to know: 1. whether the predicted reports are correct and the same as yours; and 2. whether the extracted clinical labels are the same as yours.
I eagerly await your insights on this matter. Best
It is different, but I still don't know why 😂
So, our predicted reports are the same, but the labels extracted by CheXpert are different? Can you give me your predicted reports with the labels extracted by CheXpert, so that I can check the results?
Hi Weixing, thank you for generously sharing the open-source code. However, I have had trouble reproducing the clinical metrics (i.e., F1) reported in your paper using the provided checkpoint on the MIMIC-CXR dataset.
In computing the CE metrics, I suspect I may have gone wrong at one step or another.
To elaborate, using your pretrained VLCI model on the MIMIC-CXR dataset, the NLP and clinical metrics we obtained are as follows: BLEU4: 0.113, METEOR: 0.144, ROUGE_L: 0.276, CIDEr: 0.174
Precision: 0.314 Recall: 0.181 F1: 0.179
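For completeness, the NLP metrics were obtained with a pycocoevalcap-style evaluation (a minimal sketch; the tokenization and the exact evaluation code in this repo may differ, so treat the details as my assumptions):

```python
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

def nlp_metrics(gts, res):
    """gts/res: dicts mapping a report id -> [single reference / generated string]."""
    bleu, _ = Bleu(4).compute_score(gts, res)      # list of BLEU-1..4
    meteor, _ = Meteor().compute_score(gts, res)   # requires Java on the PATH
    rouge, _ = Rouge().compute_score(gts, res)
    cider, _ = Cider().compute_score(gts, res)
    return {"BLEU4": bleu[3], "METEOR": meteor, "ROUGE_L": rouge, "CIDEr": cider}

# Hypothetical usage with two aligned lists of report strings:
# gts = {i: [gt] for i, gt in enumerate(gt_reports)}
# res = {i: [pred] for i, pred in enumerate(pred_reports)}
# print(nlp_metrics(gts, res))
```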
Here are a few key points:
labeled_reports_gts.csv labeled_reports_res.csv
I eagerly await your insights on this matter. Best