Problem Summary
Decision curve analysis (DCA) is recommended in nearly every guideline on the evaluation of AI, including TRIPOD AI; see also https://mskcc-epi-bio.github.io/decisioncurveanalysis/literature.html#DCA__Machine_LearningAI. It is not currently in the package.
Impact
The current package does not include any metrics of clinical utility. Running it on a data set yields metrics such as discrimination and calibration, but these do not tell you whether the model is good enough to justify using the tool. For example, if an AI algorithm has better discrimination but worse calibration than a competitor, the existing metrics cannot tell you which one to use.
Possible Solution
Add in code for DCA (which has already been written).
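For reference, the quantity at the core of DCA is net benefit: at a threshold probability pt, net benefit = TP/n - (FP/n) * pt / (1 - pt), compared against the "treat all" and "treat none" (net benefit 0) strategies. The sketch below illustrates that calculation only. It is not the already-written DCA code mentioned above, and the function names (`net_benefit`, `net_benefit_treat_all`, `decision_curve`) are hypothetical, not part of this package.

```python
from typing import Dict, List, Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    Hypothetical helper for illustration; y_true holds 0/1 outcomes.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    if len(y_true) == 0 or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and the same length")

    n = len(y_true)
    # Count true and false positives among those classified as high risk.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient."""
    # Treating everyone is the same as predicting risk 1.0 for all patients.
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


def decision_curve(
    y_true: Sequence[int], y_prob: Sequence[float], thresholds: Sequence[float]
) -> List[Dict[str, float]]:
    """Net benefit of the model, treat-all, and treat-none at each threshold."""
    return [
        {
            "threshold": t,
            "model": net_benefit(y_true, y_prob, t),
            "treat_all": net_benefit_treat_all(y_true, t),
            "treat_none": 0.0,
        }
        for t in thresholds
    ]


if __name__ == "__main__":
    # Toy data, made up purely to show the call pattern.
    outcomes = [1, 0, 1, 0]
    risks = [0.9, 0.2, 0.6, 0.7]
    for row in decision_curve(outcomes, risks, [0.2, 0.5]):
        print(row)
```

A model is worth using at a given threshold only if its net benefit exceeds both reference strategies there, which is the comparison that discrimination and calibration alone cannot provide.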
Steps to Reproduce
NA
Suggested fix
See Possible Solution above.