Open ZhiyuanChen opened 1 year ago
Hi @ZhiyuanChen , thanks for creating this issue! Could you point to memory increases if using multiple metrics like AUROC and AUPRC?
If so, as pointed out, this implementation can be added outside of the core library using the Metric interface & toolkit. Are you interested in having this implementation available directly in the torcheval library?
If so, this would need to be tightly scoped to metrics like AUROC and AUPRC which store exactly the same states, e.g. inputs and targets. I don't think we should generalize this further as the input checking & validation can get very messy. We can reuse the functional metrics code.
Are you interested in contributing this to the library?
Thank you for your kind reply.
Could you point to memory increases if using multiple metrics like AUROC and AUPRC?
I'll need to double-check. For a standard image classification task, the output size is usually (1000000, 1000) while the target size is (1000000). The memory increase is actually not the crucial part; the synchronisation is. It takes a long time to sync across the whole cluster, especially when you are running with 256 or more GPUs.
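To put the shapes above in perspective, here is a rough back-of-the-envelope estimate of how much state each metric would hold if it kept its own copy of the inputs and targets. The dtypes (float32 scores, int64 labels) are assumptions; the thread does not state them.

```python
# Rough estimate only; float32 scores and int64 labels are assumed, not stated in the thread.
num_samples, num_classes = 1_000_000, 1000
bytes_per_score = 4  # float32 (assumption)
bytes_per_label = 8  # int64 (assumption)

inputs_bytes = num_samples * num_classes * bytes_per_score
targets_bytes = num_samples * bytes_per_label
per_metric_gib = (inputs_bytes + targets_bytes) / 2**30

# Each metric that keeps a private copy multiplies both the stored state
# and the amount of data that has to be gathered across ranks.
for num_metrics in (1, 2, 4):
    total = num_metrics * per_metric_gib
    print(f"{num_metrics} metric(s) with separate state: {total:.2f} GiB")
```

Under these assumptions a single copy is a little under 4 GiB, so every additional metric with private state adds roughly that much again to what must be synchronised.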
If so, as pointed out, this implementation can be added outside of the core library using the Metric interface & toolkit. Are you interested in having this implementation available directly in the torcheval library?
Are you interested in contributing this to the library?
Yes, I think it's a common usage and would like to contribute to torcheval.
If so, this would need to be tightly scoped to metrics like AUROC and AUPRC which store exactly the same states, e.g. inputs and targets. I don't think we should generalize this further as the input checking & validation can get very messy. We can reuse the functional metrics code.
I think there are more, for example, we use Pearson, Spearman, R^2 and RMSE for regression tasks. But I agree they should be limited to specific input/targets.
Hi @ZhiyuanChen, this makes sense to us! Feel free to make a pull request!
Sure thing. I'll need some time to improve this code. Currently it syncs all states when computing average scores, which is unnecessary. I will have two lists and use merge_state to sync incrementally.
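For readers following along, the idea discussed in this thread can be sketched roughly as below. This is a minimal pure-Python illustration, not torcheval code and not the author's implementation: the class name `SharedStateMetrics` and the helpers `rmse` and `r2_score` are hypothetical, plain lists stand in for tensors, and no distributed communication is modelled. It only borrows the `update` / `merge_state` / `compute` shape of a metric interface.

```python
import math
from typing import Callable, Dict, Iterable, List

MetricFn = Callable[[List[float], List[float]], float]


def rmse(inputs: List[float], targets: List[float]) -> float:
    """Root mean squared error over paired predictions and targets."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(inputs, targets)) / len(targets))


def r2_score(inputs: List[float], targets: List[float]) -> float:
    """Coefficient of determination (assumes targets are not all identical)."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((y - x) ** 2 for x, y in zip(inputs, targets))
    ss_tot = sum((y - mean_t) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot


class SharedStateMetrics:
    """Hypothetical container: one copy of inputs/targets, many metric functions."""

    def __init__(self, **metric_fns: MetricFn) -> None:
        self.metric_fns: Dict[str, MetricFn] = metric_fns
        self.inputs: List[float] = []
        self.targets: List[float] = []

    def update(self, inputs: List[float], targets: List[float]) -> "SharedStateMetrics":
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets must have the same length")
        # The state is stored once, no matter how many metrics are registered.
        self.inputs.extend(inputs)
        self.targets.extend(targets)
        return self

    def merge_state(self, others: Iterable["SharedStateMetrics"]) -> "SharedStateMetrics":
        # Stand-in for a cross-rank sync: the shared state is merged a single time.
        for other in others:
            self.inputs.extend(other.inputs)
            self.targets.extend(other.targets)
        return self

    def compute(self) -> Dict[str, float]:
        # Every metric reads the same stored inputs and targets.
        return {name: fn(self.inputs, self.targets) for name, fn in self.metric_fns.items()}


rank0 = SharedStateMetrics(rmse=rmse, r2=r2_score).update([1.0, 2.0], [1.0, 3.0])
rank1 = SharedStateMetrics(rmse=rmse, r2=r2_score).update([3.0, 5.0], [3.0, 5.0])
scores = rank0.merge_state([rank1]).compute()
print(scores)
```

The point of the design is that the cost of storing and merging state is paid once per container rather than once per metric; how the merge would be made incremental in a real distributed setting is left open here.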
🚀 The feature
Metrics class that shares the state for multiple metrics.
Motivation, pitch
Usually we need to compute multiple metrics for a task, and it is very inefficient to store multiple copies of inputs & targets for each task. It would be preferable to have a metric class that shares the states across multiple metrics.
Alternatives
No response
Additional context