Closed pfliu-nlp closed 1 year ago
Comment: I wonder if you'd be able to use a method for aggregation like I did here? https://github.com/neulab/ExplainaBoard/blob/main/explainaboard/metrics/nlg_meta_evaluation.py#L115-L141
It seems that it would then be possible to avoid removing that check.
@neubig (@odashi) I thought about this, but I don't think it would work, since once we perform
data.reshape((data.shape[0], data.shape[-2] * data.shape[-1]))
it's hard to recover the data's original shape. (In the above case, data.shape[-1] has a fixed dimension, hard-coded as 4. In the current case, it's dynamic.)
We need to figure out a way to fix this since it also blocks the PR: https://github.com/neulab/ExplainaBoard/pull/526.
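To make the concern above concrete, here is a minimal NumPy sketch of why the flattened array alone does not reveal the original trailing dimensions once the last one is no longer fixed. The array sizes are made up for illustration and are not ExplainaBoard's actual statistics.

```python
import numpy as np

# Two hypothetical stats arrays with different trailing shapes.
# The sizes (3, 4) and (2, 6) are assumptions for illustration only.
a = np.arange(2 * 3 * 4).reshape((2, 3, 4))
b = np.arange(2 * 2 * 6).reshape((2, 2, 6))

# The flattening step quoted in the comment above.
flat_a = a.reshape((a.shape[0], a.shape[-2] * a.shape[-1]))
flat_b = b.reshape((b.shape[0], b.shape[-2] * b.shape[-1]))

# Both arrays end up with the same flattened shape, so that shape
# by itself cannot tell us which trailing dimensions to restore.
same_flat_shape = flat_a.shape == flat_b.shape

# With a known, fixed last dimension (4, as in the linked code),
# the original shape can be recovered.
restored_a = flat_a.reshape((flat_a.shape[0], -1, 4))
```

When the last dimension is dynamic, the `4` in the final line has to come from somewhere other than the flattened array.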
@pfliu-nlp For a quick fix, you can also store the size of the dimension as another stat.
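A rough sketch of what that quick fix could look like, assuming the size of the last dimension is appended as one extra column next to the flattened values. The helper names `flatten_with_size` and `restore` are hypothetical and are not ExplainaBoard functions.

```python
import numpy as np


def flatten_with_size(data: np.ndarray) -> np.ndarray:
    """Flatten the last two dims and append the last dim's size as an extra column.

    Hypothetical helper for illustration only.
    """
    n = data.shape[0]
    flat = data.reshape((n, data.shape[-2] * data.shape[-1])).astype(float)
    size_col = np.full((n, 1), data.shape[-1], dtype=float)
    return np.concatenate([flat, size_col], axis=1)


def restore(flat_with_size: np.ndarray) -> np.ndarray:
    """Recover the 3-D array using the size stored in the last column."""
    last_dim = int(flat_with_size[0, -1])
    flat = flat_with_size[:, :-1]
    return flat.reshape((flat.shape[0], -1, last_dim))


# Made-up example: 2 samples, trailing shape (3, 5).
data = np.arange(2 * 3 * 5).reshape((2, 3, 5))
packed = flatten_with_size(data)
restored = restore(packed)
```

One caveat with this layout: averaging over samples leaves the stored size unchanged, but summing would not, so any aggregation would need to treat that column specially.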
Yeah, that would be another solution we can consider. But if we follow this, I feel like the functions _aggregate_stats and calc_stats_from_data would be hacked too much. What do you think, @neubig?
@pfliu-nlp Yes, it is definitely a hack, but it looks better than loosening a restriction that the methods presuppose.
Blocked by: https://github.com/neulab/ExplainaBoard/pull/526
Building on the evaluation metrics added in PR 526, this PR aims to introduce the task processor.
Notably, the shape check in the function aggregate_stats() is still too strict: https://github.com/neulab/ExplainaBoard/blob/b31d5d6506bdb6fb633b836ef798f25488f4052d/explainaboard/metrics/metric.py#L406 I relax it further in this PR. We can discuss this more.