neulab / ExplainaBoard

Interpretable Evaluation for AI Systems
MIT License
361 stars 36 forks source link

list[(name, Thing)] to dict[name, Thing], and remove MetricConfig and directly store configs in Metric #491

Open odashi opened 2 years ago

odashi commented 2 years ago

EDIT: This issue is currently represents two separate topics:

Original proposal:

Metric is instantiated by appropriate MetricConfig. This strategy is used only in Metric and any other classes are instantiated by directly calling __init__ with specific arguments. This special architecture around Metric actually causes some confusion (e.g., #455). This separation also requires explicit circular dependency: Metric holds MetricConfig, while MetricConfig knows corresponding Metric (in to_metric). This is somewhat annoying because it always lets us to refactor both concepts at the same time.

As long as I checked the all subclasses of Metric, there is no heavy implementation: they simply hold configs without any special initialization processes. It looks we have no any reason to separate MetricConfig from Metric at interface level. Integrating them into the Metric could simplify the architecture.

RFC: @neubig @pfliu-nlp @tetsuok

odashi commented 2 years ago

MetricConfig is stored in MetricResult too, but it seems this information is not necessarily required. There is also Performance that has almost the same information with MetricResult, and the only difference is that it stores metric's name instead. I think using Performance rather than MetricResult is sufficient. We can pass the original Metric with the corresponding Performance if the both should be referred.

odashi commented 2 years ago

Potential change would be:

I'm still waiting for comments @neubig @pfliu-nlp @tetsuok

tetsuok commented 2 years ago

@odashi I agree with the proposal and the potential changes. Removing the circular dependency seems a good idea to make things simpler.

pfliu-nlp commented 2 years ago

@odashi

"Remove MetricConfig, store every information necessary to perform metrics to Metric."

In general, I think it looks good. I think the reason why we have two separate classes originally (Metric and MetricConfig) is to handle the following case (ref: https://github.com/neulab/ExplainaBoard/blob/cc2d315695cec30e21d369362d1c76ca2ce9d40c/explainaboard/processors/kg_link_tail_prediction.py#L134)

            HitsConfig(name='Hits1', hits_k=1),
            HitsConfig(name='Hits2', hits_k=2),
            HitsConfig(name='Hits3', hits_k=3),
            HitsConfig(name='Hits5', hits_k=5),
            HitsConfig(name='Hits10', hits_k=10),

When we merge MetricConfig into Metric, I think we will have two names for each Metric class: metric_name and metric_class_name.

(Again, I think this refactoring will potentially affect explainaboard web and I will follow PRs on this discussion.)

odashi commented 2 years ago

When we merge MetricConfig into Metric, I think we will have two names for each Metric class: metric_name and metric_class_name.

It looks natural to me, though it also looks the name field needs to be separated from Metric because it behaves a key of the objects. Currently, metric configss are stored as list[MetricConfig] with internal key name, but it is more natural to reform it to dict[str, Metric] with explicit dictinary key so that metric_name in other classes point to the key of this dict, not the member of Metric.

pfliu-nlp commented 2 years ago

Hi, @odashi

"Currently, metric configss are stored as list[MetricConfig] with internal key name, but it is more natural to reform it to dict[str, Metric]"

Yes, this is another good point: do we need to shift the way of storing MetricConfig (or Metric if we have merged them) from List[Metric] to dict[str, Metric]? In the explainaboard web, sometimes we need to manually get dict[str, MetricConfig] from List[MetricConfig] for some convenience.

A similar thing happens in the AnalysisLevels.

odashi commented 2 years ago

@pfliu-nlp I think we can first implement dict[str, MetricConfig] and then come back to this issue. (Generally I recommend dict[key, something] over list[(key, something)])

odashi commented 2 years ago

I found that EaaSMetric has some abuse of the name field: it is used to specify external metric. https://github.com/neulab/ExplainaBoard/blob/215785c7f8ad938ae74c47b398ca82419ecc12a5/explainaboard/metrics/eaas.py#L107 EaaSMetricConfig should have a field to specify this value (it may have the same name name)

neubig commented 2 years ago

Sorry about the late comment on this, but I agree with getting rid of MetricConfig. Sounds good!