google-research / tapas

End-to-end neural table-text understanding models.
Apache License 2.0

What's the meaning of num_aggregation_labels? Why is it 0 for TabFact? #87

Closed by ghost 3 years ago

ghost commented 4 years ago

(Sent by mail, paraphrasing here so folks can later find the answer)

The main TAPAS paper describes 3 aggregation functions. Why is num_aggregation_labels 0 for TabFact? Does this mean no aggregation will be performed in the TabFact model?

ghost commented 4 years ago

num_aggregation_labels should be either 0 (no aggregation) or 3 (uses SUM, COUNT, AVERAGE).

Currently we use this only in QA tasks, where it is just another output of the model. For weakly supervised QA learning we have a special loss (as discussed in the paper) that computes an estimated float answer from the aggregation probabilities. New aggregations could be added by extending the implementation of this loss.
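To make the idea of an "estimated float answer" concrete, here is a minimal plain-Python sketch of how such an estimate could be formed from per-cell selection probabilities and the three aggregation probabilities, then compared with the gold answer using a Huber loss. This is not the repository's implementation: the function names `expected_answer` and `huber_loss`, the dictionary keys, and the `delta` default are all assumptions made for illustration.

```python
def expected_answer(cell_probs, cell_values, agg_probs):
    """Soft estimate of the answer (hypothetical helper, not the TAPAS API).

    cell_probs:  probability that each cell is selected.
    cell_values: numeric value of each cell.
    agg_probs:   dict with keys "SUM", "COUNT", "AVERAGE" that sum to 1.
    """
    # Soft versions of the three aggregations, weighted by selection probability.
    soft_count = sum(cell_probs)
    soft_sum = sum(p * v for p, v in zip(cell_probs, cell_values))
    soft_avg = soft_sum / soft_count if soft_count > 0 else 0.0

    # Expected result under the predicted aggregation distribution.
    return (
        agg_probs["SUM"] * soft_sum
        + agg_probs["COUNT"] * soft_count
        + agg_probs["AVERAGE"] * soft_avg
    )


def huber_loss(prediction, target, delta=1.0):
    """Huber loss between the estimated and the gold float answer."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error * error
    return delta * (error - 0.5 * delta)


# Example: two cells confidently selected, one ignored.
probs = [1.0, 1.0, 0.0]
values = [2.0, 4.0, 10.0]
estimate = expected_answer(probs, values, {"SUM": 1.0, "COUNT": 0.0, "AVERAGE": 0.0})
loss = huber_loss(estimate, 6.0)
```

Under these assumptions, supporting a new aggregation would mean adding one more soft operator and one more entry in `agg_probs`, which matches the remark that the loss implementation is where new aggregations would go.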

For TabFact, we also tried to use the aggregations internally so that the model could more easily judge statements that require aggregation (e.g., "the number of players with more than 3 wins is 5."), but we never got this to work much better than using no aggregation at all. Therefore we didn't discuss it in the final paper.