google-research / tapas

End-to-end neural table-text understanding models.
Apache License 2.0

Additional operations for Weakly Supervised Training #118

Closed AhmedMasryKU closed 3 years ago

AhmedMasryKU commented 3 years ago

As far as I know, TAPAS currently supports three aggregation operations (SUM, AVERAGE, COUNT). My dataset has some questions that require an additional operation (e.g. "What is the difference between A and B?"). With strong supervision this isn't an issue, since we have labels for both the cells and the aggregation operation.

However, with weak supervision, the current TAPAS loss function doesn't support a difference operator. Do you have any suggestions for modifying the weakly supervised training so that it can learn such questions?

eisenjulian commented 3 years ago

Hi @AhmedMasryKU, I think there's more than one way to implement a weakly supervised difference operator. One way would require two cell selection heads instead of the single one we have at the moment. With that, you can use a softmax to get one distribution over cells $p_A$ and another distribution over cells $p_B$ (the distributions could be restricted to cells with numeric values), and then compute the difference between the expected cell values of the two heads.

The expected difference would be $\sum_{c \in C} \left( p_A(c) - p_B(c) \right) v(c)$, where $C$ is the set of all cells and $v(c)$ is the numeric value of cell $c$. Keep us posted if you implement this or any other idea, and let us know how it works.
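To make the formula concrete, here is a minimal sketch in plain Python rather than the TAPAS TensorFlow code. The function names (`masked_softmax`, `expected_difference`, `huber_loss`) and the use of `None` to mark non-numeric cells are assumptions for illustration only and are not part of the TAPAS codebase. The Huber loss is just one possible regression loss for comparing the expected difference with the weakly supervised answer.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over entries where mask is True; masked entries get probability 0."""
    kept = [l for l, m in zip(logits, mask) if m]
    if not kept:
        raise ValueError("at least one cell must be numeric")
    top = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def expected_difference(logits_a, logits_b, values):
    """Compute sum_c (p_A(c) - p_B(c)) * v(c) over the numeric cells.

    logits_a, logits_b: per-cell scores from the two (hypothetical) selection heads.
    values: numeric value of each cell, or None for non-numeric cells.
    """
    mask = [v is not None for v in values]
    p_a = masked_softmax(logits_a, mask)
    p_b = masked_softmax(logits_b, mask)
    return sum(
        (pa - pb) * v for pa, pb, v in zip(p_a, p_b, values) if v is not None
    )


def huber_loss(prediction, target, delta=1.0):
    """Huber loss between the expected difference and the gold answer."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error * error
    return delta * (error - 0.5 * delta)


# Head A prefers the first cell (10.0), head B the third (4.0);
# the middle cell is non-numeric and is excluded from both distributions.
values = [10.0, None, 4.0]
diff = expected_difference([50.0, 0.0, -50.0], [-50.0, 0.0, 50.0], values)
loss = huber_loss(diff, target=6.0)
```

In a real model both heads would be trained through this loss, so the softmax and the sum would have to be written with differentiable tensor operations instead of Python lists.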

AhmedMasryKU commented 3 years ago

Hi, I actually tried this idea before, but I think I didn't restrict the distribution to the numeric cells, which was probably hurting performance. I will try it again and see if it works. Thanks.

mohandas1 commented 3 years ago

Hi @AhmedMasryKU, can you please tell me how to get the sum of a column in the table? I have used both WTQ and WIKISQL, but I am not getting the sum. This is the output I get when I run the query "what is the sum of Runs?": SUM > 18426, 14234, 13704, 13430, 12650, 11867, 11739, 11579, 11363, 10889

eisenjulian commented 3 years ago

Closing for the time being; feel free to reopen or update if needed.