TAPAS unable to use weak supervision labels to finetune

Hi,

The TAPAS model used for WTQ is a model that learns to predict an operation (aggregation_function) and to select cells over which to apply the operation:

Example:
1- The model predicts the SUM aggregation_function and the cell coordinates (0,1) (0,2). The output: ~SUM((0,1) (0,2))
2- The model predicts the NONE operation and the coordinates (0,1). The output: the text from cell (0,1)
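
To make the two cases concrete, here is a minimal Python sketch of how a predicted operation and a predicted cell selection combine into an answer. It is not the TAPAS code: `apply_aggregation` and the toy table are hypothetical, and I am assuming coordinates are (row index, column index) over the data rows, header excluded.

```python
from statistics import mean


def apply_aggregation(table, coordinates, aggregation="NONE"):
    """Combine the selected cells according to the predicted operation.

    table: list of data rows (header excluded), each a list of cell strings.
    coordinates: list of (row_index, column_index) pairs.
    Hypothetical helper for illustration only.
    """
    cells = [table[row][col] for row, col in coordinates]
    if aggregation == "NONE":
        # No aggregation: the answer is the text of the selected cell(s).
        return ", ".join(cells)
    if aggregation == "COUNT":
        return float(len(cells))
    values = [float(cell) for cell in cells]
    if aggregation == "SUM":
        return sum(values)
    if aggregation == "AVERAGE":
        return mean(values)
    raise ValueError(f"Unknown aggregation: {aggregation}")


table = [["Alice", "3", "4"],
         ["Bob", "5", "6"]]

sum_answer = apply_aggregation(table, [(0, 1), (0, 2)], "SUM")
none_answer = apply_aggregation(table, [(0, 1)], "NONE")
```

With this toy table, the first call corresponds to case 1 above and the second to case 2.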

In case of aggregation, the model expects the list of coordinates (row index, column index) of all the cells to aggregate. You need to fill in the interaction proto fields related to aggregation:

1- repeated AnswerCoordinate answer_coordinates
   message AnswerCoordinate { optional int32 row_index = 1; optional int32 column_index = 2; }
   (In the tf example, this is represented by the feature called "label_ids".)
2- optional float float_value: contains the result of the aggregation
3- optional AggregationFunction aggregation_function: contains the aggregation type: NONE, SUM, AVERAGE or COUNT

Make sure that all 3 fields are filled; otherwise the model won't learn the aggregation loss. In that case the model will try to find an answer text from the table. (The default aggregation is NONE and the default float_value is 0.)
If no coordinates are provided, label_ids contains only 0 values (the default values). An all-zero label_ids creates a mask, and no aggregation loss is computed.
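
A quick way to catch this before training is to check each example for the three fields. The sketch below uses a plain Python dataclass that only mirrors the proto fields named above; `AnswerSketch` and `missing_aggregation_fields` are hypothetical names, not part of the TAPAS code. It also uses `None` for an unset float_value, whereas the proto default is 0.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AnswerSketch:
    """Plain-Python stand-in for the aggregation-related proto fields."""
    answer_coordinates: List[Tuple[int, int]] = field(default_factory=list)
    float_value: Optional[float] = None  # assumption: None means "not set"
    aggregation_function: str = "NONE"   # NONE, SUM, AVERAGE or COUNT


def missing_aggregation_fields(answer):
    """Return the names of the fields an aggregation example still lacks."""
    missing = []
    if not answer.answer_coordinates:
        # Without coordinates, label_ids stays all zeros.
        missing.append("answer_coordinates")
    if answer.float_value is None:
        missing.append("float_value")
    if answer.aggregation_function == "NONE":
        missing.append("aggregation_function")
    return missing


scalar_only = AnswerSketch(float_value=7.0)
complete = AnswerSketch(
    answer_coordinates=[(0, 1), (0, 2)],
    float_value=7.0,
    aggregation_function="SUM",
)
```

An example that carries only the scalar answer, like `scalar_only`, is the situation described in the question: two of the three fields are still at their defaults.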

Question/comment: "currently, over 72% of the samples do not have any predicted answer coordinates":

I suspect that your model is learning to select an answer from the table rather than to apply an aggregation. If neither an answer text from a table cell nor a list of coordinates is provided, the model will learn to predict no answer and an empty list of coordinates.

Question/comment: "I have tried passing the labels tensor to be all zeros as try but that makes the model learn to not select any column":

That's the expected behavior. One method you can try is to implement a heuristic that finds candidate answers (cell coordinates and an aggregation) to pass to the model. The model will then be limited by the errors of the heuristic, but at least it will learn an aggregation loss.
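
As a rough illustration of such a heuristic (a sketch under my own assumptions, not something that exists in the TAPAS repo): for each column, try small subsets of the numeric cells and keep the ones whose SUM or AVERAGE matches the scalar answer. `find_candidates`, `parse_number`, `max_cells` and `tol` are hypothetical names. COUNT is left out because almost any selection of the right size would match.

```python
from itertools import combinations
import math


def parse_number(text):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None


def find_candidates(table, target, max_cells=4, tol=1e-6):
    """Search each column for cells whose SUM or AVERAGE equals `target`.

    table: list of data rows (header excluded), each a list of cell strings.
    Returns a list of (aggregation_function, coordinates) candidates.
    """
    candidates = []
    n_cols = len(table[0]) if table else 0
    for col in range(n_cols):
        numeric = []
        for row in range(len(table)):
            value = parse_number(table[row][col])
            if value is not None:
                numeric.append((row, value))
        for size in range(1, min(max_cells, len(numeric)) + 1):
            for subset in combinations(numeric, size):
                coords = [(row, col) for row, _ in subset]
                total = sum(value for _, value in subset)
                if size == 1:
                    # A single matching cell is a plain cell selection.
                    if math.isclose(total, target, abs_tol=tol):
                        candidates.append(("NONE", coords))
                    continue
                if math.isclose(total, target, abs_tol=tol):
                    candidates.append(("SUM", coords))
                if math.isclose(total / size, target, abs_tol=tol):
                    candidates.append(("AVERAGE", coords))
    return candidates


table = [["a", "1", "10"],
         ["b", "2", "20"],
         ["c", "3", "30"]]
candidates = find_candidates(table, target=6.0, max_cells=3)
```

The search is exponential in `max_cells`, so it only makes sense for small subsets, and it can return several candidates or spurious matches. That is the heuristic error the model would inherit.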

Thanks, Syrine

On Thu, Sep 30, 2021 at 6:05 AM shabbir @.***> wrote:

I am trying to fine-tune the pretrained TAPAS WTQ model on a custom dataset. I have used both the Hugging Face PyTorch code and the TensorFlow code available on GitHub. The majority of samples in my dataset involve arithmetic operations, so they rely on a scalar answer as supervision. The details of the problems faced with both codebases are described below:

1. TensorFlow

The model trains and saves intermediate checkpoints, and I used different checkpoints to run inference on the test data. As training progresses, the predicted coordinates come out as an empty list for more and more samples. Currently, over 72% of the samples do not have any predicted answer coordinates, while in the zero-shot setting only 7% were empty. So the conclusion is that the TAPAS model is not able to learn from the weak supervision signals of the dataset in use.

2. PyTorch

As we do not have the answer coordinates available, the coordinates are predicted by the provided utility, which computes a cost matrix.

However, the utility returns 'None' as the result when it does not find a matching candidate from the table (which is the case whenever the answer is the result of an aggregation operation over cell values)

Now, if we pass 'None' as the answer coordinate to the TAPAS Tokenizer, we don't get the labels tensor as the result of that.

When that tokenization output is passed to the TAPAS model, it does not compute a loss; it just returns the predicted answer coordinates and the predicted aggregation operator (as it does during inference).

I have tried passing an all-zeros labels tensor as a trial, but that makes the model learn to not select any column.


google-research / tapas

TAPAS unable to use weak supervision labels to finetune #141