google-research / tapas

End-to-end neural table-text understanding models.
Apache License 2.0

Populate float_answer for Tapas Weak supervision for aggregation (WTQ). TypeError: Parameter to CopyFrom() must be instance of same class: expected language.tapas.Question got str. #162

Open ayazhankadessova opened 2 years ago

ayazhankadessova commented 2 years ago

I am trying to fine-tune Tapas with weak supervision for aggregation (WTQ), following the instructions here: https://huggingface.co/transformers/v4.3.0/model_doc/tapas.html#usage-fine-tuning . I am using the dataset from https://www.microsoft.com/en-us/download/details.aspx?id=54253 , which follows the required SQA format: TSV files with most of the named columns. However, it has no float_answer column. As the documentation says:

float_answer: the float answer to the question, if there is one (np.nan if there isn’t). Only required in case of weak supervision for aggregation (such as WTQ and WikiSQL)

Since I am using WTQ, I need the float_answer column. I tried populating float_answer from answer_text as suggested here, using the parse_question(table, question, mode) function from https://github.com/google-research/tapas/blob/master/tapas/utils/interaction_utils_parser.py . However, I am getting errors.

I copied everything from here and passed these arguments: Screenshot 2022-06-15 at 2 07 13 PM.

But, I get this error: TypeError: Parameter to CopyFrom() must be instance of same class: expected language.tapas.Question got str.

Screenshot 2022-06-15 at 2 08 07 PM

1) Could you please help me understand which arguments I should use, or how else I can populate float_answer?

I am using table_csv and a question whose answer is in the given table:

Screenshot 2022-06-15 at 2 10 21 PM

2) We also tried simply adding a float_answer column with every value set to np.nan. That crashed, too.

encoding["float_answer"] = torch.tensor(float("nan"))
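For reference, float_answer can also be filled in without the TAPAS parser. The sketch below is a minimal illustration, not the logic of parse_question: it assumes a row holds its answers as a list of strings under answer_text, and it only recognizes plain numbers such as "42", "3.5" or "1,234". The helper name to_float_answer and the sample rows are hypothetical. (Judging by the error message above, parse_question appears to expect a language.tapas.Question message rather than a plain string, which this sketch sidesteps.)

```python
import math
import re

# Plain decimal numbers only; this is an assumption of the sketch,
# not the number parsing that the TAPAS utilities perform.
_NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def to_float_answer(answer_texts):
    """Return a float for a single numeric answer, otherwise NaN."""
    if len(answer_texts) != 1:
        return math.nan  # zero or several answers: no single float answer
    text = answer_texts[0].strip().replace(",", "")  # drop thousands separators
    if not _NUMBER.fullmatch(text):
        return math.nan  # non-numeric answer such as a name or place
    return float(text)


# Hypothetical rows in the spirit of the SQA-style TSV files.
rows = [
    {"question": "How many gold medals?", "answer_text": ["1,234"]},
    {"question": "Which city hosted?", "answer_text": ["Paris"]},
    {"question": "Which years?", "answer_text": ["1990", "1994"]},
]
for row in rows:
    row["float_answer"] = to_float_answer(row["answer_text"])
```

Each row then carries either a parsed number or NaN, and that value could be wrapped in torch.tensor(...) in place of the constant NaN used above.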

Is there a tutorial for WTQ fine-tuning? Thanks!

Deepazzz commented 1 year ago

Hi, did you find a solution for this? I am facing the same issue.

mohammedshaneeb-ai commented 1 year ago

Hi @ayazhankadessova, how did you handle the float_answer column when there is no float answer (for string answers such as a name or place)? And how did you handle the answer_coordinates column when there is no answer (for aggregation answers such as sum or average)?