I am trying to fine-tune the TAPAS WTQ model on a table of 990 rows and 18 columns (the 'Nobel Laureates, 1901-Present' dataset: https://www.kaggle.com/datasets/nobelfoundation/nobel-laureates). I am running the notebook on Kaggle with a maximum of 30 GB of RAM. However, I am running into errors while encoding with TapasTokenizer.
With truncation=False, the error is "ValueError: Too many rows"; with truncation=True, it is "ValueError: Couldn't find all answers".
The encoding code is shown below:
encoding = tokenizer(table=table, queries=item.question,
                     answer_coordinates=item.answer_coordinates, answer_text=item.answer_text,
                     truncation=True, padding="max_length", return_tensors="pt")
encoding.keys()
Can anyone let me know the maximum table size, i.e., the maximum number of rows and columns the model can handle?
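From what I understand, TAPAS flattens the question together with the whole table into a single sequence capped at 512 tokens, and truncation=True drops rows to make it fit, which presumably fails when an answer cell sits in a dropped row. As a workaround I am experimenting with pre-filtering the table per question before encoding. The sketch below is only illustrative: the helper name encode_with_reduced_table and the max_rows=64 value are my own, not from the tutorial, and it assumes item.answer_coordinates is a list of (row, col) tuples as in the TAPAS fine-tuning tutorial.

import pandas as pd
from transformers import TapasTokenizer

# WTQ checkpoint of TAPAS
tokenizer = TapasTokenizer.from_pretrained("google/tapas-base-finetuned-wtq")

def encode_with_reduced_table(table: pd.DataFrame, item, max_rows: int = 64):
    """Illustrative sketch: keep every row referenced by the answer coordinates
    plus a slice of the remaining rows, remap the coordinates to the reduced
    table, then encode. max_rows is an arbitrary budget, not a model constant."""
    answer_rows = sorted({row for row, _ in item.answer_coordinates})
    other_rows = [r for r in range(len(table)) if r not in answer_rows]
    kept = sorted(answer_rows + other_rows[: max(0, max_rows - len(answer_rows))])

    # TapasTokenizer expects a DataFrame of strings.
    small_table = table.iloc[kept].reset_index(drop=True).astype(str)

    # Remap the answer coordinates to the new row positions.
    row_map = {old: new for new, old in enumerate(kept)}
    new_coordinates = [(row_map[r], c) for r, c in item.answer_coordinates]

    return tokenizer(
        table=small_table,
        queries=item.question,
        answer_coordinates=new_coordinates,
        answer_text=item.answer_text,
        truncation=True,
        padding="max_length",
        return_tensors="pt",
    )

Is something like this the intended way to handle large tables, or is there a documented row/column limit I should be targeting instead?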