Open joshplasse opened 3 years ago
Hi @joshplasse, can you share some information about your dataset? Is it public? How big is it? By altering the number of rows and the maximum length solely for fine-tuning, some of the embeddings (positional and row_index) will have to be trained from random initialization, which potentially contributes to the problems you are seeing. I recommend considering the following options:
- Set reset_position_index_per_cell, since re-starting the positional embeddings at every cell helps with generalizing to longer sequences (a minimal sketch of this setting follows below).

Finally, we have some other work on the way on the problem of larger tables, so we would love to hear if there are public datasets where this problem is evident. One such paper and code release should be coming up in the next few weeks.
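For illustration, here is a minimal sketch of that setting using the HuggingFace transformers port of TAPAS; the checkpoint name and API below are assumptions for the sketch, and the google-research/tapas repo exposes the same idea through its own config/flags rather than this code.

```python
# Minimal sketch (not from this thread): loading TAPAS with per-cell position
# index resetting via the HuggingFace `transformers` port. The checkpoint name
# is an assumption; the original google-research/tapas configs expose an
# equivalent reset_position_index_per_cell option.
from transformers import TapasConfig, TapasForQuestionAnswering, TapasTokenizer

checkpoint = "google/tapas-base-finetuned-wikisql-supervised"

# reset_position_index_per_cell=True restarts the position index at the start
# of every table cell, so long tables stay within the learned position range
# and fewer embeddings have to be trained from random initialization.
config = TapasConfig.from_pretrained(checkpoint, reset_position_index_per_cell=True)
model = TapasForQuestionAnswering.from_pretrained(checkpoint, config=config)
tokenizer = TapasTokenizer.from_pretrained(checkpoint)
```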
Hi -- I am trying to fine-tune the wikisql base model to accurately make predictions for tables that have up to 150 rows, and have noticed that prediction accuracy goes down significantly on larger tables. I have updated the config file so that max_row_num=150, and have increased max_length=1024. I am able to successfully fine-tune the model using these updated parameters, and the changes allow TAPAS to predict rows later in the table (e.g., row idx > 64). However, the model rarely makes correct predictions for queries whose answer coordinates appear past row 64. Are there other config parameters that need to be considered when making predictions on larger tables?

Further, I am aware that a transformer's computational complexity is quadratic in the length of the tokenized sequence; however, using GPUs I am able to fine-tune the model with the increased sequence length (which allows all of the training tables to be tokenized without truncation). Are there any resources that discuss the degradation in accuracy when making predictions on larger tables?
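For concreteness, here is a rough sketch of the setup being described, written against the HuggingFace transformers port of TAPAS; the toy 150-row table, the checkpoint name, and the tokenizer arguments are illustrative assumptions, not the exact config-file edits mentioned above.

```python
# Illustrative sketch of the setup in question: a 150-row table encoded at
# max_length=1024 with the HuggingFace port of TAPAS. The thread itself edits
# the google-research/tapas config file directly; this only mirrors the idea.
import pandas as pd
from transformers import TapasForQuestionAnswering, TapasTokenizer

checkpoint = "google/tapas-base-finetuned-wikisql-supervised"  # assumed checkpoint
tokenizer = TapasTokenizer.from_pretrained(checkpoint)
model = TapasForQuestionAnswering.from_pretrained(checkpoint)

# Toy table with 150 rows, so answer coordinates can fall past row index 64.
table = pd.DataFrame(
    {
        "row_id": [str(i) for i in range(150)],
        "value": [str(i * 10) for i in range(150)],
    }
)
queries = ["What is the value in row 120?"]

# max_length=1024 keeps more of the table in the sequence; rows that still do
# not fit are removed by the 'drop_rows_to_fit' truncation strategy.
inputs = tokenizer(
    table=table,
    queries=queries,
    max_length=1024,
    truncation="drop_rows_to_fit",
    return_tensors="pt",
)
outputs = model(**inputs)

# Recover the predicted answer coordinates (row, column) for inspection.
predicted_coordinates, _ = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
```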