google-research / tapas

End-to-end neural table-text understanding models.
Apache License 2.0

What is `bert/encoder` and `bert_1/encoder` for retriever model? #152

Open xhluca opened 2 years ago

xhluca commented 2 years ago

I need to convert the models to PyTorch, but I can't figure out whether `bert/encoder` corresponds to the question encoder and `bert_1/encoder` to the table encoder, or the other way around.

The variable names can be found by reading the checkpoint file directly:

from tensorflow.python.training import py_checkpoint_reader

# Pass the checkpoint prefix "model.ckpt", not "model.ckpt.index", in TF v2
file_name = "./tapas_nq_hn_retriever_medium/model.ckpt"
reader = py_checkpoint_reader.NewCheckpointReader(file_name)

# Load dictionaries mapping variable name -> shape and variable name -> dtype
var_to_shape_map = reader.get_variable_to_shape_map()
var_to_dtype_map = reader.get_variable_to_dtype_map()

for name in var_to_dtype_map:
    print(name)

Which gives you the following:

text_projection
table_projection/adam_v
bert_1/pooler/dense/kernel/adam_v
bert_1/pooler/dense/kernel/adam_m
bert_1/pooler/dense/kernel
...
bert/encoder/layer_6/attention/self/key/bias/adam_m
bert/encoder/layer_6/attention/output/dense/bias
bert_1/encoder/layer_2/attention/self/value/kernel
...
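For the conversion itself, `reader.get_tensor` returns each variable as a NumPy array, so the weights can be pulled into torch tensors directly; a minimal sketch (skipping the `adam_m`/`adam_v` optimizer slots), continuing from the reader above:

import torch

# Extract raw weights as NumPy arrays and wrap them in torch tensors.
# The adam_m/adam_v entries are Adam optimizer state and can be skipped
# for inference; mapping each name onto a PyTorch module is the open
# question in this issue.
weights = {}
for name in var_to_shape_map:
    if name.endswith("adam_m") or name.endswith("adam_v"):
        continue
    weights[name] = torch.from_numpy(reader.get_tensor(name))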

It is not clear from reading the code which one is the question encoder, which makes it difficult to use the model on a different dataset.
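One cheap heuristic that might disambiguate the two scopes: if only one of the two towers uses TAPAS's extra token-type embeddings (row/column/rank ids) while the other is a plain text encoder, the embedding shapes under `bert/` and `bert_1/` will differ. This is an assumption about the dual-encoder layout, but printing the shapes costs nothing:

# Hedged heuristic: compare the embedding shapes under bert/ and bert_1/.
# If one scope carries extra or larger token-type embedding tables, that
# scope is likely the table encoder.
for name, shape in sorted(var_to_shape_map.items()):
    if "embeddings" in name and "adam" not in name:
        print(name, shape)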

xhluca commented 2 years ago

@eisenjulian

SomewhereInTheVastUniverse commented 5 months ago

@xhluca Hi, I'm trying to convert the TensorFlow retriever model to a PyTorch model for testing. I wonder if this work has been completed, and if so, could you share the converted model with me? Thank you.

xhluca commented 5 months ago

Sorry, I'm not sure; I'd recommend taking a look at the Hugging Face repo.
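For reference, the transformers library exposes TAPAS as `TapasModel`/`TapasTokenizer`; a minimal load, using the standard (non-retriever) `google/tapas-base` checkpoint as a stand-in, since the retriever weights may be published under a different identifier:

from transformers import TapasModel, TapasTokenizer

# Load a TAPAS encoder from the Hugging Face hub. "google/tapas-base" is
# a standard checkpoint used here only as an example; the retriever
# checkpoints, if published, may use different names.
tokenizer = TapasTokenizer.from_pretrained("google/tapas-base")
model = TapasModel.from_pretrained("google/tapas-base")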

SomewhereInTheVastUniverse commented 5 months ago

@xhluca I found it on Hugging Face, thank you. Do you know how to run inference and train the model? There is no documentation for it.

xhluca commented 5 months ago

I am not aware of one. I just fine-tune it using in-batch negative sampling, following papers like DPR (Hugging Face has a lot of examples of how to do that); there may be ways to fine-tune TAPAS specifically, but I am not aware of them.
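A minimal sketch of that DPR-style in-batch negative objective, assuming hypothetical `q_emb` and `t_emb` batches of question and table embeddings produced by the two towers:

import torch
import torch.nn.functional as F

def in_batch_negative_loss(q_emb, t_emb):
    # Score every question against every table in the batch; the diagonal
    # holds the positive (question_i, table_i) pairs and the off-diagonal
    # entries serve as negatives.
    scores = q_emb @ t_emb.T  # [batch, batch]
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)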