ArvinZhuang / DSI-transformers

A huggingface transformers implementation of "Transformer Memory as a Differentiable Search Index"
MIT License
155 stars 14 forks

Datasets setting? #4

Open zalmanchen opened 1 year ago

zalmanchen commented 1 year ago

Why is the training dataset the same as the evaluation set, and how do you process the original validation dataset?

ArvinZhuang commented 1 year ago

Hi @zalmanchen,

The eval_dataset in the script is only there to check whether the model is overfitting the training data; it is purely for analysis purposes. Feel free to comment it out and leave eval_dataset unset for the trainer.

The code for processing the data is in data/NQ/create_NQ_train_vali.py
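To illustrate the point about eval_dataset, here is a minimal sketch of making the overfitting check optional. The helper name `build_trainer_kwargs`, the `check_overfitting` flag, and the toy data are assumptions for illustration only and are not part of this repo; only the `train_dataset` and `eval_dataset` keyword names come from the Hugging Face `Trainer`.

```python
def build_trainer_kwargs(train_dataset, check_overfitting=False):
    """Return the dataset keyword arguments for a Hugging Face Trainer.

    Hypothetical helper (not from the repo): eval_dataset is only included
    when we explicitly want to check overfitting on the training data.
    """
    kwargs = {"train_dataset": train_dataset}
    if check_overfitting:
        # Evaluate on the training data itself, purely for analysis.
        kwargs["eval_dataset"] = train_dataset
    return kwargs


# Toy stand-in for the real training dataset (assumed structure).
toy_train = [{"text": "some query", "docid": "0"}]

plain_kwargs = build_trainer_kwargs(toy_train)
debug_kwargs = build_trainer_kwargs(toy_train, check_overfitting=True)

print(sorted(plain_kwargs))
print(sorted(debug_kwargs))
```

The resulting dict could then be passed along as `Trainer(model=model, args=training_args, **plain_kwargs)`. Depending on the transformers version, the evaluation strategy in `TrainingArguments` may also need to be set to "no" when eval_dataset is left out.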