Open zbrnwpu opened 3 years ago
Which version of the training do you use?
In the latest version, 1 epoch is one iteration over the 500k train queries. On a V100, one epoch takes less than an hour.
In version 2, 1 epoch had nearly 400 million train triplets, so you trained only for some time and then stopped the process.
I am using version 3, and I found that the training data for version 3 is also triplets. I only changed the data loading code; the rest of the code is the same as yours. My training data has 2,456,171 triplets. Due to GPU memory limitations, the batch size can only be set to 10. @nreimers
With batch size 10, I'm not sure your model will be good. Larger batch size => better model.
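The reason batch size matters so much here is the in-batch-negatives objective (MultipleNegativesRankingLoss in sentence-transformers): every query in a batch of size B treats the other B-1 passages as negatives, so a larger batch gives a harder ranking task. Below is a minimal numpy sketch of that loss over toy embeddings; the function name and the toy random data are illustrative, not the library's actual implementation.

```python
import numpy as np

def in_batch_negatives_loss(q_emb, p_emb, scale=20.0):
    """Cross-entropy over in-batch similarities: for query i, passage i is
    the positive and the other B-1 passages in the batch act as negatives."""
    # Normalize to unit length so the dot products are cosine similarities.
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    p = p_emb / np.linalg.norm(p_emb, axis=1, keepdims=True)
    scores = scale * (q @ p.T)            # (B, B) scaled similarity matrix
    # Row-wise log-softmax; the correct "class" for row i is column i.
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
B, d = 10, 8                              # batch size 10, toy embedding dim
queries = rng.normal(size=(B, d))
passages = rng.normal(size=(B, d))
print(in_batch_negatives_loss(queries, passages))
```

With B = 10 each query only ever competes against 9 negatives per step, which is why the model trained this way tends to be weaker than one trained with a batch of 64 or more.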
> With batch size 10 - not sure if your model will be good. Larger batch size => better model.

Thank you very much. I also want to ask: what is the difference between your train_bi_encoder_V2 and train_bi_encoder_V3 scripts? Thank you for your reply.
v2 uses hard negatives provided by the task organizers, which were sourced using BM25.
v3 uses several different systems to mine passages that are close to a query. A cross-encoder then scores whether each mined passage is relevant to the query or not.
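The point of the cross-encoder pass is to weed out false negatives: a mined passage that the cross-encoder scores as relevant is probably a real answer and should not be used as a negative. A rough sketch of that filtering step, with a hypothetical `score_fn` standing in for a cross-encoder (this is not the exact v3 script, just the idea):

```python
def filter_hard_negatives(query, candidates, score_fn, max_neg_score=0.3):
    """Keep a mined passage as a hard negative only if the cross-encoder is
    confident it is NOT relevant to the query.

    score_fn(query, passage) is a stand-in for a cross-encoder returning a
    relevance score in [0, 1]; the 0.3 threshold is an illustrative value.
    Passages scoring above it are likely false negatives and are dropped.
    """
    return [p for p in candidates if score_fn(query, p) <= max_neg_score]

# Toy scorer: flags a passage as relevant if it shares a word with the query.
def toy_scorer(query, passage):
    return 1.0 if any(w in passage.split() for w in query.split()) else 0.0

negatives = filter_hard_negatives(
    "capital of france",
    ["paris is the capital of france", "bananas are yellow"],
    toy_scorer,
)
print(negatives)  # only the unrelated passage survives as a negative
```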
Hello @nreimers, I would like to ask how long it took you to train the model on the MS MARCO data. On a Tesla V100, one epoch takes me 40h. I am a beginner; thank you for your answers!