Open nirmal2k opened 2 years ago
Hi @nirmal2k, yes, you can set p_max_len to 512 and encode with it. castorini/unicoil-d2q-msmarco-passage was trained with p_max_len 192.
corpus-d2q contains the original msmarco-passage text tokens + [SEP] + the new tokens generated by doc2query.
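As a rough illustration of that layout (a minimal sketch, not the actual preparation script): the helper names `build_d2q_text` and `truncate_tokens` are hypothetical, the whitespace split only stands in for the real tokenizer, and simple truncation at p_max_len is an assumption about how over-long inputs are handled.

```python
from typing import List


def build_d2q_text(passage: str, predicted_queries: List[str], sep_token: str = "[SEP]") -> str:
    """Join the original passage and doc2query expansions as: passage [SEP] expansions.

    Hypothetical helper; the real corpus-d2q pipeline may differ in details.
    """
    # Drop empty predictions and normalize surrounding whitespace.
    expansion = " ".join(q.strip() for q in predicted_queries if q.strip())
    if not expansion:
        return passage.strip()
    return f"{passage.strip()} {sep_token} {expansion}"


def truncate_tokens(tokens: List[str], p_max_len: int) -> List[str]:
    """Keep at most p_max_len tokens (assumed truncation behaviour)."""
    return tokens[:p_max_len]


text = build_d2q_text(
    "The Manhattan Project was a research effort.",
    ["what was the manhattan project", "who ran the manhattan project"],
)

# Whitespace split is only a stand-in for the model's real tokenizer.
tokens = text.split()
for p_max_len in (192, 512):
    kept = truncate_tokens(tokens, p_max_len)
    print(p_max_len, len(kept))
```

Under these assumptions, a larger p_max_len only matters for passages whose text plus expansions exceed 192 tokens, since the expansions sit at the end and would be cut first.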
How is corpus-d2q prepared? What p_max_len was castorini/unicoil-d2q-msmarco-passage trained with? Can I set p_max_len to 512 and encode with it?