shamanez opened this issue 2 years ago
Did you mean updating custom document entries to the index for retrieval during fine-tuning?
Yes exactly.
Although the paper mentions that
While asynchronous refreshes can be used for both pre-training and fine-tuning, in our experiments we only use it for pre-training. For fine-tuning, we just build the MIPS index once (using the pre-trained θ) for simplicity and do not update Embed_doc. Note that we still fine-tune Embed_input, so the retrieval function is still updated from the query side.
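To make the quoted setup concrete, here is a minimal sketch of what "freeze Embed_doc, keep fine-tuning Embed_input" means for retrieval: the document embeddings are computed once and never refreshed, while the query vector keeps changing, so the top-k results can still shift between training steps. All names here (`doc_embeddings`, `mips_retrieve`, the toy "gradient update") are hypothetical and only illustrate the mechanism, not REALM's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen index: Embed_doc is computed once with the
# pre-trained encoder and is NOT refreshed during fine-tuning.
doc_embeddings = rng.normal(size=(1000, 64)).astype(np.float32)

def mips_retrieve(query_vec, index, k=5):
    """Maximum inner product search over the frozen document index."""
    scores = index @ query_vec          # inner products, shape (num_docs,)
    return np.argsort(-scores)[:k]      # indices of the k highest scores

# Only the query side (Embed_input) is fine-tuned, so the query vector
# changes over training; here a constant shift stands in for a gradient step.
query_before = rng.normal(size=64).astype(np.float32)
query_after = query_before + 0.5

top_before = mips_retrieve(query_before, doc_embeddings)
top_after = mips_retrieve(query_after, doc_embeddings)

# Even with a frozen index, the retrieved set can change because the
# query embedding moved.
print(top_before, top_after)
```

This is why the retrieval function is "still updated from the query side" even though no index refresh happens: the score of every document is a dot product with a trainable query vector.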
The TF implementation of REALM (i.e., the authors' experiments) freezes the evidence blocks and therefore doesn't update the index during fine-tuning, meaning that adding custom documents is not supported out of the box.
Therefore, if any custom documents are needed for retrieval, we have to either
Option 3 has been implemented; please check out the README and see if that matches your needs :-)
Similar to RAG in the Transformers library, can we use a custom knowledge base?