kuanhsieh closed this issue 7 months ago
Thanks for raising this! I am able to reproduce your bug. Let me think a little bit about how to make this clearer for other users.

Basically, what's happening is that here I check if `pretrained` is `None` and then try to load a `NomicBertModel` from the `model_name` path; however, the model is saved as a `BiEncoder` model.
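For context, here's a minimal sketch of the branch in question (illustrative pseudocode, not the actual contrastors source; the `from_pretrained` method names are assumptions):

```python
def load_model(config):
    # Illustrative sketch only -- class/method names are assumptions,
    # not the real contrastors API.
    if config.pretrained is not None:
        # The checkpoint holds the full BiEncoder wrapper, so load it as one.
        return BiEncoder.from_pretrained(config.pretrained)
    # Otherwise treat `model_name` as a bare trunk -- this is the path
    # that fails when `model_name` actually points at BiEncoder weights.
    return NomicBertModel.from_pretrained(config.model_name)
```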
The quick fix is to replace the following in your YAML:

```yaml
model_name: nomic-ai/nomic-bert-2048
pretrained: <path to checkpoint>
```
I'll think a little bit about how to make this cleaner. The reason I save the `BiEncoder` model vs. the underlying `trunk` object is that there are scenarios where there are learnable layers after the trunk in the `BiEncoder` model.
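To illustrate the point (a toy sketch, not the real class): the wrapper can own parameters of its own, so serializing only the trunk would lose them.

```python
import torch.nn as nn

class BiEncoder(nn.Module):
    """Toy sketch: a trunk plus (possibly) learnable layers after it."""

    def __init__(self, trunk, hidden_dim, proj_dim=None):
        super().__init__()
        self.trunk = trunk
        # If a projection head exists, its weights live outside the trunk,
        # so saving the trunk alone would silently drop them.
        self.proj = nn.Linear(hidden_dim, proj_dim) if proj_dim else nn.Identity()

    def forward(self, **inputs):
        return self.proj(self.trunk(**inputs))
```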
Let me know if you still face any issues with this!
Hi, thank you very much for this! I sort of understand your thinking now. What you suggested worked without any issues. Thank you for being so responsive and helpful!
Hi,
I followed the steps in the README.md and ran the suggested command to do contrastive pretraining. The only thing I changed was the `output_dir` variable in `configs/train/contrastive_pretrain.yaml` so that it would store the model on local disk; e.g., I set it to `output_dir: "nomic-embed-text-v1-unsupervised-1st-try"`. I also changed the data config `configs/data/contrastive_pretrain.yaml` so that I only used a subset of the data (to test it out).

However, when I then went to run the contrastive fine-tuning step, this time changing the `model_name` in `configs/train/contrastive_finetune.yaml` from `"nomic-ai/nomic-embed-text-v1-unsupervised"` to `"nomic-embed-text-v1-unsupervised-1st-try/final_model"` so that I could try my own contrastive pretrained model, and changing the data config `configs/data/finetune_triplets.yaml` so that I only used a subset of the data, I got an `_IncompatibleKeys` error.

I believe this comes from the `load_state_dict` function. The missing keys and unexpected keys (exactly 112 of each, I think) matched up one to one, except that the unexpected keys had an extra `trunk.` prefix, which I believe is what caused the error.
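In case it's useful, this is roughly how I inspected the result (a sketch; `model` and `checkpoint_state_dict` stand in for the objects the training script builds):

```python
# Sketch only: with strict=False, load_state_dict returns an
# _IncompatibleKeys named tuple instead of raising.
result = model.load_state_dict(checkpoint_state_dict, strict=False)
print(len(result.missing_keys), len(result.unexpected_keys))  # 112 and 112
# The unexpected keys looked like the missing keys with a "trunk."
# prefix prepended (placeholder example: "foo.bar" vs "trunk.foo.bar").
```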
I tried removing the `trunk.` prefix (like a simplified version of what the `remap_bert_state_dict` function does in `contrastors/models/encoder/bert.py`) and reran, but then I got a `Tensor` size mismatch error. I'm not sure why there would be such a mismatch. Could someone please advise?
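For reference, the prefix removal I tried looked roughly like this (a simplified sketch of the idea, not the repo's actual `remap_bert_state_dict`):

```python
from collections import OrderedDict

def strip_trunk_prefix(state_dict):
    # Drop a leading "trunk." from every key so the BiEncoder checkpoint
    # keys line up with the bare trunk's parameter names.
    return OrderedDict(
        (key[len("trunk."):] if key.startswith("trunk.") else key, value)
        for key, value in state_dict.items()
    )
```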
Many thanks.