jxmorris12 / vec2text

utilities for decoding deep representations (like sentence embeddings) back to text

Model Evaluation #66

Open victoriazinkovich opened 1 month ago

victoriazinkovich commented 1 month ago

Dear Jack, thank you for such wonderful work! I have a small question, for whenever you have time, about computing metrics for corrector models. The code below works well for inversion models:

from vec2text import analyze_utils

experiment, trainer = analyze_utils.load_experiment_and_trainer_from_pretrained(
    "/home/vec2text/saves/lisa-1/checkpoint-854900"
)
train_datasets = experiment._load_train_dataset_uncached(
    model=trainer.model,
    tokenizer=trainer.tokenizer,
    embedder_tokenizer=trainer.embedder_tokenizer
)

val_datasets = experiment._load_val_datasets_uncached(
    model=trainer.model,
    tokenizer=trainer.tokenizer,
    embedder_tokenizer=trainer.embedder_tokenizer
)
trainer.args.per_device_eval_batch_size = 16
trainer.sequence_beam_width = 1
trainer.num_gen_recursive_steps = 20
trainer.evaluate(
    eval_dataset=train_datasets["validation"]
)

I added the paths and arguments of my inversion and corrector models to CHECKPOINT_FOLDERS_DICT and ARGS_DICT, respectively. However, when I try to evaluate my custom corrector model, the code fails on the check assert hasattr(model, "call_embedding_model"), because the corrector model has no attribute call_embedding_model. This check runs because I want to use frozen embeddings and have set self.model_args.use_frozen_embeddings_as_input = True. When I set this flag to False, everything works. Is this a small bug, or am I doing something wrong?
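To make the failure easier to reason about, here is a minimal sketch of the situation described above. It does not use vec2text: ToyInversionModel, ToyCorrectorModel, the inversion_model attribute, and find_embedding_owner are hypothetical names invented for illustration, and the assumption that a corrector wraps an inversion model which owns the embedder is mine, not something confirmed by the library.

```python
class ToyInversionModel:
    """Hypothetical stand-in for an inversion model that owns the embedder."""

    def call_embedding_model(self, input_ids):
        # Toy "embedding": just the token count, to keep the sketch runnable.
        return [float(len(input_ids))]


class ToyCorrectorModel:
    """Hypothetical stand-in for a corrector that does not expose the embedder."""

    def __init__(self, inversion_model):
        # Assumed layout: the corrector holds the inversion model as an attribute.
        self.inversion_model = inversion_model


def find_embedding_owner(model, candidate_attrs=("inversion_model",)):
    """Return the object that defines call_embedding_model, or None."""
    if hasattr(model, "call_embedding_model"):
        return model
    for name in candidate_attrs:
        inner = getattr(model, name, None)
        if inner is not None and hasattr(inner, "call_embedding_model"):
            return inner
    return None


inversion = ToyInversionModel()
corrector = ToyCorrectorModel(inversion)

# Same kind of check as the failing assert: the wrapper itself lacks the method.
corrector_passes_check = hasattr(corrector, "call_embedding_model")

# Looking one level down finds the object that can actually embed inputs.
owner = find_embedding_owner(corrector)
print(corrector_passes_check, owner is inversion)
```

If the real corrector is laid out similarly, running the same hasattr check on the loaded trainer.model and on any model it wraps would show which object the frozen-embedding step should be given.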

Thanks in any case! Best Regards, Viktoriia Zinkovich.