MinishLab / model2vec

Distill a Small Static Model from any Sentence Transformer
https://minishlab.github.io/
MIT License
413 stars, 18 forks

HFValidationError using custom model #110

Closed tomsquest closed 3 weeks ago

tomsquest commented 3 weeks ago

Hi,

I am trying to use Model2Vec with a custom model of ours. Distillation seems to run fine until it calls validate_repo_id with my custom path (./model).

  1. I have a custom model that I can load successfully:
from transformers import AutoModelForSequenceClassification

model_save_path = "./model"
model = AutoModelForSequenceClassification.from_pretrained(model_save_path)
  2. I distill the custom model:
from model2vec.distill import distill

model_save_path = "./model"
m2v_model = distill(model_name=model_save_path, pca_dims=256)
  3. The following error is raised:
HFValidationError

    This cell raised an exception: HFValidationError('Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: './model'.')
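
To illustrate why the path is rejected, here is a minimal sketch that approximates the rules quoted in the error message using only the standard library. The function name `looks_like_valid_repo_id` and the regular expression are assumptions made for illustration, not the actual huggingface_hub implementation. The point is that './model' starts with '.', which the rules forbid, whereas a bare 'model' passes.

```python
import re

# Simplified approximation of the rules quoted in the error message;
# this is an illustrative assumption, NOT the real huggingface_hub code.
# \b forces each name to start and end with a word character, so a
# leading or trailing '-' or '.' is rejected.
_REPO_ID_PATTERN = re.compile(r"^(\b[\w\-.]+\b/)?\b[\w\-.]{1,96}\b$")


def looks_like_valid_repo_id(repo_id: str) -> bool:
    """Return True if repo_id satisfies the rules listed in the error message."""
    # '--' and '..' are forbidden anywhere in the id.
    if "--" in repo_id or ".." in repo_id:
        return False
    return _REPO_ID_PATTERN.match(repo_id) is not None


for candidate in ("./model", "model", "org/model", "my..model"):
    print(f"{candidate!r}: {looks_like_valid_repo_id(candidate)}")
```

Under these assumed rules, only the ids that start and end with an alphanumeric character (or '_') and contain at most one '/' are accepted.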

stephantul commented 3 weeks ago

Hi @tomsquest ,

Just using model as the path (without the leading ./) should do the trick. Could you let us know whether that works? If it doesn't, you can always load the model and tokenizer yourself and use distill_from_model:

from model2vec.distill import distill_from_model

# Some way to get your model and tokenizer.
model, tokenizer = load_model_and_tokenizer(...)

m2v_model = distill_from_model(model, tokenizer, pca_dims=256)

Stéphan
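
One possible way to fill in the loading step above is sketched below. It assumes the tokenizer files were saved in the same directory as the model and that loading the bare encoder with transformers' `AutoModel` is appropriate for the checkpoint; neither is confirmed in this thread. The helper name `distill_local_checkpoint` is hypothetical.

```python
def distill_local_checkpoint(model_dir: str, pca_dims: int = 256):
    """Load a local checkpoint and distill it (hypothetical helper)."""
    # Imported lazily so that defining this helper does not load the
    # heavy dependencies until it is actually called.
    from transformers import AutoModel, AutoTokenizer
    from model2vec.distill import distill_from_model

    # Assumption: model and tokenizer files live in the same directory.
    model = AutoModel.from_pretrained(model_dir)
    tokenizer = AutoTokenizer.from_pretrained(model_dir)

    # Same call as in the snippet above, with pca_dims passed through.
    return distill_from_model(model, tokenizer, pca_dims=pca_dims)
```

It would be called as `m2v_model = distill_local_checkpoint("./model")`. Because the model and tokenizer objects are passed in directly, no model name string is handed to distill in this variant.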

tomsquest commented 3 weeks ago

Hi @stephantul ,

Yes, it's working using just model (instead of ./model). That was my workaround. 👍
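
For reference, a small sketch of this workaround that strips the leading ./ programmatically instead of by hand. The helper name `as_plain_relative_path` is hypothetical, and whether this is enough for nested or absolute paths is not covered in this thread.

```python
import os.path


def as_plain_relative_path(path: str) -> str:
    """Collapse redundant "." components, e.g. "./model" becomes "model"."""
    return os.path.normpath(path)


model_save_path = as_plain_relative_path("./model")
print(model_save_path)

# The cleaned-up value would then be passed on as in the original report:
# m2v_model = distill(model_name=model_save_path, pca_dims=256)
```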

stephantul commented 3 weeks ago

Alright! Happy to be of service, I'm closing this for now. Let me know if you have other issues, and please reopen or make another issue if that's the case. Thanks for using model2vec! 🙏