Zasder3 / train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
MIT License
653 stars, 78 forks

custom tokenizer and text encoder #22

Open sinjohr opened 2 years ago

sinjohr commented 2 years ago

I want to use a custom tokenizer and text encoder, with the tokenizer trained using Hugging Face tokenizers.

After training the Hugging Face tokenizer, I got a JSON file containing the vocabulary.

However, I don't know how to pass this custom tokenizer to train_finetune.py.

Could you give some guidance on how to set up and use a custom tokenizer?

tonyhuang33 commented 2 years ago

I have the same problem. Please reply to me if you solve it. Thank you.
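
One possible first step toward the question above is to inspect the tokenizer JSON and prepare fixed-length ID sequences. This is a rough sketch, not the repository's method. It assumes the JSON follows the Hugging Face `tokenizers` layout, with the vocabulary stored as a token-to-id mapping under `model.vocab` (the layout used by BPE, WordPiece, and WordLevel models). The helper names `load_vocab` and `pad_or_truncate`, the `[PAD]` token, and the example vocabulary are all hypothetical.

```python
import json
import os
import tempfile


def load_vocab(tokenizer_json_path):
    """Read the token -> id mapping from a `tokenizers`-style JSON file."""
    with open(tokenizer_json_path, encoding="utf-8") as f:
        spec = json.load(f)
    # Assumption: the vocabulary is a dict under model.vocab.
    vocab = spec["model"]["vocab"]
    if not isinstance(vocab, dict):
        raise ValueError("expected model.vocab to be a token -> id mapping")
    return vocab


def pad_or_truncate(ids, context_length, pad_id):
    """Cut or right-pad a list of token ids to exactly context_length."""
    ids = list(ids)[:context_length]
    return ids + [pad_id] * (context_length - len(ids))


# Hypothetical stand-in for the JSON produced by tokenizer training.
example_spec = {
    "model": {
        "type": "WordLevel",
        "vocab": {"[PAD]": 0, "[UNK]": 1, "a": 2, "photo": 3},
        "unk_token": "[UNK]",
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tokenizer.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example_spec, f)
    vocab = load_vocab(path)

# The text encoder's embedding table needs at least this many rows.
vocab_size = len(vocab)
pad_id = vocab["[PAD]"]
row = pad_or_truncate([vocab["a"], vocab["photo"]], context_length=5, pad_id=pad_id)
print(vocab_size, row)  # prints: 4 [2, 3, 0, 0, 0]
```

How the tokenizer and text encoder are then passed into train_finetune.py depends on the repository's own data module and model wrapper, which this sketch does not cover.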