simran-khanuja opened 3 months ago

Hi, I get this error when preprocessing text with the mSigLIP model. Any idea what may be wrong? I didn't change anything in the demo Colab.
@simran-khanuja I'm not entirely sure, but I ran into the issue above with the latest released SigLIP (so400m patch16). In my case the tokenizer was different, and the vocab dimension should have been 256k (I fixed it during initialization).
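For what it's worth, here's a minimal sketch of the check I mean, assuming you've downloaded the mSigLIP SentencePiece model locally (the filename below is a placeholder, not the real one):

```python
import sentencepiece as spm

# Load the mSigLIP SentencePiece model (path is hypothetical; use the
# spiece model linked in this thread).
sp = spm.SentencePieceProcessor(model_file="msiglip_spiece.model")

# The multilingual tokenizer should report a ~256k vocabulary; if the model
# config says otherwise, the text embedding table will be mis-sized.
print(sp.vocab_size())  # expect 256_000

# Tokenize a sample to sanity-check the tokenizer itself.
print(sp.encode("a photo of a cat", out_type=int)[:10])
```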
Maybe try a different tokenizer (I confirmed with the Google folks that there seems to be a mistake in the config in my case; it might be different for you as well). The mSigLIP tokenizer's spiece model exists here, so swapping the tokenizer should work. Also, if you're okay with using PyTorch, mSigLIP is implemented in transformers, and you can use that for the time being here.
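If you go the PyTorch route, a minimal sketch with transformers looks like this (the checkpoint name is my assumption; pick the multilingual SigLIP checkpoint you actually need from the Hub):

```python
import torch
from transformers import AutoModel, AutoProcessor

ckpt = "google/siglip-base-patch16-256-multilingual"  # assumed checkpoint name
model = AutoModel.from_pretrained(ckpt)
processor = AutoProcessor.from_pretrained(ckpt)

# SigLIP models were trained with fixed-length padding, so pad to max_length.
texts = ["a photo of a cat", "ein Foto einer Katze"]
inputs = processor(text=texts, padding="max_length", return_tensors="pt")

with torch.no_grad():
    text_embeds = model.get_text_features(**inputs)
print(text_embeds.shape)  # (2, hidden_dim)
```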
Edit: the new notebook seems to be fixed.