Extract features from CLIP with dim_text and dim_image other than 512

lucidrains / x-clip

A concise but complete implementation of CLIP with various experimental improvements from recent papers

MIT License

682 stars, 46 forks

Closed: abhishekaich27 closed this issue 1 year ago

abhishekaich27 commented 2 years ago

Can we extract embeddings of a size other than 512 (say, dim_text = 256 and dim_image = 256) from a pre-trained CLIP?
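
To make the question concrete, here is a minimal sketch of one possible approach: keep the pre-trained model frozen and add a linear projection from 512 to 256 dimensions on top of its outputs. Everything in it is an assumption for illustration. The random tensors stand in for real CLIP features, and `to_text_256` / `to_image_256` are hypothetical layers that are not part of x-clip or of any pre-trained checkpoint.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-ins for features from a frozen, pre-trained CLIP (assumed 512-d).
# In practice these would come from the model's text and image encoders.
text_feats = torch.randn(4, 512)
image_feats = torch.randn(4, 512)

# Hypothetical projection heads (not part of any checkpoint).
# They are randomly initialized here, so they would need training.
to_text_256 = nn.Linear(512, 256, bias=False)
to_image_256 = nn.Linear(512, 256, bias=False)

# Project to 256-d and L2-normalize, as is usual before cosine similarity.
text_emb = F.normalize(to_text_256(text_feats), dim=-1)
image_emb = F.normalize(to_image_256(image_feats), dim=-1)

# Pairwise text-image cosine similarities in the smaller space.
sim = text_emb @ image_emb.t()

print(text_emb.shape, image_emb.shape, sim.shape)
```

Because the new layers are untrained, the 256-d embeddings produced this way would presumably not keep the text-image alignment of the original 512-d space without further fine-tuning.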