PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License
What key design choices let BLIP avoid the text input length limit that CLIP has? #221
Open
Yang-bug-star opened 1 month ago