Hi! I'm interested in examining the text-image relationship of the fine-tuned CLIP in stable-diffusion-2-1-unclip. However, it seems that the text encoder included in the checkpoint doesn't have the final projection layer. Are the weights for that layer available anywhere? Thanks!