Hi! @Seokju-Cho Thanks for your inspiring work!
You mentioned in your paper that the text encoder accepts input like "A photo of a {class}". I would like to train your model on my own dataset, which provides only fully supervised (pixel-wise) labels. How can I do that?
Thanks in advance.
Best,
Zhirui