Open RainyLayx opened 1 year ago
When I input the text prompt 'cat with glasses.', I want to detect the cat that is wearing glasses, but the model boxes the cat and the glasses as two separate objects. How can I deal with this?...
You can try this demo here and set a specific --token_span for "cat with glasses" to see if that gives you a better result.
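As a rough illustration of the suggestion above, the sketch below computes character spans for every word of a phrase so that the whole phrase can be treated as one target. It assumes the flag expects a JSON-style nested list of [start, end) character offsets into the prompt, with one inner list per phrase; that format is an assumption, so check the demo's help text before relying on it. The helper name phrase_char_spans is hypothetical and is not part of the demo.

```python
import json


def phrase_char_spans(caption: str, phrase: str) -> list[list[int]]:
    """Return [start, end) character offsets for each word of `phrase` in `caption`.

    Hypothetical helper for illustration; not part of any library.
    """
    start = caption.find(phrase)
    if start < 0:
        raise ValueError(f"phrase {phrase!r} not found in caption {caption!r}")

    spans = []
    offset = start
    for word in phrase.split():
        # Search from the end of the previous word so repeated words resolve in order.
        word_start = caption.index(word, offset)
        word_end = word_start + len(word)
        spans.append([word_start, word_end])
        offset = word_end
    return spans


caption = "cat with glasses."
spans = phrase_char_spans(caption, "cat with glasses")

# One outer entry per phrase of interest; here there is a single phrase.
# ASSUMPTION: the demo accepts this nested-list string as the flag value.
token_spans_arg = json.dumps([spans])
print(token_spans_arg)
```

Grouping all three words into one phrase asks the model to score boxes against the combined phrase rather than against "cat" and "glasses" separately, which is the behavior the question is after.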
I tried this, but nothing was detected...
Have you tried lowering the threshold to see what results you get?
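I tried both 0.9 and 0.1, and in both cases nothing was detected. I am wondering whether switching to a text backbone with stronger language understanding would solve this.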
The model uses the hidden-state feature of each text token. This differs from CLIP, which uses the hidden-state feature of the EOS token.
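To make the difference concrete, here is a toy sketch in plain Python. All vectors are invented for illustration and do not come from any real model; the variable names are hypothetical. It contrasts scoring an image region against every text token separately with scoring it against a single EOS vector that summarizes the whole sentence.

```python
def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Made-up 2-D hidden states, one per text token, plus a sentence-level EOS vector.
tokens = ["cat", "with", "glasses", "<eos>"]
hidden = [
    [1.0, 0.0],  # "cat"
    [0.1, 0.1],  # "with"
    [0.0, 1.0],  # "glasses"
    [0.5, 0.5],  # "<eos>", standing in for a whole-sentence summary
]

# Made-up features for two image regions.
cat_region = [0.9, 0.1]
glasses_region = [0.1, 0.9]


def per_token_scores(region: list[float]) -> list[float]:
    """Per-token matching: one score per word token (EOS excluded)."""
    return [dot(region, h) for h in hidden[:-1]]


def eos_score(region: list[float]) -> float:
    """EOS-style matching: one score against the sentence-level vector."""
    return dot(region, hidden[-1])


cat_scores = per_token_scores(cat_region)
glasses_scores = per_token_scores(glasses_region)

# Index of the word token each region matches best.
best_for_cat = tokens[cat_scores.index(max(cat_scores))]
best_for_glasses = tokens[glasses_scores.index(max(glasses_scores))]
print(best_for_cat, best_for_glasses)
print(eos_score(cat_region), eos_score(glasses_region))
```

In this toy setup each region latches onto its own word token, which mirrors the reported behavior of the cat and the glasses being boxed separately, while the EOS-style score cannot tell the two regions apart at the word level.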