showlab / VisorGPT

[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT
MIT License

What does the generated_sentence.txt generated after training represent? #3

Open Crd1140234468 opened 10 months ago

Crd1140234468 commented 10 months ago

Hi, I followed the steps you provided and trained for 200,000 steps. When I ran inference, the generated_sentence.txt I got was different from the output sequence shown in the paper. When I write "box; multiple instances; medium; 4; 0; apple, apple, cake, knife;" in beginning.txt, I get "[CLS] box; multiple instances; medium; 4; 0; apple, apple, cake, knife; [ ] 176 ymin 188 xmax 236 ymax 426 ] [SEP] banana xmin 112 ymin 181 xmax 167 ymax 429 ] [SEP] ##r xmin 138 ymin 189 xmax 180 ymax 427 ] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] cell phone xmin 83 ymin 197 xmax 143 ymax 448 ] [SEP] [SEP] [SEP] 94 ymin 202 xmax 139 ymax 422 ] [SEP] [SEP] [SEP] [SEP] [SEP] [ SEP] xmin 144 ymin 182 xmax 230 ymax 420 ] [SEP] [SEP] [SEP] 185 ] [SEP] [SEP] [ xmin [SEP] [SEP] [SEP] [SEP] [SEP] [SEP] xmin . ...". What does [SEP] mean here?

Sierkinhane commented 10 months ago

Hello, could you please generate more results using the following prompt: "box; multiple instances; medium; 4; 0; apple, apple, cake, knife;"? The models can usually generate sequences following this format, but there might be occasional failures.
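When comparing several generations, it can help to pull out only the well-formed boxes and ignore the rest. The following is a minimal sketch, not part of VisorGPT: the function name `parse_boxes` and the pattern `BOX_RE` are hypothetical, and the box layout (`name xmin N ymin N xmax N ymax N`) is inferred only from the output quoted above. It also assumes that `[CLS]` and `[SEP]` are BERT-style special tokens (sequence start and separator) and that `##r` is a WordPiece continuation fragment; the thread does not confirm how the model uses them.

```python
import re

# Hypothetical pattern, inferred from the output quoted in this thread:
# a lowercase category name followed by four labelled integer coordinates.
BOX_RE = re.compile(
    r"(?P<name>[a-z]+(?: [a-z]+)*) xmin (\d+) ymin (\d+) xmax (\d+) ymax (\d+)"
)


def parse_boxes(generated):
    """Return (name, (xmin, ymin, xmax, ymax)) for every well-formed box.

    Assumption: everything up to the last ';' is the prompt, and [SEP]
    separates candidate boxes. Malformed chunks are silently skipped.
    """
    body = generated.rsplit(";", 1)[-1]
    boxes = []
    for chunk in body.split("[SEP]"):
        # Drop surrounding spaces and square brackets before matching.
        match = BOX_RE.fullmatch(chunk.strip(" []"))
        if match is None:
            continue  # e.g. missing "xmin", or a fragment such as "##r"
        xmin, ymin, xmax, ymax = (int(v) for v in match.groups()[1:])
        if xmin < xmax and ymin < ymax:  # keep only geometrically valid boxes
            boxes.append((match.group("name"), (xmin, ymin, xmax, ymax)))
    return boxes


sample = (
    "[CLS] box; multiple instances; medium; 4; 0; apple, apple, cake, knife; "
    "[ ] 176 ymin 188 xmax 236 ymax 426 ] [SEP] "
    "banana xmin 112 ymin 181 xmax 167 ymax 429 ] [SEP] "
    "##r xmin 138 ymin 189 xmax 180 ymax 427 ] [SEP] [SEP]"
)
print(parse_boxes(sample))  # [('banana', (112, 181, 167, 429))]
```

Applied to the start of the output quoted above, only the `banana` box survives, and it is not one of the requested categories, which suggests this particular generation is one of the occasional failures rather than a different output format.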

Crd1140234468 commented 10 months ago

Thanks for your reply. I have figured this out.