Closed holytony closed 5 months ago
Can you show the terminal output when running the demo? The problem might be caused by malformed output from the LLM.
this is what I got on the terminal
python demo_gradio.py
Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00, 2.05s/it]
Some weights of the model checkpoint at models/shakechen/Llama-2-7b-hf were not used when initializing LlamaForCausalLM: ['model.layers.24.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.13.self_attn.rotary_emb.inv_freq', 'model.layers.11.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.30.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.19.self_attn.rotary_emb.inv_freq', 'model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.8.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.22.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.18.self_attn.rotary_emb.inv_freq', 'model.layers.12.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.29.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.5.self_attn.rotary_emb.inv_freq', 'model.layers.6.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.21.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.14.self_attn.rotary_emb.inv_freq', 'model.layers.16.self_attn.rotary_emb.inv_freq', 'model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.0.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq']
- This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
No module 'xformers'. Proceeding without it.
RanniLDM: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
/root//Ranni/ldm/models/diffusion/ddpm.py:165: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recip_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod)))
/root//Ranni/ldm/models/diffusion/ddpm.py:166: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recipm1_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod - 1)))
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
/root//Ranni/demo_gradio.py:313: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  mid_result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:313: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  mid_result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:318: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:318: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:17<00:00, 2.81it/s]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
[{'label': textbox, 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
[{'label': textbox, 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.06it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.00it/s]
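The log prints panel entries as lists of `label`/`box` dictionaries, and some of them show `'label': textbox` rather than a quoted string. A minimal sketch of a sanity check for such entries follows. It assumes the format is exactly what the log shows (a string label plus four numeric box values); the function name `is_valid_panel_entry` is hypothetical and is not part of the Ranni code.

```python
from typing import Any


def is_valid_panel_entry(entry: Any) -> bool:
    """Return True if entry looks like {'label': <non-empty str>, 'box': [4 numbers]}.

    The expected shape is an assumption based on the printed log lines,
    not on the demo's actual parsing code.
    """
    if not isinstance(entry, dict):
        return False
    label = entry.get("label")
    box = entry.get("box")
    # A label that is not a plain string (e.g. a UI component object) is rejected.
    if not isinstance(label, str) or not label.strip():
        return False
    if not isinstance(box, (list, tuple)) or len(box) != 4:
        return False
    # bool is a subclass of int, so exclude it explicitly.
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in box
    )


if __name__ == "__main__":
    good = {"label": "4k image, best quality, extremely detailed",
            "box": [384, 384, 768, 768]}
    bad = {"label": object(), "box": [384, 384, 768, 768]}
    print(is_valid_panel_entry(good), is_valid_panel_entry(bad))
```

Running a check like this on the panel before the panel-to-image step would make a non-string label visible early instead of only showing up as a bad image.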
Try checking the gradio version, and make sure to run the demo in the correct order, i.e. `text-to-panel` and then `panel-to-image`.
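A quick way to check the installed gradio version is sketched below. It uses only the standard library; the helper names `installed_version` and `major_version` are hypothetical. The log shows deprecation warnings for `.style()`, so a different major release may behave differently with this demo (an assumption, not verified here).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(ver: str) -> int:
    """Extract the leading major number from a version string like '3.50.2'."""
    return int(ver.split(".")[0])


if __name__ == "__main__":
    ver = installed_version("gradio")
    if ver is None:
        print("gradio is not installed in this environment")
    else:
        print(f"gradio {ver} (major version {major_version(ver)})")
```

Comparing the printed version against the one pinned in the repository's requirements, if it pins one, is the simplest follow-up.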
Turns out I had modified demo_gradio.py: I commented out the original code, used the code you guys had commented as "local" instead, and pointed it at the Llama 7B I have locally. That evidently caused the all-black image, no bbox, and no box answer issues I was facing.
After I switched back to the original code, which downloads directly from Hugging Face, everything works fine.
However, I'm still curious what could be the reason behind the problem I encountered with the "local" code in demo_gradio.
It seems the local llama you used is not the chat version.
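As a rough illustration, the checkpoint path in the log above (`models/shakechen/Llama-2-7b-hf`) has no `chat` in its name. The sketch below is a name-based heuristic only: it assumes chat checkpoints carry `chat` in their directory or repo name, which is a naming convention rather than a guarantee, and the function name `looks_like_chat_checkpoint` is hypothetical.

```python
from pathlib import PurePosixPath


def looks_like_chat_checkpoint(path_or_repo: str) -> bool:
    """Heuristic: True if the last path component mentions 'chat'.

    This inspects the name only; it does not open or verify the weights.
    """
    name = PurePosixPath(path_or_repo.replace("\\", "/")).name
    return "chat" in name.lower()


if __name__ == "__main__":
    for candidate in ("models/shakechen/Llama-2-7b-hf",
                      "meta-llama/Llama-2-7b-chat-hf"):
        print(candidate, "->", looks_like_chat_checkpoint(candidate))
```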
That might be the case. I will try again later.
After running the demo and selecting an example, this is what I got: