all black when running the text to panel

holytony commented 5 months ago

After running the demo and selecting an example, this is what I got:

thss15fyt commented 5 months ago

Can you show the terminal output from running the demo? The problem might be caused by incorrectly formatted output from the LLM.

holytony commented 5 months ago

This is what I got in the terminal:

python demo_gradio.py
Loading checkpoint shards: 0%| |
Loading checkpoint shards: 50%|█████ | 1/2 [00:
Loading checkpoint shards: 100%|██████████| 2/2 [00:
Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00, 2.05s/it]
Some weights of the model checkpoint at models/shakechen/Llama-2-7b-hf were not used when initializing LlamaForCausalLM: ['model.layers.24.self_attn.rotary_emb.inv_freq', 'model.layers.27.self_attn.rotary_emb.inv_freq', 'model.layers.13.self_attn.rotary_emb.inv_freq', 'model.layers.11.self_attn.rotary_emb.inv_freq', 'model.layers.9.self_attn.rotary_emb.inv_freq', 'model.layers.30.self_attn.rotary_emb.inv_freq', 'model.layers.2.self_attn.rotary_emb.inv_freq', 'model.layers.19.self_attn.rotary_emb.inv_freq', 'model.layers.17.self_attn.rotary_emb.inv_freq', 'model.layers.8.self_attn.rotary_emb.inv_freq', 'model.layers.4.self_attn.rotary_emb.inv_freq', 'model.layers.22.self_attn.rotary_emb.inv_freq', 'model.layers.20.self_attn.rotary_emb.inv_freq', 'model.layers.15.self_attn.rotary_emb.inv_freq', 'model.layers.10.self_attn.rotary_emb.inv_freq', 'model.layers.18.self_attn.rotary_emb.inv_freq', 'model.layers.12.self_attn.rotary_emb.inv_freq', 'model.layers.25.self_attn.rotary_emb.inv_freq', 'model.layers.29.self_attn.rotary_emb.inv_freq', 'model.layers.1.self_attn.rotary_emb.inv_freq', 'model.layers.5.self_attn.rotary_emb.inv_freq', 'model.layers.6.self_attn.rotary_emb.inv_freq', 'model.layers.31.self_attn.rotary_emb.inv_freq', 'model.layers.3.self_attn.rotary_emb.inv_freq', 'model.layers.21.self_attn.rotary_emb.inv_freq', 'model.layers.7.self_attn.rotary_emb.inv_freq', 'model.layers.28.self_attn.rotary_emb.inv_freq', 'model.layers.14.self_attn.rotary_emb.inv_freq', 'model.layers.16.self_attn.rotary_emb.inv_freq', 'model.layers.26.self_attn.rotary_emb.inv_freq', 'model.layers.0.self_attn.rotary_emb.inv_freq', 'model.layers.23.self_attn.rotary_emb.inv_freq']

This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).

This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
No module 'xformers'. Proceeding without it.
RanniLDM: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
/root//Ranni/ldm/models/diffusion/ddpm.py:165: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recip_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod)))
/root//Ranni/ldm/models/diffusion/ddpm.py:166: RuntimeWarning: divide by zero encountered in divide
  self.register_buffer('sqrt_recipm1_alphas_cumprod', to_torch(np.sqrt(1. / alphas_cumprod - 1)))
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
/root//Ranni/demo_gradio.py:313: GradioDeprecationWarning: The style method is deprecated. Please set these arguments in the constructor instead.
  mid_result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:313: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  mid_result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:318: GradioDeprecationWarning: The style method is deprecated. Please set these arguments in the constructor instead.
  result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
/root//Ranni/demo_gradio.py:318: GradioDeprecationWarning: The 'grid' parameter will be deprecated. Please use 'columns' in the constructor instead.
  result_gallery = gr.Gallery(label='Output', show_label=True, elem_id="gallery").style(grid=1, height='auto')
Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:17<00:00, 2.81it/s]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
[{'label': textbox, 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.07it/s]
[{'label': textbox, 'box': [384, 384, 768, 768]}]
[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.06it/s]
Data shape for DDIM sampling is (1, 4, 96, 96), eta 0.0
Running DDIM Sampling with 50 timesteps
DDIM Sampler: 100%|██████████| 50/50 [00:16<00:00, 3.00it/s]
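One way to test the "wrongly formatted output" theory is to check whether each printed panel line parses as a list of `{'label', 'box'}` dicts. The sketch below is an illustration only, not code from the Ranni repository: `parse_panel` is a hypothetical helper, and the expected shape (a string label plus a four-number box) is an assumption read off the log lines in this thread.

```python
import ast


def parse_panel(text):
    """Parse a printed panel string into a list of {'label', 'box'} dicts.

    Hypothetical helper for illustration. The expected shape is an
    assumption based on the log lines in this thread.
    Raises ValueError when the text does not match that shape.
    """
    try:
        # literal_eval only accepts Python literals, so a bare name
        # such as `textbox` is rejected instead of being evaluated.
        panel = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"panel is not a valid literal: {exc}") from exc

    if not isinstance(panel, list):
        raise ValueError("panel must be a list of dicts")

    for item in panel:
        if not isinstance(item, dict) or set(item) != {"label", "box"}:
            raise ValueError(f"unexpected panel entry: {item!r}")
        if not isinstance(item["label"], str):
            raise ValueError(f"label must be a string: {item['label']!r}")
        box = item["box"]
        is_box = (
            isinstance(box, list)
            and len(box) == 4
            and all(isinstance(v, (int, float)) for v in box)
        )
        if not is_box:
            raise ValueError(f"box must be a list of four numbers: {box!r}")
    return panel


for line in (
    "[{'label': '4k image, best quality, extremely detailed', 'box': [384, 384, 768, 768]}]",
    "[{'label': textbox, 'box': [384, 384, 768, 768]}]",
):
    try:
        parse_panel(line)
        print("ok:", line)
    except ValueError as err:
        print("malformed:", err)
```

Under these assumptions the first line passes and the `textbox` line is reported as malformed, because its label is a bare name rather than a quoted string. The log alone does not show what `textbox` stands for in the demo.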

thss15fyt commented 5 months ago

Check the gradio version, and make sure to run the demo in the correct order: text-to-panel first, then panel-to-image.
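A quick way to check the installed gradio version is to ask the package metadata, as in the minimal sketch below. `installed_version` and `major_version` are hypothetical helper names, and the thread does not state which gradio version the demo expects, so compare the result against the repository's own requirements.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version):
    """Return the leading integer of a dotted version string."""
    return int(version.split(".")[0])


version = installed_version("gradio")
if version is None:
    print("gradio is not installed in this environment")
else:
    print(f"gradio {version} (major version {major_version(version)})")
```

The `GradioDeprecationWarning` lines in the log indicate that the installed gradio still accepts `.style(...)` but flags it as deprecated, which is worth keeping in mind when choosing a version.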

holytony commented 5 months ago

> Check the gradio version, and make sure to run the demo in the correct order: text-to-panel first, then panel-to-image.

It turns out I had modified demo_gradio.py: I commented out the original code, used the code you marked as "local" instead, and moved my local Llama 7B to that path. That evidently caused the problems I was facing: an all-black image, no bbox, and no box answer.

After I switched back to the original code, which downloads the model directly from Hugging Face, everything works fine.

However, I'm still curious what caused the problem when I used the "local" code in demo_gradio.py.

thss15fyt commented 5 months ago

It seems the local llama you used is not the chat version.
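Base and chat variants of Llama 2 are published as separate checkpoints, with names such as `Llama-2-7b-hf` and `Llama-2-7b-chat-hf`. The sketch below is a rough heuristic, not part of the Ranni code: `looks_like_chat_checkpoint` is a hypothetical helper that only looks for the word "chat" in the directory name and in the `_name_or_path` field of `config.json`, on the assumption that the checkpoint kept its original naming.

```python
import json
from pathlib import Path


def looks_like_chat_checkpoint(model_dir):
    """Heuristic: True if the checkpoint's naming mentions "chat".

    Hypothetical helper. A renamed checkpoint defeats this check,
    so treat the result as a hint, not proof.
    """
    path = Path(model_dir)
    names = [path.name]

    config_file = path / "config.json"
    if config_file.is_file():
        try:
            config = json.loads(config_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            config = {}
        if isinstance(config, dict):
            # `_name_or_path` is written by transformers when saving a model.
            names.append(str(config.get("_name_or_path", "")))

    return any("chat" in name.lower() for name in names)


# The path below is the one printed in the log earlier in this thread.
print(looks_like_chat_checkpoint("models/shakechen/Llama-2-7b-hf"))
```

A name without "chat", like the path in the log, is consistent with a base model, which would fit the diagnosis above.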

holytony commented 5 months ago

> It seems the local llama you used is not the chat version.

That might be the case. I will try again later.

ali-vilab / Ranni

all black when running the text to panel #13