Closed aoke79 closed 8 months ago
V-RAM utilization:
generate.txt — attached is the code I modified; please change the extension from "txt" to "py".
I updated bigdl to the latest version and this issue is gone. It works well now. Could you please help check whether my time/token computation is correct? Thanks a lot.
The performance is currently good for me. Could you please give it a try to confirm what I did? Thanks a ton. log_code_1.1_model_1.5_0318_fp32_02.txt
I think the total generation time over all tokens is correct. However, generation time consists of first-token latency and rest-token latency, which are better benchmarked separately. You can refer to our benchmark to add a wrapper that prints both first-token and rest-token performance.
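The split described above can be sketched with a small timing wrapper. This is only a minimal illustration, not the actual BigDL benchmark wrapper: `generate_step` is a hypothetical stand-in for one decode step of a token-by-token generation loop.

```python
import time

def benchmark_generate(generate_step, num_tokens):
    """Time a token-by-token generation loop and report first-token and
    rest-token latency separately.

    `generate_step` is a hypothetical callable that produces one token per
    call (a stand-in for one decode step of a real model).
    Returns (first_token_ms, avg_rest_token_ms).
    """
    start = time.perf_counter()
    stamps = []
    for _ in range(num_tokens):
        generate_step()
        # Record the completion time of every token.
        stamps.append(time.perf_counter())
    # First-token latency: start of generation to first token.
    first_ms = (stamps[0] - start) * 1000.0
    # Rest-token latency: average over the remaining tokens.
    if len(stamps) > 1:
        rest_ms = (stamps[-1] - stamps[0]) * 1000.0 / (len(stamps) - 1)
    else:
        rest_ms = float("nan")
    return first_ms, rest_ms
```

With a real model, `generate_step` would be one forward pass in the decode loop; the first-token number then includes prompt (prefill) processing, which is why it is usually much larger than the rest-token average and should be reported separately.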
Hi, I used the LLaVA v1.1.0 release from https://github.com/haotian-liu/LLaVA/releases/tag/v1.1.0 with the llava-v1.5-7b model from https://huggingface.co/liuhaotian/llava-v1.5-7b, and followed the sample code at https://github.com/intel-analytics/BigDL/tree/main/python/llm/example/GPU/PyTorch-Models/Model/llava to run it successfully on an MTL GPU. It works well, thanks very much. However, there are some issues:
(env_bigdl_mtl) C:\AIGC\bigdl\BigDL-main\python\llm\example\GPU\PyTorch-Models\Model\llava\LLaVA-1.1.1>call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
:: initializing oneAPI environment...
Initializing Visual Studio command-line environment...
Visual Studio version 17.8.4 environment configured.
"C:\Program Files\Microsoft Visual Studio\2022\Community\"
Visual Studio command-line environment initialized for: 'x64'
:  advisor -- latest
:  compiler -- latest
:  dal -- latest
:  debugger -- latest
:  dev-utilities -- latest
:  dnnl -- latest
:  dpcpp-ct -- latest
:  dpl -- latest
:  ipp -- latest
:  ippcp -- latest
:  mkl -- latest
:  tbb -- latest
:  vtune -- latest
:: oneAPI environment initialized ::
C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
  warn(
2024-03-14 23:27:03,646 - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:30<00:00, 15.32s/it]
Some weights of the model checkpoint at openai/clip-vit-large-patch14-336 were not used when initializing CLIPVisionModel: ['text_model.encoder.layers.6.mlp.fc2.weight', 'text_model.encoder.layers.0.mlp.fc2.weight', 'text_model.encoder.layers.11.mlp.fc1.weight', ... (long list of unused text_model / projection weights truncated) ...]
-------------------- Status is --------------------
INFO: input token count: 51
INFO: output token count: 114
INFO: token Time (ms): 176.53
-------------------- end --------------------
ASSISTANT: Yes, this painting was created by Leonardo da Vinci. This Renaissance-era artwork is known as a portrait of a woman, often referred to as the Mona Lisa. It is renowned for its unique smile, captured on a woman's face, which has contributed to making the image iconic. The painting features the face of the woman and does not show her figure, as is the case in the scene presented in the image. The style of the painting is characterized by the high contrast and realistic portrayal of the subject.
USER: Can you describe this painting?
INFO: tokenizer image Time (ms): 3.88
The painting is an iconic portrait of a woman, which has been widely celebrated and cherished over the years due to its captivating subject matter and the artist's unrivaled skill. It features a close-up shot of the woman's face, with a focus on her expression. Her enigmatic and serene smile has become synonymous with the painting itself.
The artist's style can be characterized by the use of high contrast and fine attention to detail, which allows him to capture the nuances of the subject's features. The realism of the woman's appearance in the painting further highlights the artist's skill and talent.
In terms of the content, the image does not show the woman's figure, and instead, it showcases just the woman's face and expression. This adds a sense of intimacy and depth to the painting, which is why it has become a recognizable and enduring piece of art.
-------------------- Status is --------------------
INFO: input token count: 179
INFO: output token count: 205
INFO: token Time (ms): 103.64
-------------------- end --------------------
ASSISTANT: The painting is an iconic portrait of a woman, which has been widely celebrated and cherished over the years due to its captivating subject matter and the artist's unrivaled skill. It features a close-up shot of the woman's face, with a focus on her expression. Her enigmatic and serene smile has become synonymous with the painting itself.
The artist's style can be characterized by the use of high contrast and fine attention to detail, which allows him to capture the nuances of the subject's features. The realism of the woman's appearance in the painting further highlights the artist's skill and talent.
In terms of the content, the image does not show the woman's figure, and instead, it showcases just the woman's face and expression. This adds a sense of intimacy and depth to the painting, which is why it has become a recognizable and enduring piece of art.
USER: what should I know about the painting?
INFO: tokenizer image Time (ms): 4.60
The painting in question is a world-famous portrait, made by the prominent Italian artist Leonardo da Vinci during the Renaissance era. The artwork, often referred to as La Gioconda or the Mona Lisa, has become one of the most iconic depictions of a person and is recognized globally due to its historical, artistic, and cultural significance. The painting originally started as a charcoal sketch, and later it was developed into an oil painting.
The image showcases a close-up of the painting itself, focusing on the woman's face and expression. The woman's serene smile and her enigmatic gaze have become synonymous with the portrait and have contributed to its enduring popularity. The painting is notable for its detailed representation of the subject, and the artist's mastery of shading and contrasting techniques.
The portrait has been the subject of numerous discussions, interpretations, and even various theories surrounding the identity of the woman depicted. Despite these debates and multiple reproductions, the original, still surviving artwork continues to captivate viewers and remains an essential piece of global art history.
-------------------- Status is --------------------
INFO: input token count: 400
INFO: output token count: 249
INFO: token Time (ms): 117.18
-------------------- end --------------------
ASSISTANT: The painting in question is a world-famous portrait, made by the prominent Italian artist Leonardo da Vinci during the Renaissance era. The artwork, often referred to as La Gioconda or the Mona Lisa, has become one of the most iconic depictions of a person and is recognized globally due to its historical, artistic, and cultural significance. The painting originally started as a charcoal sketch, and later it was developed into an oil painting.
The image showcases a close-up of the painting itself, focusing on the woman's face and expression. The woman's serene smile and her enigmatic gaze have become synonymous with the portrait and have contributed to its enduring popularity. The painting is notable for its detailed representation of the subject, and the artist's mastery of shading and contrasting techniques.
The portrait has been the subject of numerous discussions, interpretations, and even various theories surrounding the identity of the woman depicted. Despite these debates and multiple reproductions, the original, still surviving artwork continues to captivate viewers and remains an essential piece of global art history.
USER: is there any other paintings like this?
INFO: tokenizer image Time (ms): 5.16
Yes, there are several similar paintings in terms of the subject matter but with varying executions and styles. Here are some notable examples:
These and other paintings share elements such as the focused subjects, unique expressions, and even portrayals of historical or mythological figures. However, without more context, it is difficult to ascertain the precise connection or relation between these pieces of art. Nonetheless, they display similarities in terms of their themes and interpretations, making them notable works
Traceback (most recent call last):
  File "C:\AIGC\bigdl\BigDL-main\python\llm\example\GPU\PyTorch-Models\Model\llava\LLaVA-1.1.1\generate.py", line 342, in <module>
    output_ids = model.generate(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
    return self.sample(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
    outputs = self(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AIGC\bigdl\BigDL-main\python\llm\example\GPU\PyTorch-Models\Model\llava\LLaVA-1.1.1\llava\model\language_model\llava_llama.py", line 78, in forward
    outputs = self.model(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\transformers\models\llama\modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\transformers\models\llama\modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\ProgramData\anaconda3\envs\env_bigdl_mtl\lib\site-packages\transformers\models\llama\modeling_llama.py", line 227, in forward
    attn_weights = attn_weights + attention_mask
RuntimeError: Native API failed. Native API returns: -999 (Unknown PI error) -999 (Unknown PI error)
(env_bigdl_mtl) C:\AIGC\bigdl\BigDL-main\python\llm\example\GPU\PyTorch-Models\Model\llava\LLaVA-1.1.1>