Open zhutao1221 opened 4 months ago
According to the tutorial, torch_musa can run transformers on the S80.
Where is the tutorial?
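I misspoke. The tutorial is not out yet. Still, you can follow the usual Pipeline usage from the transformers library and set the device to musa. Taking Gemma as an example: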
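import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "musa"
data_type = torch.half
gemma_2b_path = "google/gemma-2b"  # assumed: Hub ID or a local checkpoint directory
tokenizer = AutoTokenizer.from_pretrained(gemma_2b_path)
model = AutoModelForCausalLM.from_pretrained(
    gemma_2b_path, device_map=device, torch_dtype=data_type)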
Do I need to import torch_musa first? Or has the transformers library already added support for musa?
You need to import it first.
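As a rough illustration of the reply above, the sketch below imports torch_musa before anything requests the "musa" device, and falls back to the CPU when the package is not installed. The helper name `resolve_device` and the CPU fallback are assumptions made for this example; they do not come from the thread or from torch_musa itself.

```python
import importlib
import importlib.util


def resolve_device(preferred: str = "musa") -> str:
    """Return a device string that is safe to pass to transformers.

    Hypothetical helper: for "musa", torch_musa is imported first (as the
    reply above says is required); if it is not installed, fall back to "cpu".
    """
    if preferred != "musa":
        return preferred
    if importlib.util.find_spec("torch_musa") is None:
        # torch_musa is not installed in this environment.
        return "cpu"
    # Same effect as a top-level `import torch_musa`.
    importlib.import_module("torch_musa")
    return "musa"
```

The returned string can then be passed as `device_map`, as in the Gemma example earlier in the thread.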
Following this approach produces the following error:
gemma-2b: /home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/generation/utils.py:1141: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Traceback (most recent call last):
  File "/home/jorden/Softwares/ml/llm/ChatGLM3/basic_demo/test-gemma.py", line 65, in <module>
    main()
  File "/home/jorden/Softwares/ml/llm/ChatGLM3/basic_demo/test-gemma.py", line 59, in main
    outputs = model.generate(inputs_ids)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/generation/utils.py", line 1576, in generate
    result = self._greedy_search(
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/generation/utils.py", line 2494, in _greedy_search
    outputs = self(
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 1118, in forward
    outputs = self.model(
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 926, in forward
    layer_outputs = decoder_layer(
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 646, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 268, in forward
    cos, sin = self.rotary_emb(value_states, position_ids, seq_len=None)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 122, in forward
    with torch.autocast(device_type=device_type, enabled=False):
  File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 201, in __init__
    raise RuntimeError('User specified autocast device_type must be \'cuda\' or \'cpu\'')
RuntimeError: User specified autocast device_type must be 'cuda' or 'cpu'
My environment is Ubuntu 20.04, MT S80, Python 3.10.13, mthreads-gmi: 1.9.0, Driver Version: 2.6.0.
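The UserWarning at the top of the log is separate from the crash: `generate` was called without a length limit, so the default `max_length` of 20 applied. A minimal sketch of passing `max_new_tokens` explicitly follows. The helper name `generate_text`, the default of 64 new tokens, and the way the prompt is tokenized are illustrative assumptions, not the poster's actual script.

```python
def generate_text(model, tokenizer, prompt, device="musa", max_new_tokens=64):
    """Tokenize a prompt, generate a continuation, and decode it.

    Hypothetical helper: `model` and `tokenizer` are expected to be the
    objects returned by AutoModelForCausalLM / AutoTokenizer.from_pretrained.
    """
    # Move the tokenized inputs to the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    # An explicit max_new_tokens replaces the default max_length (=20).
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # generate returns a batch; decode the first (only) sequence.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

This only silences the warning and lifts the 20-token cap; it does not affect the autocast RuntimeError at the end of the traceback.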
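Because the latest Hugging Face implementation of Gemma uses AMP, try replacing the statement at File "/home/jorden/Softwares/anaconda3/envs/mt/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 122, in forward (shown in the traceback above) with:
with torch.musa.amp.autocast
and see whether that works.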
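One way to picture the suggested change is a small helper that picks the autocast context by device type, instead of editing the call by hand. This is only a sketch: the helper name `disabled_autocast` is hypothetical, and it assumes `torch.musa.amp.autocast` accepts `enabled=False` in the same way `torch.cuda.amp.autocast` does, which the thread does not confirm. The `nullcontext` fallback for other device types is likewise an assumption of this example.

```python
import contextlib


def disabled_autocast(device_type: str):
    """Return a context manager that turns autocast off for `device_type`."""
    if device_type == "musa":
        import torch
        import torch_musa  # noqa: F401  (must be imported before torch.musa is used)
        # Assumption: same keyword as torch.cuda.amp.autocast.
        return torch.musa.amp.autocast(enabled=False)
    if device_type in ("cuda", "cpu"):
        import torch
        # The original call from modeling_gemma.py, valid for these two types.
        return torch.autocast(device_type=device_type, enabled=False)
    # Any other backend: enter no autocast context at all.
    return contextlib.nullcontext()
```

Line 122 of modeling_gemma.py would then read `with disabled_autocast(device_type):`. Note that editing a file under site-packages is overwritten whenever transformers is reinstalled or upgraded.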
How do I use the transformers library to run large models on the Moore Threads S80?