alibaba / Pai-Megatron-Patch

The official repository of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
Apache License 2.0

TypeError: get_cpu_offload_context() missing 1 required positional argument: 'weight_offloading' #324

Closed ben-8878 closed 1 month ago

ben-8878 commented 1 month ago
./Pai-Megatron-Patch/megatron_patch/model/llava/clip_encoder.py:26: UserWarning: The cvcuda environment does not exist. Install cvcuda and use it
  warnings.warn("The cvcuda environment does not exist. Install cvcuda and use it")
Traceback (most recent call last):
  File "/home/ybZhang/miniconda3/envs/glm-m/lib/python3.8/site-packages/swift/cli/export.py", line 5, in <module>
    export_main()
  File "/home/ybZhang/miniconda3/envs/glm-m/lib/python3.8/site-packages/swift/utils/run_utils.py", line 32, in x_main
    result = llm_x(args, **kwargs)
  File "/home/ybZhang/miniconda3/envs/glm-m/lib/python3.8/site-packages/swift/llm/export.py", line 302, in llm_export
    convert_hf_to_megatron(model, extra_args, args.torch_dtype)
  File "/home/ybZhang/miniconda3/envs/glm-m/lib/python3.8/site-packages/swift/llm/megatron/convert.py", line 22, in convert_hf_to_megatron
    mg_model = model_provider()
  File "/home/ybZhang/miniconda3/envs/glm-m/lib/python3.8/site-packages/swift/llm/megatron/model.py", line 61, in model_provider
    model = gpt_model_cls(
  File "./Pai-Megatron-Patch/megatron_patch/model/qwen2/model.py", line 103, in __init__
    self.decoder = TransformerBlock(
  File "./Pai-Megatron-Patch/megatron_patch/model/qwen2/transformer_block.py", line 137, in __init__
    ) = get_cpu_offload_context(
TypeError: get_cpu_offload_context() missing 1 required positional argument: 'weight_offloading'
jerryli1981 commented 1 month ago

Hello, we have fixed this issue in the upcoming llama3.1 release. After that, we will upgrade qwen2 so that qwen2 and llama3.1 share the same base. Currently qwen2 is based on 240612 while llama3.1 is based on 240718. Once upgraded, qwen2 will support all the new features, including flashattention-3.

wning13 commented 1 month ago

@jerryli1981 Is there any way to work around this issue for now?
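Until the upgrade lands, the error suggests a version skew: the installed transformer_engine's `get_cpu_offload_context` gained a required `weight_offloading` parameter that the older `transformer_block.py` call site does not pass. One generic workaround pattern is to shim the function so the missing argument is backfilled with a default. The sketch below is an illustration only, not the project's fix: `backfill_kwarg` is a hypothetical helper, and the `get_cpu_offload_context` defined here is a stand-in whose signature may not match the real transformer_engine API.

```python
import inspect
from functools import wraps


def backfill_kwarg(func, name, default):
    """Return a wrapper that supplies `default` for parameter `name`
    whenever the caller omits it. If `func` has no such parameter
    (i.e. an older API), return `func` unchanged."""
    sig = inspect.signature(func)
    if name not in sig.parameters:
        return func

    @wraps(func)
    def wrapper(*args, **kwargs):
        # Bind only what the caller actually supplied, so we can detect
        # whether `name` was covered positionally or by keyword.
        bound = sig.bind_partial(*args, **kwargs)
        if name not in bound.arguments:
            kwargs[name] = default
        return func(*args, **kwargs)

    return wrapper


# Stand-in for transformer_engine's get_cpu_offload_context; newer
# versions require an extra `weight_offloading` argument (assumption).
def get_cpu_offload_context(enabled, num_layers,
                            activation_offloading, weight_offloading):
    return (enabled, weight_offloading)


# Patch once at import time; the old-style three-argument call from
# transformer_block.py then works again with weight offloading disabled.
patched = backfill_kwarg(get_cpu_offload_context, "weight_offloading", False)
print(patched(True, 24, True))  # -> (True, False)
```

A cleaner long-term fix is to pin transformer_engine to the version the patch was developed against, or wait for the announced qwen2 base upgrade; the shim only papers over the mismatch.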