No conversion is needed; you need to make sure that:
Thank you! May I ask which Python version you are using?
3.8
Thank you very much! My environment currently cannot use bitsandbytes versions above 0.42.0; all the other package versions are identical.
(llama3_ft) [xylv@server01 script]$ pip install bitsandbytes-0.43.1-py3-none-manylinux_2_24_x86_64.whl
WARNING: Requirement 'bitsandbytes-0.43.1-py3-none-manylinux_2_24_x86_64.whl' looks like a filename, but the file does not exist
ERROR: bitsandbytes-0.43.1-py3-none-manylinux_2_24_x86_64.whl is not a supported wheel on this platform.
(llama3_ft) [xylv@server01 script]$ python -V
Python 3.8.19
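"is not a supported wheel on this platform" usually means the wheel's manylinux_2_24 tag is newer than what pip will accept on that machine, most often because the system glibc is older than 2.24. A quick way to check, plus a possible fallback (these commands are only a suggestion, not from the original run):

```shell
# list the platform tags pip accepts on this machine (look for manylinux_2_24 entries)
pip debug --verbose | grep -i manylinux | head
# manylinux_2_24 wheels need glibc >= 2.24; print the system glibc version
ldd --version
# if the tags don't match, fall back to the newest bitsandbytes that does install here
pip install "bitsandbytes<0.43"
```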
The cpu optimizer does not support gcc versions below 5, and the gcc version on the lab server cannot be changed, so I disabled the cpu optimizer and ran into the following problem:
I changed "zero_optimization" in ds_config_zero3 so that "offload_optimizer" is {"device": "none"} (sketched below), and then got the error shown after the sketch:
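A minimal sketch of the modified section (my assumption of what it looks like; every other key in ds_config_zero3 is left untouched):

```json
"zero_optimization": {
  "stage": 3,
  "offload_optimizer": { "device": "none" }
}
```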
/home/xylv/anaconda3/envs/llama3_ft/lib/python3.8/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-06-21 22:43:27,797] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented
[2024-06-21 22:43:27,797] [INFO] [comm.py:594:init_distributed] cdb=None
[2024-06-21 22:43:27,797] [INFO] [comm.py:625:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
quantization_config: None
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:14<00:00, 3.69s/it]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
['o_proj', 'k_proj', 'q_proj', 'v_proj']
None
trainable params: 54,525,952 || all params: 8,084,787,200 || trainable%: 0.6744265575722265
Loading data...
Formatting inputs...Skip in lazy mode
/home/xylv/anaconda3/envs/llama3_ft/lib/python3.8/site-packages/transformers/optimization.py:429: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
warnings.warn(
/home/xylv/anaconda3/envs/llama3_ft/lib/python3.8/site-packages/accelerate/accelerator.py:432: FutureWarning: Passing the following arguments to `Accelerator` is deprecated and will be removed in version 1.0 of Accelerate: dict_keys(['dispatch_batches', 'split_batches', 'even_batches', 'use_seedable_sampler']). Please pass an `accelerate.DataLoaderConfiguration` instead:
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)
warnings.warn(
Detected kernel version 3.10.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
Traceback (most recent call last):
File "../finetune_llama3.py", line 434, in optimizers
is not allowed if Deepspeed or PyTorch FSDP is enabled. You should subclass Trainer
and override the create_optimizer_and_scheduler
method.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 56578) of binary: /home/xylv/anaconda3/envs/llama3_ft/bin/python
Traceback (most recent call last):
File "/home/xylv/anaconda3/envs/llama3_ft/bin/torchrun", line 8, in
Is there an alternative solution? Thank you very much. My email is lyuxiangyue@qq.com; if you have another solution and are willing, could you send it to me by email? Many thanks.
Use the ZeRO-2 config.
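In case it helps, a minimal ZeRO-2 style sketch (field values are illustrative; the repo's actual ds_config_zero2 may differ):

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```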
Thanks a lot! It worked after updating gcc!
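For anyone else stuck behind an old system gcc without root, one common workaround (an assumption on my part, not necessarily what was done here) is to bring a newer toolchain into the conda environment:

```shell
# check what the system currently provides
gcc --version
# without root, conda-forge's compiler metapackages install a modern gcc/g++
# into the env and export it via $CC / $CXX on activation
conda install -c conda-forge c-compiler cxx-compiler
# on CentOS 7 (kernel 3.10) with root, an SCL devtoolset is another route:
#   sudo yum install centos-release-scl devtoolset-9-gcc devtoolset-9-gcc-c++
#   scl enable devtoolset-9 bash
```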
No model conversion was performed at run time, which produced an error