THUDM / ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

[BUG/Help] Running bash ds_train_finetune.sh fails: deepspeed_init() got an unexpected keyword argument 'resume_from_checkpoint' #217

Open LittleXu1998 opened 1 year ago

LittleXu1998 commented 1 year ago

Is there an existing issue for this?

Current Behavior

[2023-07-06 02:48:12,405] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2023-07-06 02:48:13,388] [WARNING] [runner.py:196:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only. [2023-07-06 02:48:13,736] [INFO] [runner.py:555:main] cmd = /usr/local/miniconda3/envs/chatglm6/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMCwgMV19 --master_addr=127.0.0.1 --master_port=43946 --enable_each_rank_log=None main.py --deepspeed deepspeed.json --do_train --train_file AdvertiseGen/train.json --test_file AdvertiseGen/dev.json --prompt_column content --response_column summary --overwrite_cache --model_name_or_path /hy-tmp/ChatGLM2-6B/chatglm2-6b --output_dir ./output/adgen-chatglm2-6b-ft-1e-4 --overwrite_output_dir --max_source_length 64 --max_target_length 64 --per_device_train_batch_size 4 --per_device_eval_batch_size 1 --gradient_accumulation_steps 1 --predict_with_generate --max_steps 5000 --logging_steps 10 --save_steps 1000 --learning_rate 1e-4 --fp16 [2023-07-06 02:48:15,154] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2023-07-06 02:48:16,212] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0, 1]} [2023-07-06 02:48:16,212] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=2, node_rank=0 [2023-07-06 02:48:16,212] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0, 1]}) [2023-07-06 02:48:16,212] [INFO] [launch.py:163:main] dist_world_size=2 [2023-07-06 02:48:16,212] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0,1 [2023-07-06 02:48:19,010] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2023-07-06 02:48:19,051] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2023-07-06 02:48:19,644] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented [2023-07-06 02:48:19,644] [INFO] [comm.py:594:init_distributed] cdb=None [2023-07-06 02:48:19,644] [INFO] [comm.py:625:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl [2023-07-06 02:48:19,741] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented [2023-07-06 02:48:19,741] [INFO] [comm.py:594:init_distributed] cdb=None 07/06/2023 02:48:19 - WARNING - main - Process rank: 1, device: cuda:1, n_gpu: 1distributed training: True, 16-bits training: True 07/06/2023 02:48:19 - WARNING - main - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: True 07/06/2023 02:48:19 - INFO - main - Training/evaluation parameters Seq2SeqTrainingArguments( _n_gpu=1, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_backend=None, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=deepspeed.json, disable_tqdm=False, do_eval=False, do_predict=False, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=None, evaluation_strategy=no, fp16=True, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_config={'fsdp_min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}, fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, 
generation_config=None, generation_max_length=None, generation_num_beams=None, gradient_accumulation_steps=1, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=0.0001, length_column_name=length, load_best_model_at_end=False, local_rank=0, log_level=passive, log_level_replica=warning, log_on_each_node=True, logging_dir=./output/adgen-chatglm2-6b-ft-1e-4/runs/Jul06_02-48-19_I139d792b0a01101e79, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=5000, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=3.0, optim=adamw_hf, optim_args=None, output_dir=./output/adgen-chatglm2-6b-ft-1e-4, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=1, per_device_train_batch_size=4, predict_with_generate=True, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['wandb'], resume_from_checkpoint=None, run_name=./output/adgen-chatglm2-6b-ft-1e-4, save_on_each_node=False, save_safetensors=False, save_steps=1000, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, sortish_sampler=False, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_num_cores=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xpu_backend=None, ) 07/06/2023 02:48:20 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-513a9364703395e0/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51) 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 537.63it/s] 07/06/2023 02:48:21 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-513a9364703395e0/0.0.0/0f7e3662623656454fcd2b650f34e886a7db4b9104504885bd462096cc7a9f51) 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 630.20it/s] [INFO|configuration_utils.py:667] 2023-07-06 02:48:21,073 >> loading configuration file /hy-tmp/ChatGLM2-6B/chatglm2-6b/config.json [INFO|configuration_utils.py:667] 2023-07-06 02:48:21,074 >> loading configuration file /hy-tmp/ChatGLM2-6B/chatglm2-6b/config.json [INFO|configuration_utils.py:725] 2023-07-06 02:48:21,075 >> Model config ChatGLMConfig { "_name_or_path": "/hy-tmp/ChatGLM2-6B/chatglm2-6b", "add_bias_linear": false, "add_qkv_bias": true, "apply_query_key_layer_scaling": true, "apply_residual_connection_post_layernorm": false, "architectures": [ "ChatGLMModel" ], "attention_dropout": 0.0, "attention_softmax_in_fp32": true, "auto_map": { "AutoConfig": "configuration_chatglm.ChatGLMConfig", "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration", "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration" }, "bias_dropout_fusion": true, "eos_token_id": 2, "ffn_hidden_size": 13696, "fp32_residual_connection": 
false, "hidden_dropout": 0.0, "hidden_size": 4096, "kv_channels": 128, "layernorm_epsilon": 1e-05, "model_type": "chatglm", "multi_query_attention": true, "multi_query_group_num": 2, "num_attention_heads": 32, "num_layers": 28, "original_rope": true, "pad_token_id": 0, "padded_vocab_size": 65024, "post_layer_norm": true, "pre_seq_len": null, "prefix_projection": false, "quantization_bit": 0, "rmsnorm": true, "seq_length": 32768, "tie_word_embeddings": false, "torch_dtype": "float16", "transformers_version": "4.30.2", "use_cache": true, "vocab_size": 65024 }

[INFO|tokenization_utils_base.py:1821] 2023-07-06 02:48:21,077 >> loading file tokenizer.model [INFO|tokenization_utils_base.py:1821] 2023-07-06 02:48:21,077 >> loading file added_tokens.json [INFO|tokenization_utils_base.py:1821] 2023-07-06 02:48:21,077 >> loading file special_tokens_map.json [INFO|tokenization_utils_base.py:1821] 2023-07-06 02:48:21,077 >> loading file tokenizer_config.json Loading checkpoint shards: 0%| | 0/7 [00:00<?, ?it/s][INFO|modeling_utils.py:2575] 2023-07-06 02:48:21,171 >> loading weights file /hy-tmp/ChatGLM2-6B/chatglm2-6b/pytorch_model.bin.index.json [INFO|configuration_utils.py:577] 2023-07-06 02:48:21,172 >> Generate config GenerationConfig { "_from_model_config": true, "eos_token_id": 2, "pad_token_id": 0, "transformers_version": "4.30.2" }

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:12<00:00, 1.72s/it] [INFO|modeling_utils.py:3295] 2023-07-06 02:48:33,321 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[INFO|modeling_utils.py:3303] 2023-07-06 02:48:33,321 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /hy-tmp/ChatGLM2-6B/chatglm2-6b. If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training. [INFO|modeling_utils.py:2927] 2023-07-06 02:48:33,324 >> Generation config file not found, using a generation config created from the model config. Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:12<00:00, 1.85s/it] input_ids [64790, 64792, 790, 30951, 517, 30910, 30939, 30996, 13, 13, 54761, 31211, 33467, 31010, 56532, 30998, 55090, 54888, 31010, 40833, 30998, 32799, 31010, 40589, 30998, 37505, 31010, 37216, 30998, 56532, 54888, 31010, 56529, 56158, 56532, 13, 13, 55437, 31211, 30910, 40833, 54530, 56529, 56158, 56532, 54551, 33808, 32041, 55360, 55486, 32138, 31123, 32943, 33481, 54880, 31664, 46565, 54799, 31155, 33051, 54591, 55432, 33481, 31123, 55622, 32904, 55432, 54557, 56158, 54625, 30943, 55055, 35590, 40833, 54530, 56532, 56158, 31123, 48466, 57148, 55343, 54603, 49355, 55674, 31155, 51605, 55119, 54642, 31799, 54535, 57036, 55625, 31123, 46839, 55113, 56089, 33894, 55778, 31902, 55017, 54706, 56382, 56382, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] inputs [Round 1]

问:类型#裤*版型#宽松*风格#性感*图案#线条*裤型#阔腿裤

答: 宽松的阔腿裤这两年真的吸粉不少,明星时尚达人的心头爱。毕竟好穿时尚,谁都能穿出腿长2米的效果宽松的裤腿,当然是遮肉小能手啊。上身随性自然不拘束,面料亲肤舒适贴身体验感棒棒

label_ids [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 30910, 40833, 54530, 56529, 56158, 56532, 54551, 33808, 32041, 55360, 55486, 32138, 31123, 32943, 33481, 54880, 31664, 46565, 54799, 31155, 33051, 54591, 55432, 33481, 31123, 55622, 32904, 55432, 54557, 56158, 54625, 30943, 55055, 35590, 40833, 54530, 56532, 56158, 31123, 48466, 57148, 55343, 54603, 49355, 55674, 31155, 51605, 55119, 54642, 31799, 54535, 57036, 55625, 31123, 46839, 55113, 56089, 33894, 55778, 31902, 55017, 54706, 56382, 56382, 2, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]

labels 宽松的阔腿裤这两年真的吸粉不少,明星时尚达人的心头爱。毕竟好穿时尚,谁都能穿出腿长2米的效果宽松的裤腿,当然是遮肉小能手啊。上身随性自然不拘束,面料亲肤舒适贴身体验感棒棒

Running tokenizer on train dataset: 0%| | 0/114599 [00:00<?, ? examples/s]

Traceback (most recent call last):

/hy-tmp/ChatGLM2-6B/ptuning/main.py:425 in <module>
    422
    423
    424 if __name__ == "__main__":
❱   425     main()
    426

/hy-tmp/ChatGLM2-6B/ptuning/main.py:364 in main
    361         # checkpoint = last_checkpoint
    362         model.gradient_checkpointing_enable()
    363         model.enable_input_require_grads()
❱   364         train_result = trainer.train(resume_from_checkpoint=checkpoint)
    365         # trainer.save_model()  # Saves the tokenizer too for easy upload
    366
    367         metrics = train_result.metrics

/hy-tmp/ChatGLM2-6B/ptuning/trainer.py:1635 in train
    1632         inner_training_loop = find_executable_batch_size(
    1633             self._inner_training_loop, self._train_batch_size, args.auto_find_batch_size
    1634         )
❱   1635         return inner_training_loop(
    1636             args=args,
    1637             resume_from_checkpoint=resume_from_checkpoint,
    1638             trial=trial,

/hy-tmp/ChatGLM2-6B/ptuning/trainer.py:1704 in _inner_training_loop
    1701             or self.fsdp is not None
    1702         )
    1703         if args.deepspeed:
❱   1704             deepspeed_engine, optimizer, lr_scheduler = deepspeed_init(
    1705                 self, num_training_steps=max_steps, resume_from_checkpoint=resume_from_c
    1706             )
    1707             self.model = deepspeed_engine.module

TypeError: deepspeed_init() got an unexpected keyword argument 'resume_from_checkpoint'

Running tokenizer on train dataset: 4%|██▉ | 5000/114599 [00:02<00:59, 1855.19 examples/s]
[2023-07-06 02:49:50,391] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 10116
[2023-07-06 02:49:50,391] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 10117
[2023-07-06 02:49:52,063] [ERROR] [launch.py:321:sigkill_handler] ['/usr/local/miniconda3/envs/chatglm6/bin/python', '-u', 'main.py', '--local_rank=1', '--deepspeed', 'deepspeed.json', '--do_train', '--train_file', 'AdvertiseGen/train.json', '--test_file', 'AdvertiseGen/dev.json', '--prompt_column', 'content', '--response_column', 'summary', '--overwrite_cache', '--model_name_or_path', '/hy-tmp/ChatGLM2-6B/chatglm2-6b', '--output_dir', './output/adgen-chatglm2-6b-ft-1e-4', '--overwrite_output_dir', '--max_source_length', '64', '--max_target_length', '64', '--per_device_train_batch_size', '4', '--per_device_eval_batch_size', '1', '--gradient_accumulation_steps', '1', '--predict_with_generate', '--max_steps', '5000', '--logging_steps', '10', '--save_steps', '1000', '--learning_rate', '1e-4', '--fp16'] exits with return code = 1
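The failing call is in the ptuning/trainer.py bundled with this repo, which still passes resume_from_checkpoint to deepspeed_init. The installed transformers 4.30.2 reworked its DeepSpeed integration, and its deepspeed_init no longer accepts that keyword, hence the TypeError. A minimal way to confirm what the installed release actually exposes (the transformers.deepspeed import path below is the 4.30.x layout and is an assumption for other versions):

    # Print the signature of the deepspeed_init that ptuning/trainer.py ends up calling.
    import inspect
    from transformers.deepspeed import deepspeed_init

    print(inspect.signature(deepspeed_init))
    # If 'resume_from_checkpoint' is not listed in the printed signature, the
    # trainer.train(...) call above will fail with exactly this TypeError.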

Expected Behavior

No response

Steps To Reproduce

# packages in environment at /usr/local/miniconda3/envs/chatglm6:
#
# Name                    Version                   Build  Channel

_libgcc_mutex 0.1 main defaults _openmp_mutex 5.1 1_gnu defaults accelerate 0.20.2 pypi_0 pypi aiofiles 23.1.0 pypi_0 pypi aiohttp 3.8.4 pypi_0 pypi aiosignal 1.3.1 pypi_0 pypi altair 5.0.0 pypi_0 pypi anyio 3.6.2 pypi_0 pypi appdirs 1.4.4 pypi_0 pypi argon2-cffi 21.3.0 pypi_0 pypi argon2-cffi-bindings 21.2.0 pypi_0 pypi arrow 1.2.3 pypi_0 pypi asttokens 2.2.1 pypi_0 pypi async-timeout 4.0.2 pypi_0 pypi attrs 23.1.0 pypi_0 pypi backcall 0.2.0 pypi_0 pypi beautifulsoup4 4.12.2 pypi_0 pypi bitsandbytes 0.39.0 pypi_0 pypi bleach 6.0.0 pypi_0 pypi bzip2 1.0.8 h7b6447c_0 defaults ca-certificates 2023.01.10 h06a4308_0 defaults certifi 2022.12.7 pypi_0 pypi cffi 1.15.1 pypi_0 pypi charset-normalizer 2.1.1 pypi_0 pypi click 8.1.3 pypi_0 pypi cmake 3.25.0 pypi_0 pypi comm 0.1.3 pypi_0 pypi contourpy 1.0.7 pypi_0 pypi cpm-kernels 1.0.11 pypi_0 pypi cycler 0.11.0 pypi_0 pypi datasets 2.10.1 pypi_0 pypi debugpy 1.6.7 pypi_0 pypi decorator 5.1.1 pypi_0 pypi deepspeed 0.9.5 pypi_0 pypi defusedxml 0.7.1 pypi_0 pypi dill 0.3.6 pypi_0 pypi docker-pycreds 0.4.0 pypi_0 pypi et-xmlfile 1.1.0 pypi_0 pypi executing 1.2.0 pypi_0 pypi fastapi 0.95.1 pypi_0 pypi fastjsonschema 2.16.3 pypi_0 pypi ffmpy 0.3.0 pypi_0 pypi filelock 3.9.0 pypi_0 pypi fonttools 4.39.4 pypi_0 pypi fqdn 1.5.1 pypi_0 pypi frozenlist 1.3.3 pypi_0 pypi fsspec 2023.5.0 pypi_0 pypi gitdb 4.0.10 pypi_0 pypi gitpython 3.1.31 pypi_0 pypi gradio 3.30.0 pypi_0 pypi gradio-client 0.2.4 pypi_0 pypi h11 0.14.0 pypi_0 pypi hjson 3.1.0 pypi_0 pypi httpcore 0.17.0 pypi_0 pypi httpx 0.24.0 pypi_0 pypi huggingface-hub 0.14.1 pypi_0 pypi idna 3.4 pypi_0 pypi ipykernel 6.23.1 pypi_0 pypi ipython 8.13.2 pypi_0 pypi ipython-genutils 0.2.0 pypi_0 pypi ipywidgets 8.0.6 pypi_0 pypi isoduration 20.11.0 pypi_0 pypi jedi 0.18.2 pypi_0 pypi jieba 0.42.1 pypi_0 pypi jinja2 3.1.2 pypi_0 pypi joblib 1.2.0 pypi_0 pypi jsonpointer 2.3 pypi_0 pypi jsonschema 4.17.3 pypi_0 pypi jupyter 1.0.0 pypi_0 pypi jupyter-client 8.2.0 pypi_0 pypi jupyter-console 6.6.3 pypi_0 pypi jupyter-core 5.3.0 pypi_0 pypi jupyter-events 0.6.3 pypi_0 pypi jupyter-server 2.5.0 pypi_0 pypi jupyter-server-terminals 0.4.4 pypi_0 pypi jupyterlab-pygments 0.2.2 pypi_0 pypi jupyterlab-widgets 3.0.7 pypi_0 pypi kiwisolver 1.4.4 pypi_0 pypi latex2mathml 3.75.5 pypi_0 pypi ld_impl_linux-64 2.38 h1181459_1 defaults libffi 3.4.4 h6a678d5_0 defaults libgcc-ng 11.2.0 h1234567_1 defaults libgomp 11.2.0 h1234567_1 defaults libstdcxx-ng 11.2.0 h1234567_1 defaults libuuid 1.41.5 h5eee18b_0 defaults linkify-it-py 2.0.2 pypi_0 pypi lit 15.0.7 pypi_0 pypi loralib 0.1.1 pypi_0 pypi markdown 3.4.3 pypi_0 pypi markdown-it-py 2.2.0 pypi_0 pypi markdown2 2.4.8 pypi_0 pypi markupsafe 2.1.2 pypi_0 pypi matplotlib 3.7.1 pypi_0 pypi matplotlib-inline 0.1.6 pypi_0 pypi mdit-py-plugins 0.3.3 pypi_0 pypi mdtex2html 1.2.0 pypi_0 pypi mdurl 0.1.2 pypi_0 pypi mistune 2.0.5 pypi_0 pypi mpmath 1.2.1 pypi_0 pypi multidict 6.0.4 pypi_0 pypi multiprocess 0.70.14 pypi_0 pypi nbclassic 1.0.0 pypi_0 pypi nbclient 0.7.4 pypi_0 pypi nbconvert 7.4.0 pypi_0 pypi nbformat 5.8.0 pypi_0 pypi ncurses 6.4 h6a678d5_0 defaults nest-asyncio 1.5.6 pypi_0 pypi networkx 3.0 pypi_0 pypi ninja 1.11.1 pypi_0 pypi nltk 3.8.1 pypi_0 pypi notebook 6.5.4 pypi_0 pypi notebook-shim 0.2.3 pypi_0 pypi numpy 1.24.1 pypi_0 pypi openpyxl 3.1.2 pypi_0 pypi openssl 1.1.1t h7f8727e_0 defaults orjson 3.8.12 pypi_0 pypi packaging 23.1 pypi_0 pypi pandas 2.0.1 pypi_0 pypi pandocfilters 1.5.0 pypi_0 pypi parso 0.8.3 pypi_0 pypi pathtools 0.1.2 pypi_0 pypi peft 0.3.0 pypi_0 
pypi pexpect 4.8.0 pypi_0 pypi pickleshare 0.7.5 pypi_0 pypi pillow 9.3.0 pypi_0 pypi pip 23.0.1 py310h06a4308_0 defaults platformdirs 3.5.1 pypi_0 pypi prometheus-client 0.16.0 pypi_0 pypi prompt-toolkit 3.0.38 pypi_0 pypi protobuf 4.23.0 pypi_0 pypi psutil 5.9.5 pypi_0 pypi ptyprocess 0.7.0 pypi_0 pypi pure-eval 0.2.2 pypi_0 pypi py-cpuinfo 9.0.0 pypi_0 pypi pyarrow 12.0.1 pypi_0 pypi pycparser 2.21 pypi_0 pypi pydantic 1.10.7 pypi_0 pypi pydub 0.25.1 pypi_0 pypi pygments 2.15.1 pypi_0 pypi pyparsing 3.0.9 pypi_0 pypi pyrsistent 0.19.3 pypi_0 pypi pysocks 1.7.1 pypi_0 pypi python 3.10.11 h7a1cb2a_2 defaults python-dateutil 2.8.2 pypi_0 pypi python-json-logger 2.0.7 pypi_0 pypi python-multipart 0.0.6 pypi_0 pypi pytz 2023.3 pypi_0 pypi pyyaml 6.0 pypi_0 pypi pyzmq 25.0.2 pypi_0 pypi qtconsole 5.4.3 pypi_0 pypi qtpy 2.3.1 pypi_0 pypi readline 8.2 h5eee18b_0 defaults regex 2023.5.5 pypi_0 pypi requests 2.28.1 pypi_0 pypi responses 0.18.0 pypi_0 pypi rfc3339-validator 0.1.4 pypi_0 pypi rfc3986-validator 0.1.1 pypi_0 pypi rich 13.4.2 pypi_0 pypi rouge-chinese 1.0.3 pypi_0 pypi safetensors 0.3.1 pypi_0 pypi scikit-learn 1.2.2 pypi_0 pypi scipy 1.10.1 pypi_0 pypi semantic-version 2.10.0 pypi_0 pypi send2trash 1.8.2 pypi_0 pypi sentencepiece 0.1.99 pypi_0 pypi sentry-sdk 1.25.1 pypi_0 pypi setproctitle 1.3.2 pypi_0 pypi setuptools 67.8.0 py310h06a4308_0 defaults six 1.16.0 pypi_0 pypi smmap 5.0.0 pypi_0 pypi sniffio 1.3.0 pypi_0 pypi soupsieve 2.4.1 pypi_0 pypi sqlite 3.41.2 h5eee18b_0 defaults stack-data 0.6.2 pypi_0 pypi starlette 0.26.1 pypi_0 pypi svgwrite 1.4.3 pypi_0 pypi sympy 1.11.1 pypi_0 pypi terminado 0.17.1 pypi_0 pypi threadpoolctl 3.1.0 pypi_0 pypi tinycss2 1.2.1 pypi_0 pypi tk 8.6.12 h1ccaba5_0 defaults tokenizers 0.13.3 pypi_0 pypi toolz 0.12.0 pypi_0 pypi torch 2.0.1+cu118 pypi_0 pypi torchaudio 2.0.2+cu118 pypi_0 pypi torchvision 0.15.2+cu118 pypi_0 pypi tornado 6.3.1 pypi_0 pypi tqdm 4.65.0 pypi_0 pypi traitlets 5.9.0 pypi_0 pypi transformers 4.30.2 pypi_0 pypi triton 2.0.0 pypi_0 pypi typing-extensions 4.4.0 pypi_0 pypi tzdata 2023.3 pypi_0 pypi uc-micro-py 1.0.2 pypi_0 pypi uri-template 1.2.0 pypi_0 pypi urllib3 1.26.13 pypi_0 pypi uvicorn 0.22.0 pypi_0 pypi wandb 0.15.4 pypi_0 pypi wavedrom 2.0.3.post3 pypi_0 pypi wcwidth 0.2.6 pypi_0 pypi webcolors 1.13 pypi_0 pypi webencodings 0.5.1 pypi_0 pypi websocket-client 1.5.1 pypi_0 pypi websockets 11.0.3 pypi_0 pypi wheel 0.38.4 py310h06a4308_0 defaults widgetsnbextension 4.0.7 pypi_0 pypi xxhash 3.2.0 pypi_0 pypi xz 5.4.2 h5eee18b_0 defaults yarl 1.9.2 pypi_0 pypi zlib 1.2.13 h5eee18b_0 defaults

Environment

- OS: Ubuntu 18.04
- Python: 3.10
- Transformers: 4.30.2
- PyTorch: 2.0.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response

LaniakeaS commented 1 year ago

same issue

shiyanlou-015555 commented 1 year ago

Please use transformers<=4.29 (for example, 4.28.1) and it will work.
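For example, in a pip-managed environment (4.28.1 is just the version reported to work in this thread, not an official pin):

    pip install "transformers==4.28.1"

Then re-run bash ds_train_finetune.sh.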

LaniakeaS commented 1 year ago

> Please use transformers<=4.29 (for example, 4.28.1) and it will work.

Yes, I found that you are right, that fixed it. Besides downgrading, the pin in requirements.txt should also be changed.
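Something like this, assuming you pin it in the repo's requirements.txt (again, 4.28.1 is only the version that worked here):

    # requirements.txt: replace the existing transformers line with an older pin
    transformers==4.28.1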

shiyanlou-015555 commented 1 year ago

Haha, we are all learners! I think the newer transformers releases have many bugs with these scripts; please stick with an older version!