Closed. 20184490 closed this issue 5 months ago.
Hi, it seems that you need to upgrade your CUDA to 12.x. Hope this helps!
Thanks! But my CUDA tops out at 11.8, so I may not be able to update to 12.x. Does vllm==0.4.0 work?
As I remember, the Qwen series uses the same QwenForCausalLM class, so vLLM 0.4.0 should work. No harm in trying!
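If you do try that, a minimal install sketch for a CUDA 11.8 environment could look like the following. The wheel filename pattern and the extra index URL follow vLLM's published CUDA 11.8 install instructions, but the exact filename for your Python version is an assumption; verify it on the v0.4.0 release page before installing.

```shell
# Sketch: install a CUDA 11.8 build of vLLM 0.4.0 in a CUDA 11.8 environment.
# The wheel name below is an assumption -- check the assets on
# https://github.com/vllm-project/vllm/releases/tag/v0.4.0 for your Python tag.
export VLLM_VERSION=0.4.0
export PYTHON_VERSION=310
pip install "https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux1_x86_64.whl" \
    --extra-index-url https://download.pytorch.org/whl/cu118
```

The `--extra-index-url` matters: without it, pip pulls the default cu12x build of PyTorch, which reproduces the driver-mismatch error below.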
```
(magpie) [hadoop-hmart-peisongpa@set-zw04-kubernetes-pc137 magpie-main]$ pip list
Package Version
absl-py 2.1.0 accelerate 0.31.0 aiohttp 3.9.5 aiosignal 1.3.1 annotated-types 0.7.0 anthropic 0.28.1 anyio 4.4.0 asttokens 2.4.1 async-timeout 4.0.3 attrs 23.2.0 autoawq 0.2.5 autoawq_kernels 0.0.6 bitsandbytes 0.42.0 boto3 1.34.129 botocore 1.34.129 cachetools 5.3.3 certifi 2024.6.2 charset-normalizer 3.3.2 click 8.1.7 cloudpickle 3.0.0 cmake 3.29.5.1 comm 0.2.2 contextlib2 21.6.0 contourpy 1.2.1 cycler 0.12.1 datasets 2.20.0 debugpy 1.8.1 decorator 5.1.1 dill 0.3.8 diskcache 5.6.3 distro 1.9.0 dnspython 2.6.1 docker-pycreds 0.4.0 docstring_parser 0.16 email_validator 2.1.2 exceptiongroup 1.2.1 executing 2.0.1 faiss-gpu 1.7.2 fastapi 0.111.0 fastapi-cli 0.0.4 fastchat 0.1.0 filelock 3.15.1 fonttools 4.53.0 frozenlist 1.4.1 fsspec 2024.5.0 gitdb 4.0.11 GitPython 3.1.43 google-ai-generativelanguage 0.6.5 google-api-core 2.19.0 google-api-python-client 2.133.0 google-auth 2.30.0 google-auth-httplib2 0.2.0 google-generativeai 0.7.0 googleapis-common-protos 1.63.1 grpcio 1.64.1 grpcio-status 1.62.2 h11 0.14.0 httpcore 1.0.5 httplib2 0.22.0 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.23.4 idna 3.7 interegular 0.3.3 ipykernel 6.29.4 ipython 8.25.0 ipywidgets 8.1.3 jedi 0.19.1 Jinja2 3.1.4 jiter 0.4.2 jmespath 1.0.1 joblib 1.4.2 jsonschema 4.22.0 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyterlab_widgets 3.0.11 kiwisolver 1.4.5 lark 1.1.9 lit 18.1.7 llvmlite 0.43.0 lm-format-enforcer 0.9.8 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.9.0 matplotlib-inline 0.1.7 mdurl 0.1.2 ml_collections 0.1.1 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16 nest-asyncio 1.6.0 networkx 3.3 ninja 1.11.1.1 numba 0.60.0 numpy 1.26.4
nvidia-cublas-cu11 11.10.3.66 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu11 11.7.101 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu11 11.7.99 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu11 11.7.99 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu11 8.5.0.96 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu11 10.9.0.58 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu11 10.2.10.91 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu11 11.4.0.1 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu11 11.7.4.91 nvidia-cusparse-cu12 12.1.0.106 nvidia-ml-py 12.555.43 nvidia-nccl-cu11 2.14.3 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.5.40 nvidia-nvtx-cu11 11.7.91 nvidia-nvtx-cu12 12.1.105
openai 1.34.0 orjson 3.10.5 outlines 0.0.34 packaging 24.1 pandas 2.2.2 parso 0.8.4 peft 0.11.1 pexpect 4.9.0 pillow 10.3.0 pip 24.0 platformdirs 4.2.2 prometheus_client 0.20.0 prometheus-fastapi-instrumentator 7.0.0 prompt_toolkit 3.0.47 proto-plus 1.23.0 protobuf 4.25.3 psutil 6.0.0 ptyprocess 0.7.0 pure-eval 0.2.2 py-cpuinfo 9.0.0 pyairports 2.1.1 pyarrow 16.1.0 pyarrow-hotfix 0.6 pyasn1 0.6.0 pyasn1_modules 0.4.0 pycountry 24.6.1 pydantic 2.7.4 pydantic_core 2.18.4 Pygments 2.18.0 pynvml 11.5.0 pyparsing 3.1.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-multipart 0.0.9 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.3 ray 2.9.0 referencing 0.35.1 regex 2024.5.15 requests 2.32.3 rich 13.7.1 rpds-py 0.18.1 rsa 4.9 s3transfer 0.10.1 safetensors 0.4.3 scikit-learn 1.5.0 scipy 1.13.1 sentence-transformers 3.0.1 sentencepiece 0.2.0 sentry-sdk 2.5.1 setproctitle 1.3.3 setuptools 70.0.0 shellingham 1.5.4 shtab 1.7.1 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 stack-data 0.6.3 starlette 0.37.2 sympy 1.12.1 tenacity 8.4.1 threadpoolctl 3.5.0 tiktoken 0.6.0 tokenizers 0.19.1 torch 2.3.0 torchaudio 2.3.0 torchvision 0.18.0 tornado 6.4.1 tqdm 4.66.4 traitlets 5.14.3 transformers 4.41.2 triton 2.3.0 trl 0.9.4 typer 0.12.3 typing_extensions 4.12.2 tyro 0.8.4 tzdata 2024.1 ujson 5.10.0 uritemplate 4.1.1 urllib3 2.2.2 uvicorn 0.30.1 uvloop 0.19.0 vllm 0.4.2 vllm-flash-attn 2.5.9 vllm_nccl_cu12 2.18.1.0.4.0 wandb 0.17.2 watchfiles 0.22.0 wcwidth 0.2.13 websockets 12.0 wheel 0.43.0 widgetsnbextension 4.0.11 xformers 0.0.26.post1 xxhash 3.4.1 yarl 1.9.4 zstandard 0.22.0
```
```
(magpie) [hadoop-hmart-peisongpa@set-zw04-kubernetes-pc137 scripts]$ bash magpie-qwen2-7b.sh
[magpie.sh] Model Name: /mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct
[magpie.sh] Pretty name: Qwen2-7B-Instruct_topp1_temp1_1718955059
[magpie.sh] Total Prompts: 1000
[magpie.sh] Instruction Generation Config: temp=1, top_p=1
[magpie.sh] Response Generation Config: temp=0, top_p=1, rep=1
[magpie.sh] System Config: device=0, n=200, batch_size=200, tensor_parallel=1
[magpie.sh] Timestamp: 1718955059
[magpie.sh] Job Name: Qwen2-7B-Instruct_topp1_temp1_1718955059
[magpie.sh] Start Generating Instructions...
Instruction Generation Manager. Arguments: Namespace(model_path='/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct', temperature=1.0, top_p=1.0, n=200, repeat=None, total_prompts=1000, max_tokens=2048, max_model_len=4096, early_stopping=True, have_system_prompt=False, shuffle=True, skip_special_tokens=True, checkpoint_every=100, device='0', dtype='bfloat16', tensor_parallel_size=1, gpu_memory_utilization=0.95, swap_space=2.0, output_folder='../data', job_name='Qwen2-7B-Instruct_topp1_temp1_1718955059', timestamp=1718955059, verbose=False, seed=None)
INFO 06-21 15:31:05 llm_engine.py:100] Initializing an LLM engine (v0.4.2) with config: model='/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct', speculative_config=None, tokenizer='/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=4096, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), seed=1718955059, served_model_name=/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO 06-21 15:31:06 utils.py:660] Found nccl from library /home/hadoop-hmart-peisongpa/.config/vllm/nccl/cu12/libnccl.so.2.18.1
Traceback (most recent call last):
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/hanxintong/banma_llm_base_model/storage/magpie-main/scripts/../exp/gen_ins.py", line 84, in <module>
    llm = LLM(model=args.model_path,
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 123, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 292, in from_engine_args
    engine = cls(
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 160, in __init__
    self.model_executor = executor_class(
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 41, in __init__
    self._init_executor()
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/executor/gpu_executor.py", line 23, in _init_executor
    self._init_non_spec_worker()
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/executor/gpu_executor.py", line 67, in _init_non_spec_worker
    self.driver_worker = self._create_worker()
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/executor/gpu_executor.py", line 59, in _create_worker
    wrapper.init_worker(*self._get_worker_kwargs(local_rank, rank,
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/worker/worker_base.py", line 131, in init_worker
    self.worker = worker_class(*args, **kwargs)
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/worker/worker.py", line 73, in __init__
    self.model_runner = ModelRunner(
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 145, in __init__
    self.attn_backend = get_attn_backend(
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/attention/selector.py", line 25, in get_attn_backend
    backend = _which_attn_to_use(dtype)
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/vllm/attention/selector.py", line 67, in _which_attn_to_use
    if torch.cuda.get_device_capability()[0] < 8:
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/torch/cuda/__init__.py", line 430, in get_device_capability
    prop = get_device_properties(device)
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/torch/cuda/__init__.py", line 444, in get_device_properties
    _lazy_init()  # will define _get_device_properties
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/tanyunfei/conda/envs/magpie/lib/python3.10/site-packages/torch/cuda/__init__.py", line 293, in _lazy_init
    torch._C._cuda_init()
RuntimeError: The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.
```
```
[magpie.sh] Finish Generating Instructions!
[magpie.sh] Start Generating Responses...
Response Generation Manager. Arguments: Namespace(model_path='/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct', input_file='../data/Qwen2-7B-Instruct_topp1_temp1_1718955059/Magpie_Qwen2-7B-Instruct_1000_1718955059_ins.json', batch_size=200, checkpoint_every=20, api=False, api_url='https://api.together.xyz/v1/chat/completions', api_key=None, device='0', dtype='bfloat16', tensor_parallel_size=1, gpu_memory_utilization=0.95, max_tokens=4096, max_model_len=4096, temperature=0.0, top_p=1.0, repetition_penalty=1.0)
Traceback (most recent call last):
  File "/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/hanxintong/banma_llm_base_model/storage/magpie-main/scripts/../exp/gen_res.py", line 59, in <module>
    model_config = model_configs[args.model_path]
KeyError: '/mnt/dolphinfs/hdd_pool/docker/user/hadoop-hmart-peisongpa/lijiguo/data/models/Qwen2-7B-Instruct'
```
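This second failure is separate from the CUDA issue: gen_res.py looks the model up in a model_configs dict keyed by model name, and the local checkpoint path is not one of its keys. A hedged sketch of a fallback lookup (the `lookup_config` helper and the config contents are hypothetical illustrations, not Magpie's actual code):

```python
import os

# Hypothetical stand-in for the model_configs table in gen_res.py.
model_configs = {
    "Qwen2-7B-Instruct": {"stop_tokens": ["<|im_end|>"]},
}

def lookup_config(configs: dict, model_path: str) -> dict:
    """Try the path verbatim first, then fall back to its basename so a
    local checkpoint directory resolves like a hub-style model name."""
    if model_path in configs:
        return configs[model_path]
    key = os.path.basename(os.path.normpath(model_path))
    if key in configs:
        return configs[key]
    raise KeyError(f"No config for {model_path!r}; add an entry keyed {key!r}")

# A local path whose last component matches a known model name resolves:
cfg = lookup_config(model_configs, "/data/models/Qwen2-7B-Instruct")
print(cfg)  # {'stop_tokens': ['<|im_end|>']}
```

Alternatively, adding the full local path as a key to the script's model_configs dict avoids touching the lookup code at all.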