transformers-4.42.3 cannot run the basic demo; 4.41.1 runs the basic demo normally

System Info

accelerate 0.32.1
aiofiles 23.2.1
annotated-types 0.7.0
anyio 3.7.1
asyncer 0.0.2
av 12.2.0
bidict 0.23.1
bitsandbytes 0.43.1
carla 0.9.12
certifi 2024.7.4
chainlit 1.1.306
charset-normalizer 3.3.2
chevron 0.14.0
click 8.1.7
dataclasses-json 0.5.14
decord 0.6.0
Deprecated 1.2.14
distro 1.9.0
einops 0.8.0
exceptiongroup 1.2.1
fastapi 0.110.3
filelock 3.15.4
filetype 1.2.0
fsspec 2024.6.1
fvcore 0.1.5.post20221221
googleapis-common-protos 1.63.2
grpcio 1.64.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.23.4
idna 3.7
importlib_metadata 7.1.0
iopath 0.1.10
Jinja2 3.1.4
Lazify 0.4.0
literalai 0.0.607
loguru 0.7.2
MarkupSafe 2.1.5
marshmallow 3.21.3
mpmath 1.3.0
mypy-extensions 1.0.0
nest-asyncio 1.6.0
networkx 3.3
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.82
nvidia-nvtx-cu12 12.1.105
openai 1.35.10
opentelemetry-api 1.25.0
opentelemetry-exporter-otlp 1.25.0
opentelemetry-exporter-otlp-proto-common 1.25.0
opentelemetry-exporter-otlp-proto-grpc 1.25.0
opentelemetry-exporter-otlp-proto-http 1.25.0
opentelemetry-instrumentation 0.46b0
opentelemetry-proto 1.25.0
opentelemetry-sdk 1.25.0
opentelemetry-semantic-conventions 0.46b0
packaging 23.2
parameterized 0.9.0
pillow 10.4.0
pip 24.0
portalocker 2.10.0
protobuf 4.25.3
psutil 6.0.0
pydantic 2.8.2
pydantic_core 2.20.1
PyJWT 2.8.0
python-dotenv 1.0.1
python-engineio 4.9.1
python-multipart 0.0.9
python-socketio 5.11.3
pytorchvideo 0.1.5
PyYAML 6.0.1
regex 2024.5.15
requests 2.32.3
safetensors 0.4.3
setuptools 69.5.1
simple-websocket 1.0.0
sniffio 1.3.1
sse-starlette 2.1.2
starlette 0.37.2
sympy 1.12.1
syncer 2.0.3
tabulate 0.9.0
termcolor 2.4.0
timm 1.0.7
tokenizers 0.19.1
tomli 2.0.1
torch 2.3.1+cu121
torchaudio 2.3.1
torchvision 0.18.1
tqdm 4.66.4
transformers 4.42.3
triton 2.3.1
typing_extensions 4.12.2
typing-inspect 0.9.0
uptrace 1.24.0
urllib3 2.2.2
uvicorn 0.25.0
watchfiles 0.20.0
wheel 0.43.0
wrapt 1.16.0
wsproto 1.2.0
xformers 0.0.27
yacs 0.1.8
zipp 3.19.2

(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/basic_demo git:(main) ✗ pip uninstall transformers
Found existing installation: transformers 4.42.3
Uninstalling transformers-4.42.3:
  Would remove:
    /home/sumail/anaconda3/envs/CogVLM2-video/bin/transformers-cli
    /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/transformers-4.42.3.dist-info/
    /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/transformers/
Proceed (Y/n)? y
  Successfully uninstalled transformers-4.42.3

(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/basic_demo git:(main) ✗ pip install transformers==4.41.1
Collecting transformers==4.41.1
  Using cached transformers-4.41.1-py3-none-any.whl.metadata (43 kB)
Requirement already satisfied: filelock in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (3.15.4)
Requirement already satisfied: huggingface-hub<1.0,>=0.23.0 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (0.23.4)
Requirement already satisfied: numpy>=1.17 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (1.26.4)
Requirement already satisfied: packaging>=20.0 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (23.2)
Requirement already satisfied: pyyaml>=5.1 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (2024.5.15)
Requirement already satisfied: requests in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (2.32.3)
Requirement already satisfied: tokenizers<0.20,>=0.19 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (0.19.1)
Requirement already satisfied: safetensors>=0.4.1 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (0.4.3)
Requirement already satisfied: tqdm>=4.27 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from transformers==4.41.1) (4.66.4)
Requirement already satisfied: fsspec>=2023.5.0 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from huggingface-hub<1.0,>=0.23.0->transformers==4.41.1) (2024.6.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from huggingface-hub<1.0,>=0.23.0->transformers==4.41.1) (4.12.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from requests->transformers==4.41.1) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from requests->transformers==4.41.1) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from requests->transformers==4.41.1) (2.2.2)
Requirement already satisfied: certifi>=2017.4.17 in /home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages (from requests->transformers==4.41.1) (2024.7.4)
Using cached transformers-4.41.1-py3-none-any.whl (9.1 MB)
Installing collected packages: transformers
Successfully installed transformers-4.41.1

(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/basic_demo git:(main) ✗ CUDA_VISIBLE_DEVICES=0 python cli_demo.py --quant 4
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:04<00:00, 1.96it/s]
image path >>>>> /home/sumail/图片/截图/12.png
Human:describe this picture

CogVLM2: This image appears to be a table with information about different models, likely in the context of machine learning or artificial intelligence. Each row represents a different model, and the columns contain various metrics related to the model's performance. Here's a step-by-step description of the image:

MODEL: This column lists the names of the models. For example, we have "gpt-4", "gpt-4-1106-preview', "gpt-4-vision-preview', etc.
RPM: This column likely stands for "Request Per Minute". It indicates the number of requests or queries that the model can handle per minute. For instance, "gpt-4" has an RPM of 500, meaning it can handle 500 requests per minute.
RPD: This column stands for "Response Per Day". It shows the number of responses the model can generate per day. For "gpt-4", the RPD is 10,000, which means it can generate 10,000 responses in a day.
TPM: This column is likely "Transactions Per Minute". It indicates the number of transactions or operations the model can handle per minute. For "gpt-4", the TPM is 10,000, which means it can handle 10,000 transactions per minute.
TPD: This column is "Transactions Per Day". It shows the number of transactions the model can handle per day. For "gpt-4", the TPD is 500,000, which means it can handle 500,000 transactions per day.
The models listed include variants of the GPT-4 language model, which is a type of AI developed by OpenAI. There are also other models like "whisper-1", "tts-1", "tts-1-hd", "dall-e-2", and "dall-e-3", which seem to be different types of AI models, possibly for tasks such as speech synthesis (tts stands for "text-to-speech"), image processing (dall-e stands for "DALL-E", which is likely a model for image recognition), and other tasks.

The image is presented in a tabular format, which is a common way to organize and present data in a structured manner. The exact context or application of these models is not provided in the image, but it is clear that they are meant for high-throughput tasks, likely serving in areas such as customer service chatbots, voice assistants, image processing, and other AI-driven applications where large volumes of data need to be processed quickly.

H:^CTraceback (most recent call last):
  File "/home/sumail/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/basic_demo/cli_demo.py", line 71, in <module>
    query = input("Human:")
KeyboardInterrupt

(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/basic_demo git:(main) ✗ ..
(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2 git:(main) ✗ cd video_demo
(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/video_demo git:(main) ✗ CUDA_VISIBLE_DEVICES=0 python cli_demo.py --quant 4
python: can't open file '/home/sumail/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/video_demo/cli_demo.py': [Errno 2] No such file or directory
(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/video_demo git:(main) ✗ CUDA_VISIBLE_DEVICES=0 python cli_video_demo.py --quant 4
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/torchvision/transforms/_functional_video.py:6: UserWarning: The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms.functional' module instead.
  warnings.warn(
/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/torchvision/transforms/_transforms_video.py:22: UserWarning: The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms' module instead.
  warnings.warn(
Traceback (most recent call last):
  File "/home/sumail/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/video_demo/cli_video_demo.py", line 71, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 550, in from_pretrained
    model_class = get_class_from_dynamic_module(
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 510, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module)
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 208, in get_class_in_module
    module_spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/sumail/.cache/huggingface/modules/transformers_modules/modeling_cogvlm.py", line 24, in <module>
    from pytorchvideo.transforms import ApplyTransformToKey, ShortSideScale
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/pytorchvideo/transforms/__init__.py", line 3, in <module>
    from .augmix import AugMix  # noqa
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/pytorchvideo/transforms/augmix.py", line 6, in <module>
    from pytorchvideo.transforms.augmentations import (
  File "/home/sumail/anaconda3/envs/CogVLM2-video/lib/python3.10/site-packages/pytorchvideo/transforms/augmentations.py", line 9, in <module>
    import torchvision.transforms.functional_tensor as F_t
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'

(CogVLM2-video) ➜ ~/Workspace/LargeLanguageModelProject/LLMprojects/CogVLM2-video/CogVLM2/video_demo git:(main) ✗ pip list
Package Version
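
The last traceback above ends in pytorchvideo importing `torchvision.transforms.functional_tensor`, which Python cannot find in this environment (torchvision 0.18.1). The following is a minimal sketch for checking whether that module is present without launching the video demo. It uses only the standard library, and `module_available` is a hypothetical helper name, not part of torchvision, pytorchvideo, or transformers.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    Note: find_spec imports the parent packages of `name` in order to
    search them, so a missing parent raises ModuleNotFoundError, which
    is treated here as "not available".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


if __name__ == "__main__":
    # Module names taken from the warnings and traceback in this report.
    for mod in (
        "torchvision.transforms.functional",
        "torchvision.transforms.functional_tensor",
    ):
        status = "found" if module_available(mod) else "missing"
        print(f"{mod}: {status}")
```

If the second module is reported as missing, the failure comes from the pytorchvideo and torchvision combination rather than from the transformers version; this sketch only detects the condition and does not attempt a fix.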

Who can help?

No response

Information

[X] The official example scripts
[ ] My own modified scripts

Reproduction

1. cd basic_demo
2. CUDA_VISIBLE_DEVICES=0 python cli_demo.py --quant 4

Expected behavior

The basic demo does not run with transformers-4.42.3, but runs normally with 4.41.1.
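
As a rough aid for anyone reproducing this, the sketch below reports the installed transformers version and flags it when it is newer than the one that worked in the log above. The two version constants come from this report only and are not an official compatibility range. The helper names (`parse_version`, `installed_version`, `newer_than_known_working`) are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple

# Taken from this report: 4.41.1 ran the basic demo, 4.42.3 did not.
KNOWN_WORKING = "4.41.1"
KNOWN_FAILING = "4.42.3"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.42.3' into (4, 42, 3).

    A local suffix such as '+cu121' is dropped, and parsing stops at the
    first component that is not purely numeric (for example '46b0').
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def newer_than_known_working(version: str) -> bool:
    """True if `version` is newer than the version that worked here."""
    return parse_version(version) > parse_version(KNOWN_WORKING)


if __name__ == "__main__":
    found = installed_version("transformers")
    if found is None:
        print("transformers is not installed")
    elif newer_than_known_working(found):
        print(f"transformers {found} is newer than {KNOWN_WORKING}; "
              f"the basic demo failed on {KNOWN_FAILING} in this report")
    else:
        print(f"transformers {found} is not newer than {KNOWN_WORKING}")
```

The check only compares version numbers; it does not identify which change between the two releases breaks the demo.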

THUDM / CogVLM2