hpcaitech / EnergonAI

Large-scale model inference.
Apache License 2.0

An error caused by running the OPT example #206

Open LemonSqi opened 1 year ago

LemonSqi commented 1 year ago

(pytorch) root@USER-20211001RA:~/EnergonAI-main/examples/opt# python opt_fastapi.py opt-125m --checkpoint ./restored.pt
/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/torch/library.py:130: UserWarning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::index.Tensor(Tensor self, Tensor?[] indices) -> Tensor
    registered at /opt/conda/conda-bld/pytorch_1670525539683/work/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: Meta
  previous kernel: registered at /opt/conda/conda-bld/pytorch_1670525539683/work/aten/src/ATen/functorch/BatchRulesScatterOps.cpp:1053
  new kernel: registered at /dev/null:228 (Triggered internally at /opt/conda/conda-bld/pytorch_1670525539683/work/aten/src/ATen/core/dispatch/OperatorEntry.cpp:150.)
  self.m.impl(name, dispatch_key, fn)
Traceback (most recent call last):
  File "/root/EnergonAI-main/examples/opt/opt_fastapi.py", line 7, in <module>
    from energonai import QueueFullError, launch_engine
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/energonai-0.0.1+torch1.13cu11.7-py3.9-linux-x86_64.egg/energonai/__init__.py", line 2, in <module>
    from .engine import launch_engine, SubmitEntry, QueueFullError
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/energonai-0.0.1+torch1.13cu11.7-py3.9-linux-x86_64.egg/energonai/engine.py", line 9, in <module>
    from colossalai.logging import get_dist_logger
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/__init__.py", line 1, in <module>
    from .initialize import (
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/initialize.py", line 18, in <module>
    from colossalai.amp import AMP_TYPE, convert_to_amp
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/amp/__init__.py", line 9, in <module>
    from .torch_amp import convert_to_torch_amp
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/amp/torch_amp/__init__.py", line 9, in <module>
    from .torch_amp import TorchAMPLoss, TorchAMPModel, TorchAMPOptimizer
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/amp/torch_amp/torch_amp.py", line 10, in <module>
    from colossalai.nn.optimizer import ColossalaiOptimizer
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/__init__.py", line 1, in <module>
    from ._ops import *
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/_ops/__init__.py", line 1, in <module>
    from .addmm import colo_addmm
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/_ops/addmm.py", line 5, in <module>
    from ._utils import GeneralTensor, Number, convert_to_colo_tensor
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/_ops/_utils.py", line 8, in <module>
    from colossalai.nn.layer.utils import divide
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/layer/__init__.py", line 7, in <module>
    from .moe import *
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/layer/moe/__init__.py", line 1, in <module>
    from .experts import Experts, FFNExperts, TPExperts
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/nn/layer/moe/experts.py", line 8, in <module>
    from colossalai.zero.init_ctx import no_shard_zero_decrator
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/zero/__init__.py", line 7, in <module>
    from colossalai.zero.sharded_model.sharded_model_v2 import ShardedModelV2
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/zero/sharded_model/__init__.py", line 1, in <module>
    from .sharded_model_v2 import ShardedModelV2
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/zero/sharded_model/sharded_model_v2.py", line 15, in <module>
    from colossalai.gemini.memory_tracer import MemStatsCollector, StaticMemStatsCollector
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/gemini/__init__.py", line 1, in <module>
    from .chunk import ChunkManager, TensorInfo, TensorState, search_chunk_configuration
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/gemini/chunk/__init__.py", line 3, in <module>
    from .search_utils import classify_params_by_dp_degree, search_chunk_configuration
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/gemini/chunk/search_utils.py", line 8, in <module>
    from colossalai.gemini.memory_tracer import MemStats, OrderedParamGenerator
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/gemini/memory_tracer/__init__.py", line 6, in <module>
    from .static_memstats_collector import StaticMemStatsCollector # isort:skip
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/gemini/memory_tracer/static_memstats_collector.py", line 7, in <module>
    from colossalai.fx.passes.meta_info_prop import MetaInfoProp
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/__init__.py", line 4, in <module>
    from .tracer import ColoTracer, meta_trace, symbolic_trace
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/__init__.py", line 4, in <module>
    from ._symbolic_trace import symbolic_trace
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/_symbolic_trace.py", line 8, in <module>
    from .tracer import ColoTracer
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/tracer.py", line 23, in <module>
    from .bias_addition_patch import func_to_func_dict, method_to_func_dict, module_to_func_dict
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/bias_addition_patch/__init__.py", line 1, in <module>
    from .patched_bias_addition_function import *
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/__init__.py", line 1, in <module>
    from .addbmm import Addbmm
  File "/usr/local/anaconda3/envs/pytorch/lib/python3.9/site-packages/colossalai-0.2.5-py3.9.egg/colossalai/fx/tracer/bias_addition_patch/patched_bias_addition_function/addbmm.py", line 7, in <module>
    from .bias_addition_function import LinearBasedBiasFunc
ModuleNotFoundError: No module named 'colossalai.fx.tracer.bias_addition_patch.patched_bias_addition_function.bias_addition_function'

Python 3.9.13

ver217 commented 1 year ago

Hi, could you reinstall the latest colossalai?

LemonSqi commented 1 year ago

I reinstalled the latest colossalai (0.2.7), but it still fails. Details:

(base) [aigc@phy-22-124 opt]$ python opt_fastapi.py opt-125m --checkpoint /home/aigc/sqwang/model/restored.pt
Traceback (most recent call last):
  File "/home/aigc/sqwang/proj/EnergonAI-main/examples/opt/opt_fastapi.py", line 7, in <module>
    from energonai import QueueFullError, launch_engine
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/energonai-0.0.1+torch2.0cu11.7-py3.9-linux-x86_64.egg/energonai/__init__.py", line 2, in <module>
    from .engine import launch_engine, SubmitEntry, QueueFullError
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/energonai-0.0.1+torch2.0cu11.7-py3.9-linux-x86_64.egg/energonai/engine.py", line 9, in <module>
    from colossalai.logging import get_dist_logger
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/__init__.py", line 1, in <module>
    from .initialize import (
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/initialize.py", line 18, in <module>
    from colossalai.amp import AMP_TYPE, convert_to_amp
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/amp/__init__.py", line 9, in <module>
    from .torch_amp import convert_to_torch_amp
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/amp/torch_amp/__init__.py", line 9, in <module>
    from .torch_amp import TorchAMPLoss, TorchAMPModel, TorchAMPOptimizer
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/amp/torch_amp/torch_amp.py", line 10, in <module>
    from colossalai.nn.optimizer import ColossalaiOptimizer
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/__init__.py", line 1, in <module>
    from ._ops import *
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/_ops/__init__.py", line 1, in <module>
    from .addmm import colo_addmm
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/_ops/addmm.py", line 5, in <module>
    from ._utils import GeneralTensor, Number, convert_to_colo_tensor
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/_ops/_utils.py", line 8, in <module>
    from colossalai.nn.layer.utils import divide
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/layer/__init__.py", line 7, in <module>
    from .moe import *
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/layer/moe/__init__.py", line 1, in <module>
    from .experts import Experts, FFNExperts, TPExperts
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/nn/layer/moe/experts.py", line 8, in <module>
    from colossalai.zero.init_ctx import no_shard_zero_decrator
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/zero/__init__.py", line 7, in <module>
    from colossalai.zero.sharded_model.sharded_model_v2 import ShardedModelV2
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/zero/sharded_model/__init__.py", line 1, in <module>
    from .sharded_model_v2 import ShardedModelV2
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/zero/sharded_model/sharded_model_v2.py", line 16, in <module>
    from colossalai.gemini.memory_tracer import MemStatsCollector, StaticMemStatsCollector
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/gemini/__init__.py", line 1, in <module>
    from .chunk import ChunkManager, TensorInfo, TensorState, search_chunk_configuration
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/gemini/chunk/__init__.py", line 3, in <module>
    from .search_utils import classify_params_by_dp_degree, search_chunk_configuration
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/gemini/chunk/search_utils.py", line 8, in <module>
    from colossalai.gemini.memory_tracer import MemStats, OrderedParamGenerator
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/gemini/memory_tracer/__init__.py", line 6, in <module>
    from .static_memstats_collector import StaticMemStatsCollector # isort:skip
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/gemini/memory_tracer/static_memstats_collector.py", line 7, in <module>
    from colossalai.fx.passes.meta_info_prop import MetaInfoProp
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/fx/__init__.py", line 3, in <module>
    from .passes import MetaInfoProp, metainfo_trace
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/fx/passes/__init__.py", line 2, in <module>
    from .concrete_info_prop import ConcreteInfoProp
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/fx/passes/concrete_info_prop.py", line 10, in <module>
    from colossalai.fx.profiler import GraphInfo, profile_function, profile_method, profile_module
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/fx/profiler/__init__.py", line 4, in <module>
    from .opcount import flop_mapping
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/colossalai-0.2.7-py3.9.egg/colossalai/fx/profiler/opcount.py", line 273, in <module>
    aten.upsample_nearest2d_backward.vec: elementwise_flop_counter(0, 1),
  File "/home/aigc/anaconda3/lib/python3.9/site-packages/torch/_ops.py", line 488, in __getattr__
    raise AttributeError(
AttributeError: The underlying op of 'aten.upsample_nearest2d_backward' has no overload name 'vec'

Mado007 commented 1 year ago

There is a conflict between two different versions of a function called index.Tensor in PyTorch, and it seems there is also a problem with the addmm function in ColossalAI. Try updating PyTorch to the latest version and then reinstalling EnergonAI.
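
Before and after reinstalling, it may help to confirm which versions are actually installed. Because importing colossalai itself fails here, the sketch below reads the installed package metadata instead of importing anything. It assumes the distributions are registered under the names torch, colossalai and energonai (taken from the tracebacks above); the helper name installed_versions is made up for illustration.

```python
from importlib import metadata

# Distribution names as they appear in the tracebacks; adjust if your
# environment registers them under different names (assumption).
PACKAGES = ("torch", "colossalai", "energonai")


def installed_versions(names=PACKAGES):
    """Return {name: version string or None} without importing the packages."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not visible to this interpreter.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter that launches opt_fastapi.py, so that it reports on the environment where the import error occurs.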

LemonSqi commented 1 year ago

The same error occurred: e3f770c086f88c869c52f1dc7b77e94.jpg