Open Xingzhi107 opened 5 months ago
Thanks for your issue. Could you try pulling the most recent repo? I fixed this last week.
Thanks for the answer. I pulled colossalai==0.3.7. With torch==2.2.1, the following error occurs:
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
rst_to_unpack = initialize_model(
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 265, in initialize_model
gm = ColoGraphModule(model, graph, model.__class__.__name__)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 110, in __init__
super().__init__(root, graph, class_name)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 428, in __init__
self.graph = graph
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in __setattr__
super().__setattr__(name, value)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 472, in graph
self.recompile()
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 141, in recompile
python_code = self._graph.python_code(root_module="self")
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1328, in python_code
return self._python_code(root_module, namespace, verbose=verbose)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1331, in _python_code
return self._codegen._gen_python_code(self.nodes, root_module, namespace, verbose=verbose)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/codegen.py", line 472, in _gen_python_code
return PythonCode(fn_code, globals_)
TypeError: __init__() missing 1 required positional argument: '_lineno_map'
With torch==2.1.1, the following error occurs:
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
rst_to_unpack = initialize_model(
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 267, in initialize_model
shape_prop_pass(gm, *meta_args.values())
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 269, in shape_prop_pass
ShapeProp(module).propagate(*args, device=_current_device(module))
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 253, in propagate
return super().run(*tree_map(wrap_fn, args))
File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 138, in run
self.env[node] = self.run_node(node)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 116, in run_node
r = getattr(self, n.op)(n.target, args, kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 312, in call_module
return submod(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 162, in forward
return F.embedding(
File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2202, in embedding
return handle_torch_function(
File "/opt/conda/lib/python3.9/site-packages/torch/overrides.py", line 1577, in handle_torch_function
result = torch_func_method(public_api, types, args, kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/_tensor.py", line 1386, in __torch_function__
ret = func(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2233, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 113, in __torch_dispatch__
ret = func(*args, **kwargs)
File "/opt/conda/lib/python3.9/site-packages/torch/_ops.py", line 448, in __call__
return self._op(*args, **kwargs or {})
File "/opt/conda/lib/python3.9/site-packages/torch/_decomp/decompositions.py", line 1141, in embedding
return weight[indices]
File "/opt/conda/lib/python3.9/site-packages/torch/_meta_registrations.py", line 2790, in meta_index_Tensor
return self.new_empty(before_shape + replacement_shape + after_shape)
File "/opt/conda/lib/python3.9/site-packages/torch/_refs/__init__.py", line 4483, in new_empty
return torch.empty(
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 188, in _new
return MetaTensor(
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 60, in __new__
r = torch.Tensor._make_wrapper_subclass(
RuntimeError: !check_has_torch_dispatch(obj) INTERNAL ASSERT FAILED at "../torch/csrc/autograd/python_variable.cpp":1934, please report a bug to PyTorch. While HermeticPyObject was enabled, we attempted to create a tensor subclass with __torch_dispatch__. This violates the invariant that operations in HermeticPyObject have equivalent C++ implementations. If your operator registered from Python operator registration isn't doing anything strange, there may be an internal PyTorch bug involving not appropriately disabling TorchDispatchMode before executing Python op registration.
While executing %transformer_wte : [num_users=1] = call_module[target=transformer.wte](args = (%view,), kwargs = {})
Original traceback:
None
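The torch==2.2.1 traceback above ends in `PythonCode(fn_code, globals_)` failing because a third field, `_lineno_map`, is required. A minimal sketch of one way a codegen could build the object on both versions, assuming `PythonCode` is a dataclass whose field list differs between releases. `OldPythonCode`, `NewPythonCode`, and `make_python_code` are hypothetical stand-ins for illustration, not colossalai or torch code, and passing `None` for the line map is an assumption:

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class OldPythonCode:
    # Hypothetical stand-in for a PythonCode with two fields.
    src: str
    globals: Dict[str, Any]


@dataclass
class NewPythonCode:
    # Hypothetical stand-in for a PythonCode that also requires _lineno_map.
    src: str
    globals: Dict[str, Any]
    _lineno_map: Optional[Dict[int, Optional[int]]]


def make_python_code(code_cls, fn_code, globals_):
    """Build code_cls, passing _lineno_map only if the class declares it."""
    field_names = {f.name for f in fields(code_cls)}
    kwargs = {"src": fn_code, "globals": globals_}
    if "_lineno_map" in field_names:
        # Assumption: no line mapping is available, so pass None.
        kwargs["_lineno_map"] = None
    return code_cls(**kwargs)


old = make_python_code(OldPythonCode, "def forward(self): pass", {})
new = make_python_code(NewPythonCode, "def forward(self): pass", {})
```

In a real patch `code_cls` would be the `PythonCode` class imported from `torch.fx.graph`. Whether `None` is an acceptable line map for downstream consumers would need checking against the installed torch version.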
I also ran into this issue. Could you give some suggestions? @Edenzzzz
🐛 Describe the bug
When I run examples/language/gpt/experiments/auto_parallel/auto_parallel_with_gpt.py, the error is as follows:
File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 265, in initialize_model
gm = ColoGraphModule(model, graph, model.__class__.__name__)
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 110, in __init__
super().__init__(root, graph, class_name)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 385, in __init__
self.graph = graph
File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1754, in __setattr__
super().__setattr__(name, value)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 426, in graph
self.recompile()
File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 141, in recompile
python_code = self._graph.python_code(root_module="self",verbose=True)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1268, in python_code
return self._python_code(root_module, namespace, verbose=verbose)
File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1271, in _python_code
return self._codegen._gen_python_code(self.nodes, root_module, namespace, verbose=verbose)
TypeError: _gen_python_code() got an unexpected keyword argument 'verbose'
Environment
torch==2.1.0 colossalai==0.3.6
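The `unexpected keyword argument 'verbose'` error is what Python raises when a caller passes a keyword that an overriding method does not declare. A minimal sketch of that mismatch and one tolerant signature, using hypothetical stand-in classes (`BaseCodeGen`, `LegacyCodeGen`, `TolerantCodeGen`) rather than the real torch or colossalai code:

```python
class BaseCodeGen:
    # Stand-in for a base class whose caller passes verbose=...
    def _gen_python_code(self, nodes, root_module, namespace, *, verbose=False):
        return f"base(verbose={verbose})"


class LegacyCodeGen(BaseCodeGen):
    # Override written against an older signature without verbose.
    def _gen_python_code(self, nodes, root_module, namespace):
        return "legacy"


class TolerantCodeGen(BaseCodeGen):
    # Override that accepts and ignores keywords it does not know about.
    def _gen_python_code(self, nodes, root_module, namespace, **kwargs):
        return "tolerant"


def python_code(codegen, verbose=False):
    # Mimics the caller in the traceback: it always forwards verbose.
    return codegen._gen_python_code([], "self", None, verbose=verbose)


try:
    python_code(LegacyCodeGen(), verbose=True)
except TypeError as exc:
    legacy_error = str(exc)  # mentions the unexpected keyword 'verbose'

tolerant_result = python_code(TolerantCodeGen(), verbose=True)
```

Accepting `**kwargs` keeps the override working if the caller adds further keywords, at the cost of silently ignoring them. Declaring `verbose` explicitly would be stricter.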