hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible
https://www.colossalai.org
Apache License 2.0
38.7k stars · 4.34k forks

[BUG]: TypeError: _gen_python_code() got an unexpected keyword argument 'verbose' #5673

Open Xingzhi107 opened 5 months ago

Xingzhi107 commented 5 months ago

🐛 Describe the bug

When I run `examples/language/gpt/experiments/auto_parallel/auto_parallel_with_gpt.py`, the following error occurs:

  File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 265, in initialize_model
    gm = ColoGraphModule(model, graph, model.__class__.__name__)
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 110, in __init__
    super().__init__(root, graph, class_name)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 385, in __init__
    self.graph = graph
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1754, in __setattr__
    super().__setattr__(name, value)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 426, in graph
    self.recompile()
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 141, in recompile
    python_code = self._graph.python_code(root_module="self", verbose=True)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1268, in python_code
    return self._python_code(root_module, namespace, verbose=verbose)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1271, in _python_code
    return self._codegen._gen_python_code(self.nodes, root_module, namespace, verbose=verbose)
TypeError: _gen_python_code() got an unexpected keyword argument 'verbose'

Environment

torch==2.1.0 colossalai==0.3.6

Edenzzzz commented 5 months ago

Thanks for your issue. Could you try pulling the most recent repo? I fixed this last week.

Xingzhi107 commented 5 months ago

Thanks for the answer. I pulled colossalai==0.3.7; with torch==2.2.1, the following error occurs:

File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
    rst_to_unpack = initialize_model(
  File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 265, in initialize_model
    gm = ColoGraphModule(model, graph, model.__class__.__name__)
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 110, in __init__
    super().__init__(root, graph, class_name)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 428, in __init__
    self.graph = graph
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1747, in __setattr__
    super().__setattr__(name, value)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph_module.py", line 472, in graph
    self.recompile()
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/graph_module.py", line 141, in recompile
    python_code = self._graph.python_code(root_module="self")
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1328, in python_code
    return self._python_code(root_module, namespace, verbose=verbose)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/graph.py", line 1331, in _python_code
    return self._codegen._gen_python_code(self.nodes, root_module, namespace, verbose=verbose)
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/codegen.py", line 472, in _gen_python_code
    return PythonCode(fn_code, globals_)
TypeError: __init__() missing 1 required positional argument: '_lineno_map'
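This failure is an API change: in this torch version the `PythonCode` dataclass requires a `_lineno_map` argument, while the custom codegen still constructs it with two. A version-agnostic construction could follow the pattern below. This is a sketch using stand-in dataclasses that mimic the two `PythonCode` shapes; `make_python_code` is a hypothetical helper, not ColossalAI API:

```python
import inspect
from dataclasses import dataclass
from typing import Optional

@dataclass
class PythonCodeOld:
    # Stand-in for the older torch.fx.graph.PythonCode shape.
    src: str
    globals: dict

@dataclass
class PythonCodeNew:
    # Stand-in for the newer shape with a required _lineno_map field.
    src: str
    globals: dict
    _lineno_map: Optional[dict]

def make_python_code(cls, src, globals_):
    """Construct a PythonCode-like object whether or not the class
    requires the _lineno_map field (version-compatibility pattern)."""
    if "_lineno_map" in inspect.signature(cls).parameters:
        return cls(src, globals_, _lineno_map=None)
    return cls(src, globals_)

old = make_python_code(PythonCodeOld, "def forward(self): pass", {})
new = make_python_code(PythonCodeNew, "def forward(self): pass", {})
```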

With torch==2.1.1, the following error occurs:

  File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 355, in autoparallelize
    rst_to_unpack = initialize_model(
  File "/opt/conda/lib/python3.9/site-packages/colossalai/auto_parallel/tensor_shard/initialize.py", line 267, in initialize_model
    shape_prop_pass(gm, *meta_args.values())
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 269, in shape_prop_pass
    ShapeProp(module).propagate(*args, device=_current_device(module))
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 253, in propagate
    return super().run(*tree_map(wrap_fn, args))
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 138, in run
    self.env[node] = self.run_node(node)
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/fx/passes/shape_prop.py", line 116, in run_node
    r = getattr(self, n.op)(n.target, args, kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/fx/interpreter.py", line 312, in call_module
    return submod(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2202, in embedding
    return handle_torch_function(
  File "/opt/conda/lib/python3.9/site-packages/torch/overrides.py", line 1577, in handle_torch_function
    result = torch_func_method(public_api, types, args, kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/_tensor.py", line 1386, in __torch_function__
    ret = func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/nn/functional.py", line 2233, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 113, in __torch_dispatch__
    ret = func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/torch/_ops.py", line 448, in __call__
    return self._op(*args, **kwargs or {})
  File "/opt/conda/lib/python3.9/site-packages/torch/_decomp/decompositions.py", line 1141, in embedding
    return weight[indices]
  File "/opt/conda/lib/python3.9/site-packages/torch/_meta_registrations.py", line 2790, in meta_index_Tensor
    return self.new_empty(before_shape + replacement_shape + after_shape)
  File "/opt/conda/lib/python3.9/site-packages/torch/_refs/__init__.py", line 4483, in new_empty
    return torch.empty(
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 188, in _new
    return MetaTensor(
  File "/opt/conda/lib/python3.9/site-packages/colossalai/_analyzer/_subclasses/meta_tensor.py", line 60, in __new__
    r = torch.Tensor._make_wrapper_subclass(
RuntimeError: !check_has_torch_dispatch(obj) INTERNAL ASSERT FAILED at "../torch/csrc/autograd/python_variable.cpp":1934, please report a bug to PyTorch. While HermeticPyObject was enabled, we attempted to create a tensor subclass with __torch_dispatch__.  This violates the invariant that operations in HermeticPyObject have equivalent C++ implementations. If your operator registered from Python operator registration isn't doing anything strange, there may be an internal PyTorch bug involving not appropriately disabling TorchDispatchMode before executing Python op registration.

While executing %transformer_wte : [num_users=1] = call_module[target=transformer.wte](args = (%view,), kwargs = {})
Original traceback:
None
wxthu commented 2 months ago


I also met this issue ... Could you give some suggestions? @Edenzzzz