nod-ai / SHARK-Turbine

Unified compiler/runtime for interfacing with PyTorch Dynamo.
Apache License 2.0

pytorch removed support for constraints in recent nightlies #532

Open dan-garvey opened 4 months ago

dan-garvey commented 4 months ago

see https://github.com/pytorch/pytorch/commit/342e7929b804ec56121e82e92d6a199b549c38b1.

(As I discovered when I hit this error: `TypeError: forward() got an unexpected keyword argument 'constraints'`)

It's possible the replacement Dim-based specs referenced in this commit have existed for a while, so we may be able to migrate without breaking anything or having to condition support on the PyTorch version.

@stellaraccident fyi

I might tinker with this a bit as a distraction unless you have some other vision

bailuan commented 4 months ago

So downgrading the pytorch version is the only way at this time?

stellaraccident commented 4 months ago

Correct: this should only impact users of nightly pytorch for the moment. It will take us 1-2 weeks to adapt. The dynamo export APIs have been changing rapidly over the last months, and this is the cost of being on the bleeding edge.
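One way to adapt without dropping older releases is to gate on the installed torch version. A minimal sketch; the 2.3 cutoff is an assumption based on the nightly commit linked above, not a confirmed boundary, and `supports_constraints` is a hypothetical helper name:

```python
def parse_version(v: str) -> tuple:
    # Normalize strings like "2.1.0a0+git7bcf7da" or "2.3.0.dev20240315"
    # down to a comparable (major, minor, patch) tuple.
    core = v.split("+")[0]
    for sep in ("a", "b", "rc", ".dev"):
        core = core.split(sep)[0]
    return tuple(int(p) for p in core.strip(".").split(".") if p)


def supports_constraints(torch_version: str) -> bool:
    # Assumed cutoff: builds at/after 2.3 dropped the `constraints=` kwarg.
    return parse_version(torch_version) < (2, 3)
```

Call sites would then pick the `constraints=` path or the Dim-based path based on `supports_constraints(torch.__version__)`.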

bailuan commented 4 months ago

Could you please point me to a stable torch version for testing turbine? I can't run sd_test.py to completion under the official versions (2.2, 2.1).

stellaraccident commented 4 months ago

Official versions are the only things we maintain tight control over. Nightlies break frequently and need updates. If I recall, this break happened within the last two weeks.

bailuan commented 4 months ago

OK, got it. So which version of pytorch are you using? v2.2.1, v2.2.0, v2.1.2, or another?

stellaraccident commented 4 months ago

The CI is presently pinned to: https://github.com/nod-ai/SHARK-Turbine/blob/main/core/pytorch-cpu-requirements.txt

We have various newer versions in play for different projects. We were just working on getting that upgraded to 2.2 when we got distracted by this nightly break.

bailuan commented 4 months ago

Thanks for your kindness, I am going to try. ^^

stellaraccident commented 4 months ago

Yw.

I hesitate to give you something even more experimental, but here is a test that exercises the very newest permutation of the torch.export APIs: https://github.com/nod-ai/SHARK-Turbine/blob/main/core/tests/aot/fx_programs_test.py

It is untested in CI right now, but it's what I was developing to support the next version on very recent 2.3 nightlies. It doesn't have all features yet; you are welcome to try it if it works for you, but I can't promise not to change it as we finish support for 2.3.

bailuan commented 4 months ago

Hi, an error occurs when I try to run testExportVaeModelDecode in sd_test.py. The log looks like this:

```
ERROR: testExportVaeModelDecode (__main__.StableDiffusionTest)

Traceback (most recent call last):
  File "/workspace/bailuan/20240315_sd/SHARK-Turbine/models/turbine_models/tests/sd_test.py", line 139, in testExportVaeModelDecode
    vae.export_vae_model(
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/turbine_models-0.9.5.dev1-py3.10.egg/turbine_models/custom_models/sd_inference/vae.py", line 112, in export_vae_model
    inst = CompiledVae(context=Context(), import_to=import_to)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 552, in __new__
    do_export(proc_def)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 549, in do_export
    trace.trace_py_func(invoke_with_self)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/tracer.py", line 121, in trace_py_func
    return_py_value = _unproxy(py_f(*self.proxy_posargs, **self.proxy_kwargs))
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 530, in invoke_with_self
    return proc_def.callable(self, *args, **kwargs)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/turbine_models-0.9.5.dev1-py3.10.egg/turbine_models/custom_models/sd_inference/vae.py", line 107, in main
    return jittable(vae_model.decode_inp)(inp)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/base.py", line 137, in __call__
    return current_ir_trace().handle_call(self, args, kwargs)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/tracer.py", line 137, in handle_call
    return target.resolve_call(self, *args, **kwargs)
  File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/builtins/jittable.py", line 220, in resolve_call
    gm, guards = exported_f(*flat_pytorch_args)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 1210, in inner
    graph = make_fx(
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 809, in wrapped
    t = dispatch_trace(wrap_key(func, args, fx_tracer, pre_dispatch), tracer=fx_tracer, concrete_args=tuple(phs))
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_compile.py", line 24, in inner
    return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 328, in _fn
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 468, in dispatch_trace
    graph = tracer.trace(root, concrete_args)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 328, in _fn
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/_symbolic_trace.py", line 817, in trace
    (self.create_arg(fn(*args)),),
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 485, in wrapped
    out = f(*tensors)
  File "<string>", line 1, in <module>
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 1204, in graph_with_interpreter
    return torch.fx.Interpreter(graph).run(*args)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/interpreter.py", line 138, in run
    self.env[node] = self.run_node(node)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/interpreter.py", line 195, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/interpreter.py", line 267, in call_function
    return target(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_ops.py", line 448, in __call__
    return self._op(*args, **kwargs or {})
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 555, in __torch_dispatch__
    return self.inner_torch_dispatch(func, types, args, kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 580, in inner_torch_dispatch
    return proxy_call(self, func, self.pre_dispatch, args, kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 262, in proxy_call
    r = CURRENT_DECOMPOSITION_TABLE[func](*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_refs/__init__.py", line 3033, in native_group_norm
    out = out.view(input.shape)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_stats.py", line 20, in wrapper
    return fn(*args, **kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 555, in __torch_dispatch__
    return self.inner_torch_dispatch(func, types, args, kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 580, in inner_torch_dispatch
    return proxy_call(self, func, self.pre_dispatch, args, kwargs)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 304, in proxy_call
    proxy_args, proxy_kwargs = pytree.tree_map_only(
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_pytree.py", line 353, in tree_map_only
    return tree_map(map_only(ty)(fn), pytree)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_pytree.py", line 283, in tree_map
    return tree_unflatten([fn(i) for i in flat_args], spec)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_pytree.py", line 283, in <listcomp>
    return tree_unflatten([fn(i) for i in flat_args], spec)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/utils/_pytree.py", line 334, in inner
    return f(x)
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 234, in inner
    return get_proxy_slot(n, tracer)()
  File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 110, in get_proxy_slot
    raise RuntimeError(f"{obj} is not tracked with proxy for {tracer}")
RuntimeError: 2*s1 is not tracked with proxy for <torch.fx.experimental.proxy_tensor.PythonKeyTracer object at 0x7fce737d77c0>

While executing %native_group_norm_default_11 : [num_users=1] = call_function[target=torch.ops.aten.native_group_norm.default](args = (%lself__tensor_constant3, %lself__param_constant56, %lself__param_constant57, 1, 512, 16384, 32, 1e-06), kwargs = {})

Original traceback:
  File ".0 from /workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py:477 in wrapped", line 239, in forward
    native_group_norm_11 = torch.ops.aten.native_group_norm.default(_tensor_constant3, _param_constant56, _param_constant57, 1, 512, mul_17, 32, 1e-06);  _tensor_constant3 = _param_constant56 = _param_constant57 = mul_17 = None
```

Can you reproduce it?

shark-turbine version: f1c3d165db0dd4648f459dbf576e046e5b2555b4
pytorch version: 2.1.0a0+git7bcf7da

dan-garvey commented 4 months ago

For SD, I believe the team working on it is using a 2.3 nightly from probably late February. @monorimet may be able to comment.

As Stella mentioned, this is the bleeding part of the bleeding edge.

bailuan commented 4 months ago

Thanks. @monorimet, could you provide some advice? Did you encounter this error in your torch environment?