dan-garvey opened this issue 4 months ago (Open)
So lowering the pytorch version is the only way at this time?
Correct: this should only impact users of nightly pytorch for the moment. It will take us 1-2 weeks to adapt. The dynamo export APIs have been changing rapidly over the last months, and this is the cost of being on the bleeding edge.
Could you please provide me a stable torch version to test turbine? I can't run over the sd_test.py completely under official version (2.2, 2.1).
Official releases are the only things we maintain tight control over. Nightlies break frequently and need updates. If I recall correctly, this break happened within the last two weeks.
OK, got it. So which version of pytorch are you using? v2.2.1, v2.2.0, v2.1.2, or another?
The CI is presently pinned to: https://github.com/nod-ai/SHARK-Turbine/blob/main/core/pytorch-cpu-requirements.txt
We have various newer versions in play for different projects. We were just working on getting that upgraded to 2.2 when we got distracted by this nightly break.
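For context, a pinned requirements file of this kind is usually just exact version constraints plus the matching wheel index; the contents below are an illustrative sketch, not the actual contents of the linked file:

```text
# pytorch-cpu-requirements.txt (illustrative sketch, not the real pin)
--index-url https://download.pytorch.org/whl/cpu
torch==2.1.2
```

Installing with `pip install -r pytorch-cpu-requirements.txt` before the project's other requirements keeps the resolver from pulling in a newer, untested torch.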
Thanks for your kindness, I am going to try it. ^^
Yw.
I hesitate to give you something even more experimental, but here is a test that exercises the very newest permutation of the torch.export APIs: https://github.com/nod-ai/SHARK-Turbine/blob/main/core/tests/aot/fx_programs_test.py
It is untested in CI right now, but it is what I was developing to support the next version on very recent 2.3 nightlies. It doesn't have all features yet. If it works for you, you are welcome to use it, but I can't promise not to change it as we finish support for 2.3.
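Given how fast these APIs are churning, one way to keep a script working across versions is to feature-detect which generation of the export API is installed rather than hard-coding a version check. This is a hypothetical sketch (the generation names are mine, not Turbine's); it only probes for the Dim symbol that the newer dynamic-shape specs introduced:

```python
import importlib


def export_api_generation() -> str:
    """Best-effort guess at which torch.export era is installed.

    Returns "unavailable" when torch (or torch.export) is missing,
    "dim_specs" when the newer Dim-based dynamic-shape API exists,
    and "constraints" for the older constraints= keyword era.
    """
    try:
        export_mod = importlib.import_module("torch.export")
    except ImportError:
        return "unavailable"
    if hasattr(export_mod, "Dim"):
        return "dim_specs"
    return "constraints"
```

A caller can then branch on the result instead of crashing on a missing or renamed keyword argument.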
Traceback (most recent call last):
File "/workspace/bailuan/20240315_sd/SHARK-Turbine/models/turbine_models/tests/sd_test.py", line 139, in testExportVaeModelDecode
vae.export_vae_model(
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/turbine_models-0.9.5.dev1-py3.10.egg/turbine_models/custom_models/sd_inference/vae.py", line 112, in export_vae_model
inst = CompiledVae(context=Context(), import_to=import_to)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 552, in new
do_export(proc_def)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 549, in do_export
trace.trace_py_func(invoke_with_self)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/tracer.py", line 121, in trace_py_func
return_py_value = _unproxy(py_f(*self.proxy_posargs, self.proxy_kwargs))
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/compiled_module.py", line 530, in invoke_with_self
return proc_def.callable(self, *args, **kwargs)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/turbine_models-0.9.5.dev1-py3.10.egg/turbine_models/custom_models/sd_inference/vae.py", line 107, in main
return jittable(vae_model.decode_inp)(inp)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/base.py", line 137, in call
return current_ir_trace().handle_call(self, args, kwargs)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/support/procedural/tracer.py", line 137, in handle_call
return target.resolve_call(self, args, kwargs)
File "/workspace/bailuan/20240105_iree_chiptech/sd_shark/lib/python3.10/site-packages/shark_turbine-0.9.5.dev1-py3.10.egg/shark_turbine/aot/builtins/jittable.py", line 220, in resolve_call
gm, guards = exported_f(*flat_pytorch_args)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 1210, in inner
graph = make_fx(
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 809, in wrapped
t = dispatch_trace(wrap_key(func, args, fx_tracer, pre_dispatch), tracer=fx_tracer, concrete_args=tuple(phs))
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_compile.py", line 24, in inner
return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 468, in dispatch_trace
graph = tracer.trace(root, concrete_args)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/_symbolic_trace.py", line 817, in trace
(self.create_arg(fn(*args)),),
File "/workspace/bailuan/20231221_pytorch_install/pytorch/torch/fx/experimental/proxy_tensor.py", line 485, in wrapped
out = f(*tensors)
File "
While executing %native_group_norm_default_11 : [num_users=1] = call_function[target=torch.ops.aten.native_group_norm.default](args = (%lself__tensor_constant3, %lself__param_constant56, %lself__param_constant57, 1, 512, 16384, 32, 1e-06), kwargs = {})
Original traceback:
File "
Can you reproduce it? shark-turbine version: f1c3d165db0dd4648f459dbf576e046e5b2555b4, pytorch version: 2.1.0a0+git7bcf7da
For SD, I believe the team working on it is using a 2.3 nightly from probably late February. @monorimet may be able to comment.
As Stella mentioned, this is the bleeding part of the bleeding edge.
Thanks, could @monorimet provide me some advice? Did you encounter this error in your torch environment?
See https://github.com/pytorch/pytorch/commit/342e7929b804ec56121e82e92d6a199b549c38b1 (as I discovered when I hit this error: TypeError: forward() got an unexpected keyword argument 'constraints'). It's possible the replacement Dim-based specs this commit references have existed for a while, and we may be able to migrate without bricking anything or conditioning support on the pytorch version. @stellaraccident fyi
I might tinker with this a bit as a distraction unless you have some other vision.
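The fallback mentioned above, conditioning support on the pytorch version, means comparing strings like 2.1.0a0+git7bcf7da numerically. A minimal stdlib-only sketch; the 2.2 cutoff below is an assumption for illustration, not a verified boundary:

```python
def parse_torch_version(v: str) -> tuple:
    """Extract the numeric dotted prefix: '2.1.0a0+git7bcf7da' -> (2, 1, 0)."""
    parts = []
    for piece in v.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at the first non-digit, e.g. the 'a0' suffix
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_dim_based_export(v: str) -> bool:
    # Assumed cutoff: treat 2.2+ as having the Dim-based dynamic_shapes API.
    return parse_torch_version(v) >= (2, 2)
```

Tuple comparison makes the gating robust to local-build suffixes, which `torch.__version__` carries on nightlies and source builds.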