> @forestlet This is because of the `force_upcast` of the VAE. You need to execute the following code before `load_pipe`:
>
> ```python
> if pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast:
>     pipe.upcast_vae()
> ```
>
> We will integrate this behavior into the `load_pipe` function in PR-734.
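For pipelines that do not expose `upcast_vae()`, the same check can fall back to a manual cast. A minimal sketch (the helper name `maybe_upcast_vae` is hypothetical, combining the two snippets from this thread):

```python
import torch

def maybe_upcast_vae(pipe):
    """Hypothetical helper: upcast the VAE to float32 before load_pipe.

    Uses upcast_vae() when the pipeline provides it, and falls back to a
    manual cast for pipelines (e.g. SVD) that do not.
    """
    if pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast:
        if hasattr(pipe, "upcast_vae"):
            pipe.upcast_vae()  # available on e.g. SDXL pipelines
        else:
            pipe.vae.to(dtype=torch.float32)  # manual fallback
```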
Thanks! However... 😢 I tried SVD and found there is no `upcast_vae()` for the SVD pipe. So I checked `onediff_diffusers_extensions/onediffx/deep_cache/pipeline_stable_video_diffusion.py` and tried:

```python
if pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast:
    pipe.vae.to(dtype=torch.float32)
load_pipe(pipe, dir="cached_pipe")
```

And I got this:
```
/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
libibverbs not available, ibv_fork_init skipped
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 9.09it/s]
[ERROR](GRAPH:OneflowGraph_3:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> InferDataType Failed. Expected kFloat16, but got kFloat
  File "oneflow/core/job/job_interpreter.cpp", line 312, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 224, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 603, in InferPhysicalTensorDesc
    dtype_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/group_norm_op.cpp", line 85, in InferDataType
    CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
Error Type: oneflow.ErrorProto.check_failed_error
Traceback (most recent call last):
  File "/home/ubuntu/filmacton/video_gen/load_compiled_pipe.py", line 18, in <module>
    load_pipe(pipe, dir="cached_pipe")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediffx/compilers/diffusion_pipeline_compiler.py", line 100, in load_pipe
    obj.load_graph(os.path.join(dir, part))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/with_oneflow_compile.py", line 322, in load_graph
    self.get_graph().load_graph(file_path, device, run_warmup)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/utils/cost_util.py", line 48, in clocked
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/with_oneflow_compile.py", line 349, in load_graph
    self.load_runtime_state_dict(state_dict, warmup_with_run=run_warmup)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1188, in load_runtime_state_dict
    return self._dynamic_input_graph_cache.load_runtime_state_dict(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/cache.py", line 242, in load_runtime_state_dict
    graph.load_runtime_state_dict(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1348, in load_runtime_state_dict
    self.__run(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1865, in __run
    _eager_outputs = oneflow._oneflow_internal.nn.graph.RunLazyNNGraphByVM(
oneflow._oneflow_internal.exception.Exception: InferDataType Failed. Expected kFloat16, but got kFloat
  File "oneflow/core/job/job_interpreter.cpp", line 312, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 224, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 603, in InferPhysicalTensorDesc
    dtype_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/group_norm_op.cpp", line 85, in InferDataType
    CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
Error Type: oneflow.ErrorProto.check_failed_error
```
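The failed check, `CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())`, compares the GroupNorm weight dtype against the input dtype. Assuming the graph in `cached_pipe` was saved while the VAE was still float16, its GroupNorm parameters remain float16, so the input from the upcast float32 VAE no longer matches. A minimal sketch of the mismatch in plain PyTorch:

```python
import torch

# The cached graph was (presumably) traced with a float16 VAE, so its
# GroupNorm parameters (gamma/beta) were captured as float16.
gn = torch.nn.GroupNorm(num_groups=2, num_channels=4).half()

# After pipe.vae.to(dtype=torch.float32), activations arrive as float32.
x = torch.randn(1, 4, 8, dtype=torch.float32)

# This is the invariant that group_norm_op.cpp enforces:
#   CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
print(gn.weight.dtype, x.dtype)  # torch.float16 vs torch.float32 -> mismatch
```

This suggests the VAE dtype at load time has to agree with the dtype the graph was compiled and saved with.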
Originally posted by @forestlet in https://github.com/siliconflow/onediff/issues/717#issuecomment-2002016351