siliconflow / onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.
https://github.com/siliconflow/onediff/wiki
Apache License 2.0
1.4k stars 85 forks

no upcast_vae() for SVD pipe #978

Open strint opened 4 days ago

strint commented 4 days ago
          > @forestlet This is because of the force_upcast of the VAE. You need to execute the following code before load_pipe:
if pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast:
    pipe.upcast_vae()

We will integrate this behavior into the load_pipe function in PR-734.
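The guard above boils down to a simple predicate. A minimal sketch (illustrative plain Python, not the actual diffusers API, with dtype names as strings standing in for torch dtypes):

```python
# Hypothetical helper, not part of onediff or diffusers: upcasting is only
# needed when the VAE weights are fp16 AND the config requests force_upcast.
def needs_vae_upcast(vae_dtype: str, force_upcast: bool) -> bool:
    # Mirrors: pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast
    return vae_dtype == "float16" and force_upcast


# An fp32 VAE never needs upcasting, regardless of the config flag.
print(needs_vae_upcast("float16", True))   # True
print(needs_vae_upcast("float32", True))   # False
print(needs_vae_upcast("float16", False))  # False
```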

Thanks! However... 😢 I tried SVD and found there is no upcast_vae() for the SVD pipe. So I checked the file onediff_diffusers_extensions/onediffx/deep_cache/pipeline_stable_video_diffusion.py and tried:

if pipe.vae.dtype == torch.float16 and pipe.vae.config.force_upcast:
    pipe.vae.to(dtype=torch.float32)

load_pipe(pipe, dir="cached_pipe")

And I got this:

/home/ubuntu/.local/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
libibverbs not available, ibv_fork_init skipped
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████| 5/5 [00:00<00:00,  9.09it/s]
[ERROR](GRAPH:OneflowGraph_3:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> InferDataType Failed. Expected kFloat16, but got kFloat
  File "oneflow/core/job/job_interpreter.cpp", line 312, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 224, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 603, in InferPhysicalTensorDesc
    dtype_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/group_norm_op.cpp", line 85, in InferDataType
    CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
Error Type: oneflow.ErrorProto.check_failed_error
Traceback (most recent call last):
  File "/home/ubuntu/filmacton/video_gen/load_compiled_pipe.py", line 18, in <module>
    load_pipe(pipe, dir="cached_pipe")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediffx/compilers/diffusion_pipeline_compiler.py", line 100, in load_pipe
    obj.load_graph(os.path.join(dir, part))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/with_oneflow_compile.py", line 322, in load_graph
    self.get_graph().load_graph(file_path, device, run_warmup)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/utils/cost_util.py", line 48, in clocked
    return func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/onediff/infer_compiler/with_oneflow_compile.py", line 349, in load_graph
    self.load_runtime_state_dict(state_dict, warmup_with_run=run_warmup)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1188, in load_runtime_state_dict
    return self._dynamic_input_graph_cache.load_runtime_state_dict(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/cache.py", line 242, in load_runtime_state_dict
    graph.load_runtime_state_dict(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1348, in load_runtime_state_dict
    self.__run(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1865, in __run
    _eager_outputs = oneflow._oneflow_internal.nn.graph.RunLazyNNGraphByVM(
oneflow._oneflow_internal.exception.Exception: InferDataType Failed. Expected kFloat16, but got kFloat
  File "oneflow/core/job/job_interpreter.cpp", line 312, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 224, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 603, in InferPhysicalTensorDesc
    dtype_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/group_norm_op.cpp", line 85, in InferDataType
    CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
Error Type: oneflow.ErrorProto.check_failed_error
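The check that fails here is group_norm's dtype inference: the cached graph stored its GroupNorm weights (gamma) in fp16, while the manually upcast VAE now feeds fp32 activations. A minimal sketch of that consistency check (illustrative plain Python, not oneflow's actual C++ implementation; dtype names are strings standing in for oneflow's kFloat16/kFloat):

```python
# Illustrative sketch of the check in oneflow/user/ops/group_norm_op.cpp:
# CHECK_EQ_OR_RETURN(gamma.data_type(), x.data_type())
# gamma's dtype (from the compiled graph) must equal the input's dtype.
def infer_group_norm_dtype(gamma_dtype: str, x_dtype: str) -> str:
    if gamma_dtype != x_dtype:
        raise TypeError(
            f"InferDataType Failed. Expected {gamma_dtype}, but got {x_dtype}"
        )
    return x_dtype


# Matching dtypes pass; an fp32 input against fp16 graph weights reproduces
# the "Expected kFloat16, but got kFloat" failure seen in the log above.
print(infer_group_norm_dtype("kFloat16", "kFloat16"))  # kFloat16
```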

Originally posted by @forestlet in https://github.com/siliconflow/onediff/issues/717#issuecomment-2002016351

strint commented 4 days ago

@forestlet does this problem still exist?