DeepCache error and recompilation in ComfyUI

I ran some tests based on issue https://github.com/siliconflow/onediff/issues/778.

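1. Using OneDiff's ModuleDeepCacheSpeedup node, I observed the following:

Start ComfyUI, then:
First generation at 720 x 720 (succeeds) -> second generation at 728 x 720 (error, then recompilation)
First generation at 544 x 544 (succeeds) -> second generation at 576 x 544 (error, then recompilation)
First generation at 512 x 512 (succeeds) -> second generation at 544 x 512 (error, then recompilation)
First generation at 512 x 512 (succeeds) -> second generation at 576 x 512 (succeeds)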

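Observation: OneDiff's DeepCache keeps working across a resolution change only when the dimensions of the first generated image are multiples of 64 and the dimensions of every later image are also multiples of 64.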
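
One possible reading of the multiple-of-64 pattern, sketched below under stated assumptions: a Stable Diffusion style pipeline where the VAE shrinks each side by 8 and the UNet halves the latent three times with rounding up. The helper names and the default values `vae_factor=8` and `num_downsamples=3` are assumptions for illustration and are not taken from the OneDiff code. Under those assumptions, a side length that is not a multiple of 64 produces an odd intermediate size, so an upsampled feature map no longer matches its skip connection unless it is resized at runtime.

```python
def latent_sizes(pixels, vae_factor=8, num_downsamples=3):
    """Side length of the feature map at each UNet level (assumed layout)."""
    size = pixels // vae_factor
    sizes = [size]
    for _ in range(num_downsamples):
        size = (size + 1) // 2  # stride-2 downsample, rounding up
        sizes.append(size)
    return sizes


def upsample_matches_skip(pixels, vae_factor=8, num_downsamples=3):
    """True if doubling each level reproduces the level above it exactly."""
    sizes = latent_sizes(pixels, vae_factor, num_downsamples)
    return all(2 * small == big for big, small in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    # Side lengths taken from the test cases above.
    for side in (512, 544, 576, 720, 728):
        print(side, latent_sizes(side), upsample_matches_skip(side))
```

For a side of 544 this gives levels 68, 34, 17, 9: doubling 9 yields 18, which does not match the skip connection of size 17. That is at least consistent with the `18 == 17` check in the error reported below, though this sketch does not confirm which layer actually fails.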

2. If the ModuleDeepCacheSpeedup node is replaced with the ComfyUI-DeepCache node (https://github.com/styler00dollar/ComfyUI-deepcache), everything works. Everything also works if the OneFlow compilation of the unet is removed (as shown below).

# https://github.com/siliconflow/onediff/blob/425be50e7a14ef3cb4b9245246483e1e40bd6edf/onediff_comfy_nodes/modules/oneflow/utils/deep_cache_speedup.py#L38
model_patcher.deep_cache_unet = DeepCacheUNet(
    model_patcher.model.diffusion_model, cache_layer_id, cache_block_id
)
model_patcher.fast_deep_cache_unet = FastDeepCacheUNet(
    model_patcher.model.diffusion_model, cache_layer_id, cache_block_id
)
# model_patcher.deep_cache_unet = oneflow_compile(
#     model_patcher.deep_cache_unet
# )
# model_patcher.fast_deep_cache_unet = oneflow_compile(
#     model_patcher.fast_deep_cache_unet
# )

3. The error message is below; it again lands on the concat op (once the error has been reported, the module is recompiled).

[ERROR](GRAPH:OneflowGraph_0:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> Check failed: (18 == 17) 
  File "oneflow/core/job/job_interpreter.cpp", line 325, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 237, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc
    physical_tensor_desc_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc
    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))
Error Type: oneflow.ErrorProto.check_failed_error
Exception in forward: e=Exception('Check failed: (18 == 17) \n  File "oneflow/core/job/job_interpreter.cpp", line 325, in InterpretJob\n    RunNormalOp(launch_context, launch_op, inputs)\n  File "oneflow/core/job/job_interpreter.cpp", line 237, in RunNormalOp\n    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret\n    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()\n    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer\n    Infer(*user_op_expr, infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer\n    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })\n  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc\n    physical_tensor_desc_infer_fn_(&infer_ctx)\n  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc\n    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))\nError Type: oneflow.ErrorProto.check_failed_error')
Recompile oneflow module ...
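
To make the failing check easier to read, here is a minimal pure-Python sketch of the kind of shape validation a concat op performs: every dimension except the concat axis must agree across inputs. It only mirrors the `CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))` line in the trace; it is not OneFlow's implementation, and the function name and the example shapes (batch 2, 640 channels) are hypothetical.

```python
def check_concat_shapes(shapes, axis):
    """Return the concatenated shape, or raise if a non-concat dim differs."""
    out = list(shapes[0])  # output dims start from the first input
    for shape in shapes[1:]:
        if len(shape) != len(out):
            raise ValueError("rank mismatch")
        for i, dim in enumerate(shape):
            if i == axis:
                out[i] += dim  # sizes add up along the concat axis
            elif dim != out[i]:
                # Same form as the message in the log: (input == output)
                raise ValueError(f"Check failed: ({dim} == {out[i]})")
    return tuple(out)


if __name__ == "__main__":
    # Matching spatial sizes: concat along channels succeeds.
    print(check_concat_shapes([(2, 640, 18, 18), (2, 640, 18, 18)], axis=1))
    # Spatial sizes 17 vs 18: the check fails.
    try:
        check_concat_shapes([(2, 640, 17, 17), (2, 640, 18, 18)], axis=1)
    except ValueError as err:
        print(err)
```

Read this way, `18 == 17` means one input to the concat has size 18 along some axis where the expected output size is 17.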

DeepCache error and recompilation in ComfyUI #857