PaddlePaddle / Paddle

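PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle『飞桨』: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)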
http://www.paddlepaddle.org/
Apache License 2.0

Dynamic-to-static conversion does not support the broadcast_to operator #60780

Closed westfish closed 8 months ago

westfish commented 8 months ago

Describe the Bug

During dynamic-to-static conversion, the following code raises an error:

        time_context = time_context_first_timestep[None, :].broadcast_to(
            [height * width, batch_size, 1, time_context.shape[-1]]
        )

The full error is:

    error_data.raise_new_exception()
  File "/root/miniconda3/envs/paddle-develop/lib/python3.9/site-packages/paddle/jit/dy2static/error.py", line 452, in raise_new_exception
    raise new_exception from None
AssertionError: In transformed code:

    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_spatio_temporal_condition.py", line 446, in forward
        for downsample_block in self.down_blocks:
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_spatio_temporal_condition.py", line 447, in forward
        if hasattr(downsample_block, "has_cross_attention") and downsample_block.has_cross_attention:
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_spatio_temporal_condition.py", line 448, in forward
        sample, res_samples = downsample_block(
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_3d_blocks.py", line 2129, in forward
        for resnet, attn in blocks:
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_3d_blocks.py", line 2130, in forward
        if self.training and self.gradient_checkpointing and not hidden_states.stop_gradient:  # TODO
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/unet_3d_blocks.py", line 2162, in forward
        hidden_states = attn(
    File "/root/project/paddlemix/upgrade_ppdiffuser0240/PaddleMIX/ppdiffusers/ppdiffusers/models/transformer_temporal.py", line 327, in forward
        print("batch_size", batch_size)
        print("time_context.shape[-1]", time_context.shape[-1])
        time_context = time_context_first_timestep[None, :].broadcast_to(
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            [height * width, batch_size, 1, time_context.shape[-1]]
        )

    File "/root/miniconda3/envs/paddle-develop/lib/python3.9/site-packages/paddle/tensor/manipulation.py", line 4146, in broadcast_to
        assert (

    AssertionError: Elements in shape must be 1-D Tensors or integers.var tmp_25 : LOD_TENSOR.shape().dtype(int32).stop_gradient(True)

After replacing broadcast_to with expand, the problem goes away:

        time_context = time_context_first_timestep[None, :].expand(
            [height * width, batch_size, 1, time_context.shape[-1]]
        )
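
For reference, the shape behavior that both calls are expected to produce can be sketched outside Paddle. The snippet below uses NumPy's `np.broadcast_to` purely as a stand-in; it is not Paddle code and does not exercise dynamic-to-static conversion. The sizes and the input shape `(batch_size, 1, channels)` are assumptions made for illustration, and `channels` is a hypothetical name standing in for `time_context.shape[-1]`.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
height, width, batch_size, channels = 2, 3, 4, 5

# Assumed input shape: (batch_size, 1, channels).
time_context_first_timestep = np.arange(
    batch_size * channels, dtype=np.float32
).reshape(batch_size, 1, channels)

# Target shape mirrors the list passed to broadcast_to / expand in the issue.
target_shape = (height * width, batch_size, 1, channels)

# [None, :] prepends an axis of size 1; broadcasting stretches it to height * width.
time_context = np.broadcast_to(time_context_first_timestep[None, :], target_shape)

print(time_context.shape)
```

Every slice along the new leading axis is a view of the same input, so no values change; only the shape grows.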

Summary: it seems that Paddle's dynamic-to-static conversion does not support the broadcast_to operator?

Environment: paddlenlp 2.6.2, paddlepaddle-gpu 2.6.0, fastdeploy-gpu-python 1.0.7, ppdiffusers 0.24.0

Additional Supplementary Information

No response

Aurelius84 commented 8 months ago

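@westfish Thanks for the report. We have reproduced the problem. The cause is that the earlier Zero-Dim feature upgrade missed the handling of broadcast_to, and we will fix it as soon as possible. In both principle and implementation, broadcast_to is very similar to expand, so please use the expand API as a temporary workaround. The follow-up fix PR will be linked to this issue.

Thanks again for your feedback.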