facebookincubator / AITemplate

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0

Failed to compile the controlnet: thepowefuldeez/sd21-controlnet-canny #972

Open chengdianxuezi opened 11 months ago

chengdianxuezi commented 11 months ago

I changed the ControlNet args as follows:

class ControlNetModel(nn.Module):
    _supports_gradient_checkpointing = True

    def __init__(
        self,
        in_channels: int = 4,
        flip_sin_to_cos: bool = True,
        freq_shift: int = 0,
        down_block_types: Tuple[str] = (
            "CrossAttnDownBlock2D",
            "CrossAttnDownBlock2D",
            "CrossAttnDownBlock2D",
            "DownBlock2D",
        ),
        block_out_channels: Tuple[int] = (320, 640, 1280, 1280),
        layers_per_block: int = 2,
        downsample_padding: int = 1,
        mid_block_scale_factor: float = 1,
        act_fn: str = "silu",
        norm_num_groups: Optional[int] = 32,
        norm_eps: float = 1e-5,
        cross_attention_dim: int = 1024,
        attention_head_dim: Union[int, Tuple[int]] = (5, 10, 20, 20),
        use_linear_projection: bool = True,
        upcast_attention: bool = True,
        resnet_time_scale_shift: str = "default",
        controlnet_conditioning_channel_order: str = "rgb",
        conditioning_embedding_out_channels: Optional[Tuple[int]] = (16, 32, 96, 256),
        global_pool_conditions: bool = False,
    ):
        ...

But compilation fails with this error:

RuntimeError: A/B shape mismatch! A: [{ 'depth': 0, 'name': 'batch_size', 'nop': False, 'symbolic_value': batch_size, 'values': [1, 8]}, { 'depth': 0, 'name': 'embedding_size', 'nop': False, 'symbolic_value': embedding_size, 'values': [77, 462]}, {'depth': 0, 'name': None, 'nop': False, 'symbolic_value': 768, 'values': [768]}], B: [{'depth': 0, 'name': None, 'nop': False, 'symbolic_value': 320, 'values': [320]}, { 'depth': 0, 'name': None, 'nop': False, 'symbolic_value': 1024, 'values': [1024]}]

hlky commented 11 months ago

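compile_controlnet.py loads lllyasviel/sd-controlnet-canny, which is an SD 1.x ControlNet, and hidden_dim is not passed to compile_controlnet. The text_embeddings input is therefore still built with a last dimension of 768 (A in the error), while the config edited for v2 sets cross_attention_dim to 1024 (B is [320, 1024]). That is the dimension mismatch being reported.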
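To make the mismatch concrete, here is a minimal sketch in plain Python. It assumes the failing op is a linear projection whose weight is stored as [out_features, in_features], so the last dimension of the input must equal the last dimension of the weight. check_gemm_shapes is a hypothetical helper written for illustration, not an AITemplate function, and the shapes are taken from the error message above.

```python
def check_gemm_shapes(a_shape, b_shape):
    """Hypothetical helper (not part of AITemplate).

    Assumes A is [..., K] and B is [N, K], as for a linear layer whose
    weight is stored as [out_features, in_features].
    Returns the output shape [..., N] or raises on a K mismatch.
    """
    k_a = a_shape[-1]
    k_b = b_shape[-1]
    if k_a != k_b:
        raise RuntimeError(
            f"A/B shape mismatch! A: {list(a_shape)}, B: {list(b_shape)}"
        )
    return tuple(a_shape[:-1]) + (b_shape[0],)


# B from the error: a projection to 320 channels that expects 1024 inputs,
# consistent with cross_attention_dim = 1024 in the edited config.
weight = (320, 1024)

# A from the error: text embeddings with hidden size 768 (SD 1.x text encoder).
sd1_text_embeddings = (1, 77, 768)
# What a v2 config expects: hidden size 1024.
sd2_text_embeddings = (1, 77, 1024)

try:
    check_gemm_shapes(sd1_text_embeddings, weight)
    mismatch = None
except RuntimeError as err:
    mismatch = str(err)

out_shape = check_gemm_shapes(sd2_text_embeddings, weight)

print(mismatch)
print(out_shape)
```

Under these assumptions the 768-wide input fails and the 1024-wide input passes, which suggests the text_embeddings input has to be declared with the v2 hidden size when compiling a v2 ControlNet.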