microsoft / onnxscript

ONNX Script enables developers to naturally author ONNX functions and models using a subset of Python.
https://onnxscript.ai/
MIT License

[torchlib] Implement `aten::_scaled_dot_product_efficient_attention.default` #1160

Closed BowenBao closed 10 months ago

BowenBao commented 10 months ago

Seen in the following models. This op is emitted during CUDA export only.

    cait_m36_384
    deit_base_distilled_patch16_224
    crossvit_9_240
    mobilevit_s
    pit_b_224
    nanogpt
    timm_vision_transformer
    beit_base_patch16_224
    stable_diffusion_unet

From https://github.com/microsoft/onnx-converters-private/issues/196

cc @justinchuby

titaiwangms commented 10 months ago

Does it make any difference to the torchlib implementation if the op is meant for CUDA? If not, I can take this real quick.

BowenBao commented 10 months ago

It appears only in CUDA export; other than that, I don't think there is any general difference for the torchlib implementation.
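For reference, the math that a torchlib implementation of this op has to reproduce is ordinary scaled dot-product attention; the "efficient" CUDA kernel is just a faster implementation of the same semantics. Below is a minimal NumPy sketch of that computation (a hypothetical reference, not the actual torchlib code, which would be authored in ONNX Script against ONNX ops; the function name and signature here are illustrative only):

```python
import math
import numpy as np

def scaled_dot_product_attention(query, key, value, attn_bias=None,
                                 is_causal=False, scale=None):
    """Reference semantics of scaled dot-product attention.

    query/key/value: arrays of shape (batch, num_heads, seq_len, head_dim).
    This is an illustrative sketch, not the torchlib implementation.
    """
    head_dim = query.shape[-1]
    if scale is None:
        scale = 1.0 / math.sqrt(head_dim)
    # Attention scores: (batch, heads, q_len, kv_len)
    scores = (query @ key.transpose(0, 1, 3, 2)) * scale
    if attn_bias is not None:
        scores = scores + attn_bias
    if is_causal:
        # Mask out keys that lie in the future relative to each query position.
        q_len, kv_len = query.shape[-2], key.shape[-2]
        future = np.triu(np.ones((q_len, kv_len), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over the key dimension.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value
```

One useful sanity check: with `is_causal=True`, the output at query position 0 must equal `value[..., 0, :]`, since position 0 can only attend to the first key.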