intel / torch-xpu-ops


[Evaluated] Support of SDP // Issues of test_transformers_xpu.py #761

Open · PenghuiCheng opened this issue 2 months ago

PenghuiCheng commented 2 months ago

🚀 The feature, motivation and pitch

  1. NotImplementedError: Could not run 'aten::_to_copy' with arguments from the 'NestedTensorXPU' backend. Cases: test_transformers.py::TestTransformersXPU::test_with_nested_tensor_input_xpu (see the first sketch after the test list below).
  2. There is no mechanism to handle SDPBackend::ERROR yet; full support will come once all SDPBackends are supported (see the second sketch after the test list below). Cases: "test_dispatch_fails_no_backend_xpu".
  3. AssertionError: False is not true. This is a CPU fallback failure; aten::transformer_encoder_layer_forward needs to be supported with proper priority. Cases: "test_disable_fastpath_xpu".
    1. Double and complex datatype matmul is not supported in oneDNN (see the third sketch after the test list below).

      https://github.com/intel/torch-xpu-ops/issues/253

      "test_sdp_math_gradcheck_contiguous_inputs_False_xpu", "test_sdp_math_gradcheck_contiguous_inputs_True_xpu", "test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_True_xpu", "test_transformerencoder_batch_first_True_training_True_enable_nested_tensor_False_xpu", "test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_True_xpu", "test_transformerencoder_batch_first_True_training_False_enable_nested_tensor_False_xpu", "test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_True_xpu", "test_transformerencoder_batch_first_False_training_True_enable_nested_tensor_False_xpu", "test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_True_xpu", "test_transformerencoder_batch_first_False_training_False_enable_nested_tensor_False_xpu", "test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_4D_input_dim_no_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_causal_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_4D_input_dim_4D_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_causal_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_4D_input_dim_2D_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_3D_input_dim_no_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_causal_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_3D_input_dim_3D_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_causal_attn_mask_dropout_p_0_0_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_5_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_2_xpu", "test_scaled_dot_product_attention_3D_input_dim_2D_attn_mask_dropout_p_0_0_xpu",

Alternatives

No response

Additional context

No response

fengyuan14 commented 1 week ago

The issue depends on the SDP implementation; we are evaluating the implementation choice for XPU.