Refactor layers for CLIP text encoder of SD model

Refactor the layers of the CLIP text encoder in the SD model. Tested: the refactored model still generates a proper image, as it did before the refactoring.

This PR contains the following updates:

- Clean up the split UNet implementation in SD.
- Refactor CLIP to use the layers module.
- ~~Add GATED_SHARED type in FeedForwardType, which is used in CLIP.~~
- Add the qkv_transpose_before_split field to AttentionConfig, which is used in CLIP.
- Add the GELU_QUICK type to ActivationType, which is used in CLIP.
- Add attn_fused_qkv_proj to TensorNames, which bundles the q, k, and v projection tensors into one tensor, used in CLIP.
- Add embedding_position to TensorNames, which is a learned position embedding, used in CLIP.
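
For reviewers unfamiliar with two of the items above, here is a minimal pure-Python sketch of what they refer to. It is not the code in this PR: `quick_gelu` and `split_fused_qkv` are hypothetical names chosen for illustration. The sketch assumes GELU_QUICK is the usual sigmoid-based approximation `x * sigmoid(1.702 * x)` used by CLIP, and that a fused QKV projection output is laid out as three contiguous equal-sized chunks `[q | k | v]`. The effect of qkv_transpose_before_split on that layout is not modeled here.

```python
from __future__ import annotations

import math


def quick_gelu(x: float) -> float:
    """Sigmoid-based GELU approximation: x * sigmoid(1.702 * x)."""
    z = 1.702 * x
    # Numerically stable sigmoid: avoid exp() overflow for large |z|.
    if z >= 0:
        sig = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        sig = e / (1.0 + e)
    return x * sig


def split_fused_qkv(
    fused: list[float],
) -> tuple[list[float], list[float], list[float]]:
    """Split one fused projection output into q, k, v.

    Assumes the layout is [q | k | v] with three equal-sized chunks;
    the real layout depends on how the checkpoint stores the tensor.
    """
    if len(fused) % 3 != 0:
        raise ValueError("fused QKV length must be divisible by 3")
    d = len(fused) // 3
    return fused[:d], fused[d : 2 * d], fused[2 * d :]


if __name__ == "__main__":
    print(quick_gelu(1.0))
    print(split_fused_qkv([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

A single fused tensor means one matrix multiply produces q, k, and v together, which is why the loader needs a dedicated tensor name for it instead of three separate projection names.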

BUG=b/311216181
