openspeech-team / openspeech

Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.
https://openspeech-team.github.io/openspeech/
MIT License
670 stars 112 forks source link

Fix relative positional multi-head attention layer #182

Closed upskyy closed 1 year ago

upskyy commented 1 year ago

What does this PR do?

Fixes #176

I referred to fairseq's conformer layer multi-head attention. [code] Also, I confirmed that it is training.

  1. math.sqrt(dim) -> math.sqrt(d_head)
  2. Add relative positional encoding module
  3. Fix _relative_shift method
    • input : B X n_head X T X 2T-1
    • output : B X n_head X T X T
  4. Add WandbLogger name

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag