AV-HuBERT integration with torchaudio.pipelines.Wav2Vec2FABundle

🚀 The feature

How would someone go about configuring AV-HuBERT to work with torchaudio.pipelines.Wav2Vec2FABundle? It currently only supports MMS_FA.

Motivation, pitch

Currently, the torchaudio.pipelines.Wav2Vec2FABundle forced aligner only supports MMS_FA. This is a request to add support for an audio-visual ASR (AV-ASR) model, namely AV-HuBERT. Alternatively, the feature could be a tutorial on how to extend the list of supported models to multimodal (speech + video) models.
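To make the request more concrete, here is a rough sketch of what the aligner side needs from any acoustic (or audio-visual) model: a T x C matrix of per-frame log-probabilities over a token vocabulary that includes a CTC blank. This is a pure-Python illustration of CTC forced alignment (Viterbi), not torchaudio's implementation. The names `ctc_forced_align` and `merge_spans` are hypothetical, and whether a given AV-HuBERT checkpoint can produce such CTC emissions is an assumption.

```python
import math


def ctc_forced_align(log_probs, targets, blank=0):
    """Return the most likely per-frame label path that collapses to `targets`.

    log_probs: list of T frames, each a list of C log-probabilities
               (assumed to come from a CTC-style model head).
    targets:   list of token ids, without blanks.
    """
    num_frames = len(log_probs)
    # Interleave blanks: blank, t1, blank, t2, ..., blank
    ext = [blank]
    for tok in targets:
        ext += [tok, blank]
    num_states = len(ext)
    neg_inf = -math.inf

    score = [[neg_inf] * num_states for _ in range(num_frames)]
    back = [[0] * num_states for _ in range(num_frames)]

    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][ext[0]]
    if num_states > 1:
        score[0][1] = log_probs[0][ext[1]]

    for t in range(1, num_frames):
        for s in range(num_states):
            # Allowed moves: stay, advance one state, or skip the blank
            # between two different tokens.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            if score[t - 1][best] == neg_inf:
                continue
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best

    # A valid path ends in the trailing blank or in the last token.
    ends = [num_states - 1]
    if num_states > 1:
        ends.append(num_states - 2)
    s = max(ends, key=lambda p: score[num_frames - 1][p])
    if score[num_frames - 1][s] == neg_inf:
        raise ValueError("targets are too long for the number of frames")

    # Trace back from the last frame to recover the label of each frame.
    path = [blank] * num_frames
    for t in range(num_frames - 1, -1, -1):
        path[t] = ext[s]
        s = back[t][s]
    return path


def merge_spans(path, blank=0):
    """Collapse a frame-level path into (token, start_frame, end_frame) spans.

    end_frame is exclusive; blank frames are dropped.
    """
    spans = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            if path[start] != blank:
                spans.append((path[start], start, t))
            start = t
    return spans
```

With emissions in this form, the frame spans can be converted to timestamps by multiplying by the model's frame duration. For AV-HuBERT that duration would depend on the video frame rate rather than on the audio stride used by MMS_FA, which is probably the main thing a tutorial would need to cover.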

Alternatives

No response

Additional context

No response

pytorch / audio

AV-HuBERT integration with torchaudio.pipelines.Wav2Vec2FABundle #3717
