Closed xinkez closed 2 months ago
Hi @xinkez, thanks for your interest in the project! codec-bpe is not an audio language model but rather a tool for converting multilevel audio codes to/from unicode and preparing tokenizers and datasets from that.
If you are asking about how to load a trained codec-bpe tokenizer, you can use the usual way:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("path/to/tokenizer")
That being said, I'm working on training some baseline audio LMs using this method and plan to release them when they are done, along with instructions for inference. Unfortunately I can't provide an estimate as to when that will be.
Hi @AbrahamSanders , thanks for your quick response. Your answer helps me. Looking forward to your audio LMs works.
Hi,
How to load the trained model and inference? Thanks in advance.