HomebrewNLP / Olmax

HomebrewNLP in JAX flavour for maintainable TPU-Training
BSD 2-Clause "Simplified" License

Audio Modelling #9

Open ClashLuke opened 2 years ago

ClashLuke commented 2 years ago

There are multiple ways we could go about modelling audio. For example, we could tokenise sounds or audio snippets and autoregressively predict the next token. Whether the audio tokens come from a VQGAN or a discrete Fourier transform doesn't matter to the model, but it could change the quality of our generation a lot. This issue is about finding out how best to model sound and building an end-to-end pipeline so we can prototype it and see how it works.
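To make the "tokenise, then predict the next token" idea concrete, here is a minimal sketch of the data side only. It uses 8-bit mu-law quantisation as a stand-in tokeniser, which is an assumption for illustration and neither the VQGAN nor the Fourier approach mentioned above. The function names, the sine-wave input, and the context length of 32 are all hypothetical, and no model is included.

```python
import math

MU = 255  # assumed 8-bit mu-law companding, giving 256 discrete tokens


def mu_law_encode(sample: float, mu: int = MU) -> int:
    """Map an audio sample in [-1, 1] to an integer token in [0, mu]."""
    sample = max(-1.0, min(1.0, sample))  # clip out-of-range samples
    # Logarithmic companding: fine resolution near zero, coarse near +-1.
    compressed = math.copysign(
        math.log1p(mu * abs(sample)) / math.log1p(mu), sample
    )
    # Shift from [-1, 1] to [0, mu] and round to the nearest token id.
    return int(round((compressed + 1.0) / 2.0 * mu))


def mu_law_decode(token: int, mu: int = MU) -> float:
    """Approximate inverse of mu_law_encode (lossy due to rounding)."""
    compressed = 2.0 * token / mu - 1.0
    return math.copysign(
        math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed
    )


def next_token_pairs(tokens, context):
    """Build (context window, target token) pairs for autoregressive training."""
    return [
        (tokens[i:i + context], tokens[i + context])
        for i in range(len(tokens) - context)
    ]


# Hypothetical input: 10 ms of a 440 Hz sine wave sampled at 16 kHz.
SAMPLE_RATE = 16_000
wave = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(160)]

tokens = [mu_law_encode(s) for s in wave]
pairs = next_token_pairs(tokens, context=32)
```

Any tokeniser that produces integer ids could replace `mu_law_encode` here without changing `next_token_pairs`, which is the sense in which the token source doesn't matter to the model.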