BakerBunker / VecTok

Official implementation of Vec-Tok Speech

Vec-Tok Speech

This is the official code implementation of the paper "Vec-Tok Speech: Speech Vectorization and Tokenization for Neural Speech Generation".

This project started as an internal experiment, so most of the code depends on internal toolchains and datasets. We are working hard on organizing and cleaning the code, and we will release the cleaned parts step by step.

We also welcome community efforts and resources to reimplement this framework with open-source data and toolchains.

[Demo Page] [Paper]

Overview

We propose a speech codec based on speech vectors and semantic tokens.

Our framework has some nice properties:

Theoretically, Vec-Tok can do these tasks in a unified framework:


Roadmap

Release code (training and inference) and documentation of

Release pretrained checkpoints of

Citation

@article{vectokspeech,
    author={Xinfa Zhu and Yuanjun Lv and Yi Lei and Tao Li and Wendi He and Hongbin Zhou and Lei Xie},
    title={Vec-Tok Speech: Speech Vectorization and Tokenization for Neural Speech Generation},
    year={2023},
    journal={arXiv preprint arXiv:2310.07246},
}