yizhilll / MERT

Official implementation of the paper "Acoustic Music Understanding Model with Large-Scale Self-supervised Training".
Apache License 2.0

Questions about data preparation #15

Open nkkanee opened 6 months ago

nkkanee commented 6 months ago

I'm a beginner and I apologize for my lack of knowledge. I am currently preparing the data. Could you please tell me the steps to prepare the data?

First, I created train.tsv using prepare_manifest.py. After that, following the HuBERT recipe, I used this train.tsv file to create train.km and dict.km.txt.
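For reference, here is a minimal sketch of how the files from this step could be sanity-checked. It assumes the fairseq HuBERT conventions: the first line of train.tsv is the audio root directory, each following line is one clip, train.km holds one line of space-separated cluster ids per clip, and dict.km.txt lists each cluster id with a dummy count of 1. Whether MERT expects exactly this layout is an assumption, and all function names below are hypothetical helpers, not part of the MERT or fairseq code.

```python
from pathlib import Path


def count_manifest_entries(tsv_path):
    """Count clips in a fairseq-style manifest (assumed: first line is the root dir)."""
    lines = Path(tsv_path).read_text().splitlines()
    # Skip the header line (root directory) and ignore blank lines.
    return sum(1 for line in lines[1:] if line.strip())


def read_km_labels(km_path):
    """Read one sequence of integer cluster ids per non-empty line."""
    return [
        [int(tok) for tok in line.split()]
        for line in Path(km_path).read_text().splitlines()
        if line.strip()
    ]


def check_alignment(tsv_path, km_path):
    """Return True when the manifest and label file contain the same number of clips."""
    return count_manifest_entries(tsv_path) == len(read_km_labels(km_path))


def build_dict_lines(n_clusters):
    """Build dict.km.txt lines: '<cluster_id> 1' for each cluster (dummy count)."""
    return [f"{i} 1" for i in range(n_clusters)]
```

If `check_alignment("train.tsv", "train.km")` returns False, the label file was probably generated from a different manifest than the one being used.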

What should I do next?

Also, how and at which step should I use prepare_codecs_from_manifest.py?

Sorry for my poor writing.