bshall / ZeroSpeech

VQ-VAE for Acoustic Unit Discovery and Voice Conversion
https://bshall.github.io/ZeroSpeech/

KeyError when preprocessing data #12

Open liu-x-p opened 4 years ago

liu-x-p commented 4 years ago

I set the data directory to datasets/2019/english. When I run the script preprocess.py, it raises
KeyError: 'accessing unknown key in a struct: dataset.in_dir', and I can't find how to solve it. Could you help me?

bshall commented 4 years ago

Hi @liu-x-p,

Sure. If you look at the usage in the readme it says:

python preprocess.py in_dir=/path/to/dataset dataset=[2019/english or 2019/surprise]

Note: in_dir must be the path to the 2019 folder...

This is the folder that contains the wav files in its subdirectories. So, for example, if I download the ZeroSpeech 2020 dataset and store it at ~/Documents/ZeroSpeech/2020, the command should be:

python preprocess.py in_dir=~/Documents/ZeroSpeech/2020/2019 dataset=2019/english

If you're still having trouble, please post the command you used and the path to your data directory.

Hope that helps!

liu-x-p commented 4 years ago

@bshall Thank you! I followed your settings. The command I ran is python preprocess.py in_dir=/home/omnisky/mount/holiday/ZeroSpeech-0.1/datasets/2020/2019 dataset=2019/english, and the path /home/omnisky/mount/holiday/ZeroSpeech-0.1/datasets/2020/2019 contains 'english' and 'surprise'.

bshall commented 4 years ago

No problem @liu-x-p. If you're still having issues, I'd advise keeping the actual data in a separate folder from this repo. For example, this repo would be under holiday/ZeroSpeech and the actual wav files would be stored in holiday/RawData/2020. Then in_dir should point to .../holiday/RawData/2020/2019.
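
The layout described in this thread can be sanity-checked before running preprocess.py. The following is only a minimal sketch, not part of the repo: check_in_dir is a hypothetical helper, and it assumes that the last component of in_dir matches the year in dataset (for example 2019) and that the language folder (for example english) sits directly inside it with the wav files somewhere below.

```python
from pathlib import Path


def check_in_dir(in_dir, dataset):
    """Hypothetical helper (not from the repo): list wav files for a dataset.

    in_dir  -- path to the year folder, e.g. ~/Documents/ZeroSpeech/2020/2019
    dataset -- "<year>/<language>", e.g. "2019/english"
    """
    root = Path(in_dir).expanduser()
    year, language = dataset.split("/")

    # Assumption: in_dir must end in the year folder named in `dataset`.
    if root.name != year:
        raise ValueError(f"in_dir should end in '{year}', got '{root.name}'")

    # Assumption: the language folder lives directly inside in_dir.
    lang_dir = root / language
    if not lang_dir.is_dir():
        raise FileNotFoundError(f"missing folder: {lang_dir}")

    # The wav files may be nested in subdirectories, so search recursively.
    return sorted(lang_dir.rglob("*.wav"))
```

Calling check_in_dir("~/Documents/ZeroSpeech/2020/2019", "2019/english") should return a non-empty list if the data is where the command expects it. An empty list or an exception suggests in_dir points at the wrong level.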

dummy-arch commented 3 years ago

Following the exact same procedure, I am getting an error: hydra.errors.OverrideParseException: LexerNoViableAltException: Passport/VAE/ZeroSpeech/zerospeech_2020/2020/2019. Could you kindly help me out? The directory path to the wav files is Passport/VAE/ZeroSpeech/zerospeech_2020/2020/2019 and the path to the json files is Passport/VAE/ZeroSpeech/zerospeech_2020/datasets/2019/english.

ZhengRachel commented 3 years ago

@liu-x-p Hi! I am also a Chinese student trying to run this repo, and I am running into problems similar to yours... TAT I wonder whether you have successfully run this repo and whether we could discuss it via e-mail. My e-mail address is rachelzheng2019@163.com. Looking forward to your reply!

liu-x-p commented 3 years ago

@ZhengRachel I'm not sure about this, as it has been a long time. As you can see in my question and comment, I hit this problem when I downloaded this work as ZeroSpeech-0.1, which I think may be an early version. I then downloaded it again from the master branch (ZeroSpeech-master), and it worked. I think the command I used was python preprocess.py in_dir=../datasets/2020/2019 dataset=2019/english