Jamiroquai88 / VBDiarization

Speaker diarization based on Kaldi x-vectors, tuned for 16k microphone data
Apache License 2.0
95 stars, 29 forks

compute-mfccs-feats #12

Closed jadujoel closed 5 years ago

jadujoel commented 5 years ago

Thanks for sharing your code! I'm trying to get it to run and would appreciate any help!

I run:

    python diarization.py -c ../configs/vbdiar.yml -l lists/list_spk.scp --audio-dir wav/fisher-english-p1 --vad-dir vad/fisher-english-p1 -m diarization

And get:

    ValueError: /Users/admin/kaldi/src/featbin/compute-mfcc-feats binary returned error code None.

I'm using the wav files and scp that are included in the GitHub repo.
I've tried extracting the .gz files in /vad/fisher-english-p1, and I've tried downsampling the wav files from 16 to 8 kHz, but neither helped. Any idea what I'm doing wrong?

Full output below:

    /Users/admin/miniconda3/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
      warnings.warn(msg, category=DeprecationWarning)
    2019-07-29 23:48:37,730 - __main__ - INFO - Running diarization.py -c ../configs/vbdiar.yml -l lists/list_spk.scp --audio-dir wav/fisher-english-p1 --vad-dir vad/fisher-english-p1 -m diarization.
    2019-07-29 23:48:37,732 - __main__ - WARNING - Failed to import libmkl_rt.so, it will not be possible to use mkl backend.
    /Users/admin/miniconda3/lib/python3.7/site-packages/vbdiar/utils/utils.py:428: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
      return yaml.load(ymlfile)
    2019-07-29 23:48:37,862 - __main__ - INFO - Processing file fisher-english-p1/fe_03_00001-a.
    2019-07-29 23:48:37,867 - vbdiar.kaldi.mfcc_features_extraction - INFO - Extracting MFCC features from /Users/admin/Dropbox/sr/wavread/VBDiarization/VBDiarization/examples/wav/fisher-english-p1/fisher-english-p1/fe_03_00001-a.wav.
    Traceback (most recent call last):
      File "diarization.py", line 266, in <module>
        n_jobs=1)
      File "diarization.py", line 78, in process_files
        ret = _process_files((fns, kwargs))
      File "diarization.py", line 48, in _process_files
        ret.append(process_file(file_name=fn, **kwargs))
      File "diarization.py", line 116, in process_file
        features = features_extractor.audio2features(os.path.join(wav_dir, f'{file_name}{wav_suffix}'))
      File "/Users/admin/miniconda3/lib/python3.7/site-packages/vbdiar/kaldi/mfcc_features_extraction.py", line 90, in audio2features
        raise ValueError(f'{self.compute_mfcc_feats_bin} binary returned error code '
    ValueError: /Users/admin/kaldi/src/featbin/compute-mfcc-feats binary returned error code None. b'/Users/admin/kaldi/src/featbin/apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:/var/folders/9t/nb4clh4907j8gb5bsdbp0gdh0000gp/T/tmpuaw5ixfc \nLOG (apply-cmvn-sliding[5.5.433~1-7637d]:main():apply-cmvn-sliding.cc:75) Applied sliding-window cepstral mean normalization to 0 utterances, 0 had errors.\n'

Jamiroquai88 commented 5 years ago

Hi, can you post your Kaldi MFCC config here? The sample rate in the config must match the sample rate of the audio files. Can you also run soxi on one of the example audio files? If none of this helps, try running the Kaldi command directly from the command line (this Python code is basically just a wrapper around Kaldi).
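The sample-rate check suggested above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `config_sample_rate`, `wav_sample_rate` and `rates_match` are hypothetical, and it assumes the config is a plain Kaldi-style options file with one `--option=value` per line and that the audio is uncompressed PCM WAV readable by Python's standard `wave` module.

```python
import re
import wave


def config_sample_rate(config_path):
    """Return the --sample-frequency value from a Kaldi-style options file.

    Returns None if the option is not present (hypothetical helper).
    """
    pattern = re.compile(r"^\s*--sample-frequency=(\d+(?:\.\d+)?)")
    with open(config_path) as f:
        for line in f:
            match = pattern.match(line)
            if match:
                return int(float(match.group(1)))
    return None


def wav_sample_rate(wav_path):
    """Return the sample rate stored in the header of a PCM WAV file."""
    with wave.open(wav_path, "rb") as w:
        return w.getframerate()


def rates_match(config_path, wav_path):
    """True if the config's sample frequency equals the WAV file's rate."""
    return config_sample_rate(config_path) == wav_sample_rate(wav_path)
```

Calling `rates_match("../configs/mfcc.conf", "some.wav")` (paths are placeholders) returns `False` when, for example, the audio has been downsampled to 8 kHz but the config still says 16000.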

jadujoel commented 5 years ago

Thanks for the quick response! I'm using the included "mfcc.config" referenced from "vbdiar.yml".

vbdiar.yml:

    MFCC:
      config_path: ../configs/mfcc.conf
      apply_cmvn_sliding: True
      norm_vars: False
      center: True
      cmn_window: 300

mfcc.config:

    --sample-frequency=16000
    --frame-length=25 # the default is 25
    --low-freq=20 # the default.
    --high-freq=7700 # the default is zero meaning use the Nyquist (4k in this case).
    --num-ceps=23 # higher than the default which is 12.
    --snip-edges=false

And soxi gives me:

    Input File     : '/Users/admin/Dropbox/sr/wavread/VBDiarization/VBDiarization/examples/wav/fisher-english-p1/fe_03_00001-a.wav'
    Channels       : 1
    Sample Rate    : 16000
    Precision      : 16-bit
    Duration       : 00:00:07.00 = 112000 samples ~ 525 CDDA sectors
    File Size      : 224k
    Bit Rate       : 256k
    Sample Encoding: 16-bit Signed Integer PCM

I will try to run Kaldi from the command line!

Jamiroquai88 commented 5 years ago

The configs and the audio format look fine. Try printing the command and running it directly from the command line.
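For anyone debugging a similar failure, here is a minimal sketch of running an external command from Python, waiting for it to finish, and reporting its real exit code and stderr. The helper `run_and_report` is hypothetical and not part of vbdiar. One thing worth knowing: `subprocess.Popen.returncode` stays `None` until the process has been waited on (via `wait()`, `poll()` or `communicate()`), which would be consistent with the "error code None" text in the report, although this thread does not confirm that this is what the wrapper does.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments), wait for it, return (exit_code, stderr_text)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() waits for the process to exit and collects its output,
    # so returncode is guaranteed to be an int afterwards.
    _stdout, stderr = proc.communicate()
    return proc.returncode, stderr.decode(errors="replace")


# Stand-in command for illustration: a child Python process that fails on purpose.
code, err = run_and_report(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
print(f"exit code: {code}, stderr: {err}")
```

Replacing the stand-in command with the Kaldi command that the wrapper logs would surface the binary's own error message rather than only the Python exception.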

iamlinxifan commented 5 years ago

Hi, I've hit the same problem as @jadujoel, and I'm not sure whether the command I used is right:

    python diarization.py -l lists/AMI_dev-eval.scp -c ../configs/vbdiar.yml -m diarization --audio-dir ./wav/4133/ --vad-dir vad_dir

What files should I put in the vad-dir? My aim is to diarize a phone recording. I'm trying to get it to run and would appreciate any help!

Jamiroquai88 commented 5 years ago

Hi, I was able to replicate the issue. I am going to fix it.

iamlinxifan commented 5 years ago

Thanks for the quick response! I would also like to know the performance of speaker diarization in this project.

Jamiroquai88 commented 5 years ago

The problem should be fixed. Diarization performance on AMI is shown in the README. @jadujoel, please let me know if it fixed the problem so I can close the issue. I also fixed the run.sh script in the examples directory, so you can run that one directly.

iamlinxifan commented 5 years ago

I will try as soon as possible. Thanks for your response!

iamlinxifan commented 5 years ago

I have updated examples/diarization.py and run.sh, and now I get the error below. Also, how do I generate embeddings, and what should be put in --in-emb-dir?

    Traceback (most recent call last):
      File "diarization.py", line 258, in <module>
        raise ValueError('At least one of --in-emb-dir or --out-emb-dir must be specified.')
    ValueError: At least one of --in-emb-dir or --out-emb-dir must be specified.

Jamiroquai88 commented 5 years ago

You should run the run.sh script.

iamlinxifan commented 5 years ago

Yes, I used the command 'sh run.sh', and it shows that error.

jadujoel commented 5 years ago

Wonderful, it works now after pulling your update, thanks! I needed to make a folder (I called it "embout") in the examples dir and then add " --out-emb-dir embout " to run.sh, and after that all good!
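The workaround above (create the folder, then pass `--out-emb-dir`) can be sketched in Python as follows. This is an illustration only, not the code in diarization.py: the function `parse_emb_args` is hypothetical, and it reports the missing-argument case through `argparse`'s own `parser.error` rather than the `ValueError` seen in the traceback.

```python
import argparse
import os


def parse_emb_args(argv):
    """Parse the two embedding-directory options and prepare the output dir."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--in-emb-dir", default=None)
    parser.add_argument("--out-emb-dir", default=None)
    args = parser.parse_args(argv)
    if args.in_emb_dir is None and args.out_emb_dir is None:
        # parser.error prints the message and exits with status 2.
        parser.error("At least one of --in-emb-dir or --out-emb-dir must be specified.")
    if args.out_emb_dir is not None:
        # Create the output directory up front so it need not exist beforehand.
        os.makedirs(args.out_emb_dir, exist_ok=True)
    return args
```

With this shape, `parse_emb_args(["--out-emb-dir", "embout"])` creates `embout` if it is missing, so the manual `mkdir` step is no longer needed.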

jadujoel commented 5 years ago

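Also, I added this to diarization.py at line 152:

    import ctypes  # needed for CDLL, if not already imported in the file
    from sys import platform

    # On macOS the MKL runtime is a .dylib; elsewhere fall back to the .so name.
    if platform == "darwin":
        mkl_rt = ctypes.CDLL('libmkl_rt.dylib')
    else:
        mkl_rt = ctypes.CDLL('libmkl_rt.so')

Jamiroquai88 commented 5 years ago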

@jadujoel I forgot to fix the --out-emb-dir parameter in run.sh; it should be fine now.

mkl_rt = ctypes.CDLL('libmkl_rt.dylib')

Is it Mac-specific?

jadujoel commented 5 years ago

Well, when you install the MKL library, either directly from Intel or from pip, no .so files are created, on my Mac at least. The .dylib file with the same name seems to do the same thing, so my guess is that it's the same library with a different extension.
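A slightly more defensive variant of the same idea is sketched below. It is an assumption-laden illustration rather than the project's code: `candidate_names` and `try_load` are hypothetical helpers, only the macOS (`.dylib`) and generic `.so` naming from this thread are covered, and the standard `ctypes.util.find_library` is tried first so the system's own lookup gets a chance before the hard-coded file names.

```python
import ctypes
import ctypes.util
import sys


def candidate_names(base, platform=sys.platform):
    """Return plausible shared-library file names for base on platform."""
    if platform == "darwin":
        return [f"lib{base}.dylib"]
    return [f"lib{base}.so"]


def try_load(base):
    """Try to load a shared library by base name; return None if unavailable."""
    names = candidate_names(base)
    found = ctypes.util.find_library(base)
    if found:
        names.insert(0, found)
    for name in names:
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue  # not found under this name, try the next one
    return None


mkl_rt = try_load("mkl_rt")
print("MKL loaded" if mkl_rt is not None else "MKL not found, continuing without it")
```

Returning `None` instead of raising lets the caller carry on without the MKL backend when the library is not installed.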

Jamiroquai88 commented 5 years ago

In the end, MKL should not be needed at all, so I will try to remove this code in the future. Closing the issue.

kejin-qian commented 4 years ago

Hi, I have the same issue, "ValueError: /Users/admin/kaldi/src/featbin/compute-mfcc-feats binary returned error code None.", with the current version. I tried running run.sh directly and also tried using my own data. In both cases, I got the same ValueError.