persephone-tools / persephone

A tool for automatic phoneme transcription
Apache License 2.0

Feature type fbank_and_pitch throws error #237

Open mirfan899 opened 3 years ago

mirfan899 commented 3 years ago

I've created a settings.ini file to use Kaldi for feature extraction, but it seems Persephone is not picking up the settings.ini file from the directory.

Here is my directory structure.

tree -L 1
.
├── constants.py
├── continuous_training.py
├── exp
├── kids_speech_sample
├── librispeech-lexicon.json
├── main.py
├── preprocess.py
├── __pycache__
├── settings.ini
├── transcribe.py
└── utils.py

Code generating the issue.

from persephone import corpus
from persephone import corpus_reader
from persephone import rnn_ctc

kids_corpus = corpus.Corpus("fbank_and_pitch", "phonemes", "kids_speech_sample")
print(kids_corpus.get_untranscribed_fns())
print(kids_corpus.get_train_fns())
print(kids_corpus.get_test_fns())

kids_corpus = corpus_reader.CorpusReader(kids_corpus, num_train=224, batch_size=16)
# model = rnn_ctc.Model("exp/", kids_corpus, num_layers=3, hidden_size=250)
# 
# model.transcribe(restore_model_path="exp/model/model_best.ckpt")

And the error message:

WARNING:tensorflow:From /home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/model.py:22: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/model.py:27: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

Unhandled exception
Traceback (most recent call last):
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/transcribe.py", line 5, in <module>
    kids_corpus = corpus.Corpus("fbank_and_pitch", "phonemes", "kids_speech_sample")
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/corpus.py", line 200, in __init__
    self.prepare_feats()
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/corpus.py", line 395, in prepare_feats
    feat_extract.from_dir(self.feat_dir, self.feat_type)
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/preprocess/feat_extract.py", line 151, in from_dir
    kaldi_pitch(dirname, dirname)
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/preprocess/feat_extract.py", line 216, in kaldi_pitch
    subprocess.run(args)
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/home/oadams/tools/kaldi/src/featbin/compute-kaldi-pitch-feats': '/home/oadams/tools/kaldi/src/featbin/compute-kaldi-pitch-feats'

My settings.ini file content:

[PATHS]
SOX_PATH = "sox"
FFMPEG_PATH = "ffmpeg"
KALDI_ROOT = "/home/irfan/kaldi"

shuttle1987 commented 3 years ago

It looks like the settings.ini file that you have created is not being used. In response to your comments here I've started a branch to try to improve the defaults for the Kaldi path: https://github.com/persephone-tools/persephone/pull/238

Which file is the code that you reference in this issue located in?

Can you try modifying your code to import the configuration-loading module by adding this line at the top of the imports:

from persephone import config

mirfan899 commented 3 years ago

transcribe.py is the file where I use the pitch feature. Nope, importing config does not help.

shuttle1987 commented 3 years ago

It would appear that the settings file is not being found. Can you try editing the config.py file in your site packages to include an absolute path to your configuration file? Knowing what happens in this case will help me determine what the cause of this bug is.

That file will be: /home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/config.py

config_file = configparser.ConfigParser()
config_file.read('settings.ini') # Change this to the absolute path to your settings.ini file
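
One thing worth checking at this step: `ConfigParser.read` does not raise when a file is missing. It skips the file and returns the list of files it actually parsed, and a relative name such as 'settings.ini' is resolved against the current working directory of the process. The sketch below makes both visible. The helper name `load_settings` is hypothetical and not part of Persephone.

```python
import configparser
from pathlib import Path


def load_settings(path="settings.ini"):
    """Read an INI file and return (parser, list_of_files_actually_read)."""
    parser = configparser.ConfigParser()
    # read() skips missing files without raising and returns the ones it parsed.
    found = parser.read(path)
    if not found:
        # A relative path is resolved against the current working directory.
        print(f"settings not found, looked at: {Path(path).resolve()}")
    return parser, found


parser, found = load_settings("settings.ini")
print("files read:", found)
```

An empty list here would mean the file was never loaded, so any values seen later come from fallbacks rather than from settings.ini.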

mirfan899 commented 3 years ago

Okay, after changing the path, it still shows the same error.

shuttle1987 commented 3 years ago

This is a bit of a strange bug... I'm not entirely sure what's causing this. Is the code you are working on open source? If so, I can try to reproduce the bug if you have a link to the source.

mirfan899 commented 3 years ago

Yes, check this https://github.com/mirfan899/Timit_Phoneme_Recognition

mirfan899 commented 3 years ago

Well, this is some strange behavior. After deleting the feat directory, it seems settings.ini is being loaded, but then another error is thrown. I already have the ffmpeg library installed.

/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/bin/python /home/irfan/PycharmProjects/Timit_Phone_Recognition/transcribe.py
WARNING:tensorflow:From /home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/model.py:22: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/model.py:27: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

Unhandled exception
Traceback (most recent call last):
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/transcribe.py", line 7, in <module>
    kids_corpus = corpus.Corpus("fbank_and_pitch", "phonemes", "kids_speech_sample")
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/corpus.py", line 200, in __init__
    self.prepare_feats()
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/corpus.py", line 389, in prepare_feats
    feat_extract.convert_wav(path, mono16k_wav_path)
  File "/home/irfan/PycharmProjects/Timit_Phone_Recognition/.tpr/lib/python3.6/site-packages/persephone/preprocess/feat_extract.py", line 186, in convert_wav
    subprocess.run(args)
  File "/usr/lib/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '"ffmpeg"': '"ffmpeg"'

Process finished with exit code 1

mirfan899 commented 3 years ago

Okay, for now, it seems the Kaldi pitch feature is working after updating the settings.ini file: changing KALDI_PATH to KALDI_ROOT_PATH and deleting the sox and ffmpeg paths.

[PATHS]
KALDI_ROOT_PATH = /home/irfan/kaldi
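
The `'"ffmpeg"'` in the FileNotFoundError above hints at why removing the quoted entries helped: Python's configparser does not strip quotation marks, so a quoted value keeps the quote characters as part of the string. A minimal illustration, using an in-memory string rather than the real settings.ini:

```python
import configparser

parser = configparser.ConfigParser()
# Same style as the original settings.ini, with the value wrapped in quotes.
parser.read_string('[PATHS]\nFFMPEG_PATH = "ffmpeg"\n')

raw = parser["PATHS"]["FFMPEG_PATH"]
print(repr(raw))  # '"ffmpeg"' (the quotes are part of the value)

# Stripping surrounding quotes gives a name that can be passed to subprocess.
cleaned = raw.strip("\"'")
print(repr(cleaned))  # 'ffmpeg'
```

Writing the values without quotes in settings.ini, as in the file above, avoids the problem entirely.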

shuttle1987 commented 3 years ago

Ah yes, I missed this. Currently, to set the root path for Kaldi you must use KALDI_ROOT_PATH and not KALDI_PATH. I see that the docs are not quite accurate here, and I'll fix them over in my PR #238.

@mirfan899 thanks for this issue, I didn't realize the docs were incorrect here.

@oadams Do you think we should also support KALDI_PATH? I propose that if only one is defined, we use the value that was defined. What should we do in the case of a conflict, where both KALDI_ROOT_PATH and KALDI_ROOT are simultaneously defined? I'm thinking that throwing an exception if both are supplied is a good idea.
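
A rough sketch of that proposal, for discussion only. The function name `resolve_kaldi_root` is hypothetical, and the legacy key is assumed to be KALDI_PATH (the thread also mentions KALDI_ROOT); this is not the code in PR #238.

```python
import configparser


def resolve_kaldi_root(parser, default=None):
    """Pick the Kaldi root from either key; refuse to guess if both are set."""
    paths = parser["PATHS"] if parser.has_section("PATHS") else {}
    current = paths.get("KALDI_ROOT_PATH")
    legacy = paths.get("KALDI_PATH")  # assumed legacy key name
    if current is not None and legacy is not None:
        raise ValueError("Set only one of KALDI_ROOT_PATH and KALDI_PATH")
    if current is not None:
        return current
    if legacy is not None:
        return legacy
    return default


parser = configparser.ConfigParser()
parser.read_string("[PATHS]\nKALDI_PATH = /home/irfan/kaldi\n")
print(resolve_kaldi_root(parser))  # /home/irfan/kaldi
```

Raising on a conflict keeps the behavior predictable: the user has to decide which path is correct rather than having one key silently win.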

shuttle1987 commented 3 years ago

I think I have fixed this issue and improved the default for Kaldi paths over in PR #238. I would be keen to merge that in if the problem is fixed.