Closed: tumusudheer closed this issue 4 years ago
For support and discussions, please use our Discourse forums.
If you've found a bug, or have a feature request, then please create an issue with the following information:
* **Have I written custom code (as opposed to running examples on an unmodified clone of the repository)**: NO
* **OS Platform and Distribution (e.g., Linux Ubuntu 16.04)**: Ubuntu 18.04
* **TensorFlow installed from (our builds, or upstream TensorFlow)**: Upstream TensorFlow r1.15.3 (with GPU)
* **TensorFlow version (use command below)**: TensorFlow r1.15.3 (with GPU)
* **Python version**: Python 3
* **Bazel version (if compiling from source)**: TensorFlow compiled from source with Bazel 0.26.1
* **GCC/Compiler version (if compiling from source)**: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
* **CUDA/cuDNN version**: CUDA 10.2 / cuDNN v7.6.5
* **GPU model and memory**: NVIDIA 1080 Ti (11 GB)
* **Exact command to reproduce**:
```
python3 DeepSpeech.py \
  --n_hidden 2048 \
  --drop_source_layers 1 \
  --alphabet_config_path data/new_alphabet.txt \
  --save_checkpoint_dir /data/Self/test/DeepSpeech/train_3/ \
  --load_checkpoint_dir /data/Self/test/DeepSpeech/checkpoint/ \
  --train_files data/clips/train.csv \
  --dev_files data/clips/dev.csv \
  --test_files data/clips/test.csv \
  --learning_rate 0.000005 \
  --use_allow_growth true \
  --train_cudnn \
  --epochs 20 \
  --export_dir /data/Self/test/DeepSpeech/train_3/ \
  --summary_dir /data/Self/test/DeepSpeech/train_3/summary \
  --train_batch_size 32 \
  --dev_batch_size 32 \
  --test_batch_size 32 \
  --export_batch_size 1 \
  --dropout_rate=0.30
```
I'm using DeepSpeech version 0.7.3.
I want to train a DeepSpeech model with my own domain-specific data, so I want to add new characters such as digits, the double quote (") and the period (.) to the existing DeepSpeech alphabet given here.
As a first step, before using my own data, I wanted to do transfer learning with the Common Voice data and the new alphabet, starting from the existing published checkpoint given here. The intention is to use this new checkpoint to start training with my own data (since it has the new alphabet characters), rather than the DeepSpeech published checkpoint (since it has a limited alphabet/vocabulary).
My new alphabet file is as follows:
```
# Each line in this file represents the Unicode codepoint (UTF-8 encoded)
# associated with a numeric label.
# A line that starts with # is a comment. You can escape it with \# if you wish
# to use '#' as a label.
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
'
0
1
2
3
4
5
6
7
8
9
.
"
# The last (non-comment) line needs to end with a newline.
```
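(Editor's note, not from the original post: as a quick sanity check, one way to confirm that every character in the imported transcripts is covered by the new alphabet file is a small shell comparison. This is a sketch that assumes the DeepSpeech CSV layout `wav_filename,wav_filesize,transcript` produced by `bin/import_cv2.py`; the paths are examples.)

```bash
# Hypothetical sanity check: list characters that appear in the training
# transcripts but are missing from new_alphabet.txt. Assumes the CSV layout
# wav_filename,wav_filesize,transcript; `cut -f3-` keeps any commas that
# occur inside the transcript field itself.
tail -n +2 data/clips/train.csv | cut -d',' -f3- \
  | grep -o . | sort -u > /tmp/used_chars.txt
grep -v '^#' data/new_alphabet.txt | sort -u > /tmp/alphabet_chars.txt
# Anything printed here is a character the training data uses but the
# alphabet file does not declare:
comm -23 /tmp/used_chars.txt /tmp/alphabet_chars.txt
```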
When I ran the command above, my training loss started from a low value and kept increasing while each epoch was in progress.
Output from the terminal is as follows:
```
Epoch 0 | Training | Elapsed Time: 0:48:14 | Steps: 7186 | Loss: 51.725979
Epoch 0 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 44.295110 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 44.295110 to: /data/Self/test/DeepSpeech/train_3/best_dev-739708
--------------------------------------------------------------------------------
Epoch 1 | Training | Elapsed Time: 0:48:03 | Steps: 7186 | Loss: 33.749292
Epoch 1 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 41.206871 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 41.206871 to: /data/Self/test/DeepSpeech/train_3/best_dev-746894
--------------------------------------------------------------------------------
Epoch 2 | Training | Elapsed Time: 0:48:01 | Steps: 7186 | Loss: 31.213234
Epoch 2 | Validation | Elapsed Time: 0:01:24 | Steps: 475 | Loss: 39.810643 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 39.810643 to: /data/Self/test/DeepSpeech/train_3/best_dev-754080
--------------------------------------------------------------------------------
Epoch 3 | Training | Elapsed Time: 0:48:03 | Steps: 7186 | Loss: 29.791398
Epoch 3 | Validation | Elapsed Time: 0:01:24 | Steps: 475 | Loss: 39.136365 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 39.136365 to: /data/Self/test/DeepSpeech/train_3/best_dev-761266
--------------------------------------------------------------------------------
Epoch 4 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 28.845716
Epoch 4 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 38.489472 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 38.489472 to: /data/Self/test/DeepSpeech/train_3/best_dev-768452
--------------------------------------------------------------------------------
Epoch 5 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 28.051135
Epoch 5 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 37.851685 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 37.851685 to: /data/Self/test/DeepSpeech/train_3/best_dev-775638
--------------------------------------------------------------------------------
Epoch 6 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 27.403971
Epoch 6 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 37.467827 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 37.467827 to: /data/Self/test/DeepSpeech/train_3/best_dev-782824
--------------------------------------------------------------------------------
Epoch 7 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 26.854938
Epoch 7 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 37.366411 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 37.366411 to: /data/Self/test/DeepSpeech/train_3/best_dev-790010
--------------------------------------------------------------------------------
Epoch 8 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 26.361723
Epoch 8 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 37.046090 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 37.046090 to: /data/Self/test/DeepSpeech/train_3/best_dev-797196
--------------------------------------------------------------------------------
Epoch 9 | Training | Elapsed Time: 0:48:02 | Steps: 7186 | Loss: 25.926279
Epoch 9 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 36.745385 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 36.745385 to: /data/Self/test/DeepSpeech/train_3/best_dev-804382
--------------------------------------------------------------------------------
Epoch 10 | Training | Elapsed Time: 0:48:08 | Steps: 7186 | Loss: 25.514988
Epoch 10 | Validation | Elapsed Time: 0:01:25 | Steps: 475 | Loss: 36.442261 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 36.442261 to: /data/Self/test/DeepSpeech/train_3/best_dev-811568
--------------------------------------------------------------------------------
Epoch 11 | Training | Elapsed Time: 0:48:07 | Steps: 7186 | Loss: 25.151161
Epoch 11 | Validation | Elapsed Time: 0:01:24 | Steps: 475 | Loss: 36.470721 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
--------------------------------------------------------------------------------
Epoch 12 | Training | Elapsed Time: 0:48:01 | Steps: 7186 | Loss: 24.817111
Epoch 12 | Validation | Elapsed Time: 0:01:24 | Steps: 475 | Loss: 36.185578 | Dataset: /home/tumu/Self/Research/Work/tensorflow_work/models/try/rnnt-speech-recognition/data/clips/dev.csv
I Saved new best validating model with loss 36.185578 to: /data/Self/test/DeepSpeech/train_3/best_dev-825940
```
I'm not getting good results with my new checkpoints (I did the same exercise for 3 epochs and verified some results with Common Voice audio files from train.tsv), so it seems I'm doing something wrong.
1. When I add new alphabet characters, do I need to retrain the scorer as well? My new alphabet is mostly a-z, 0-9, and a few punctuation marks such as the period (.), double quote ("), comma (,), and semicolon (;). (A hedged sketch of rebuilding the scorer follows this list.)
2. Do I need to change my training parameters, as maybe I'm doing something wrong?
3. I followed the procedure mentioned [here](https://deepspeech.readthedocs.io/en/v0.7.3/TRAINING.html#transfer-learning-new-alphabet). I also prepared the data using the command `bin/import_cv2.py --filter_alphabet data/new_alphabet.txt <path_to_common_speech_tsv_files>`.
4. Do I need to perform any additional steps, or drop more layers, to train the model with the new alphabet (new vocabulary)? Also, I would like to start from the DeepSpeech-provided checkpoint rather than from scratch.
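(Editor's note on question 1, not from the original post: the scorer package is generated against a specific alphabet, so adding characters generally means rebuilding it. Below is a minimal sketch of what that rebuild might look like with the v0.7.3 `data/lm` scripts; `vocab_corpus.txt` (a plain-text corpus containing the new symbols), the output paths, and the alpha/beta values are assumptions to verify against your checkout and the release docs.)

```bash
# Hedged sketch: rebuild the external scorer for the new alphabet using the
# v0.7.3 data/lm scripts. vocab_corpus.txt and all paths are placeholders.
python3 data/lm/generate_lm.py \
  --input_txt vocab_corpus.txt \
  --output_dir lm_out/ \
  --top_k 500000 \
  --kenlm_bins /path/to/kenlm/build/bin/ \
  --arpa_order 5 \
  --max_arpa_memory "85%" \
  --arpa_prune "0|0|1" \
  --binary_a_bits 255 \
  --binary_q_bits 8 \
  --binary_type trie

# Package the binary LM together with the *new* alphabet. The alpha/beta
# values are rough placeholders, not tuned defaults.
python3 data/lm/generate_package.py \
  --alphabet data/new_alphabet.txt \
  --lm lm_out/lm.binary \
  --vocab lm_out/vocab-500000.txt \
  --package new_kenlm.scorer \
  --default_alpha 0.93 \
  --default_beta 1.18
```

If the flag names differ in your checkout, `python3 data/lm/generate_lm.py --help` lists the accepted arguments, and the repository's `lm_optimizer.py` can be used to find better alpha/beta values for the new scorer.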
Please let me know how to proceed with training a model with a new alphabet. Thank you.
This is obviously not a bug but a support request. Please use Discourse, as explained in the issue template.
Hi, I just tried to post in the Discourse support channel, but my account was put on hold.
What do you mean, "on hold"?
This is the message I got when I tried to sign up on the discussion board:
> **Account temporarily on hold**
>
> Hello,
>
> This is an automated message from Mozilla Discourse to let you know that your account has been temporarily placed on hold as a precautionary measure.
>
> Please do continue to browse, but you won’t be able to reply or create topics until a staff member reviews your most recent posts. We apologize for the inconvenience.
>
> For additional guidance, refer to our community guidelines.
My username there: tumusudheer
Well, this is the antispam feature; please be patient.
Sure. I just received the following message:
> Hey there. We see you’ve been busy reading, which is fantastic, so we’ve promoted you up a trust level!
>
> We’re really glad you’re spending time with us and we’d love to know more about you. Take a moment to fill out your profile, or feel free to start a new topic.
Then I tried to submit this question as a new topic (via the "+ Create a New Topic" button), but when submitting I got the following error:
> You are not permitted to view the requested resource.
Should be fine now; it looks like a big copy/pasted post triggers the spam protection. Thanks for your patience.
Thanks @lissyx, I posted my question here. Thank you.