st-tomic opened this issue 4 years ago (status: Open)
Hey! Super weird. Could you provide more details?
I agree :)
The only changes are the ones stated above. I used the pipeline from basic.py and from the README page. I have also tried it with TF v2.0-GPU, and there the loss also decreases extremely fast within the first epoch, which doesn't seem real.
I only modified the audio-loading part to use `soundfile`, and added `future-fstrings` to support f-strings on Python 3.5. Other than that, nothing is changed from your repo.
In the last run, the loss froze at -0.6921.
The data is loaded from LibriSpeech:

```python
dataset = asr.dataset.Audio.from_csv('examples/libri-100.csv', batch_size=10)
dev_dataset = asr.dataset.Audio.from_csv('examples/dev-clean.csv', batch_size=10)
test_dataset = asr.dataset.Audio.from_csv('examples/test-clean.csv')
```
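As a sanity check for the empty-prediction symptom reported later in this thread, it can help to verify that no transcript in the training CSV is empty before fitting. This is a minimal sketch; the two-column `path,transcript` layout is my assumption about the CSV format, not confirmed by the library:

```python
import csv
import io

def find_empty_transcripts(csv_text, transcript_column='transcript'):
    """Return the CSV row numbers whose transcript cell is missing or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad_rows = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        transcript = (row.get(transcript_column) or '').strip()
        if not transcript:
            bad_rows.append(row_number)
    return bad_rows

# Hypothetical two-row sample mimicking examples/libri-100.csv;
# the second data row has a blank transcript and should be flagged.
sample = "path,transcript\na.flac,hello world\nb.flac,\n"
print(find_empty_transcripts(sample))
```

Training on rows with blank transcripts teaches the model to emit empty strings, which matches the behaviour described below.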
I have the same problem. I tried training on my own dataset.
```
475/476 [============================>.] - ETA: 1s - loss: 2.6582
476/476 [==============================] - 802s 2s/step - loss: 2.6511 - val_loss: -0.6931
Epoch 2/5
119/476 [======>.......................] - ETA: 5:47 - loss: -0.6931
```
After the first epoch, val_loss was negative.
The second epoch also had a negative loss; the loss does not change and stays at -0.6931.
I used your environment-gpu.yml to create the conda environment.
My English is not great, but I did my best.
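A side observation of mine, not stated in the thread: a correctly normalized negative log-likelihood such as CTC loss cannot go below zero, and the frozen value -0.6931 is, to four decimals, exactly -ln 2. A constant of that shape usually points to degenerate targets or a normalization bug rather than a genuine training plateau:

```python
import math

# Value at which the training loss is reported to freeze in this thread.
frozen_loss = -0.6931

# ln 2 = 0.6931..., so the frozen loss is numerically -ln 2.
assert abs(frozen_loss + math.log(2)) < 1e-4
print(math.log(2))
```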
Hi, I had the same problem. For me, it turned out that the `pipeline.fit()` method feeds the model an empty string instead of the correct transcript, so the model learns to predict it. I used the following code and it works:
```python
dataset = pipeline.wrap_preprocess(dataset, False, None)
y = tf.keras.layers.Input(name='y', shape=[None], dtype='int32')
loss = pipeline.get_loss()
pipeline._model.compile(pipeline._optimizer, loss, target_tensors=[y])
pipeline._model.fit(dataset, epochs=20)
```
I guess there has been no improvement in this regard, because the system still produces negative values.
I had a similar issue even when I just ran the example (basic.py) on TF v2.1. I think a negative loss value might be acceptable in itself: https://github.com/keras-team/keras/issues/9369

But when I used predict on test.csv (the same file used for training), the output was empty: `['']`. That doesn't look right. The code in basic.py: `pipeline.predict(data)`.
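An empty prediction is exactly what CTC greedy (best-path) decoding produces when the network emits the blank token at every frame. Here is a minimal sketch of that collapse in pure Python; the blank index and the toy alphabet are my illustrative assumptions, not the library's actual vocabulary:

```python
def ctc_greedy_decode(frame_argmax, alphabet, blank=0):
    """Collapse repeated indices, then drop blanks (standard CTC best-path decoding)."""
    out = []
    prev = None
    for idx in frame_argmax:
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return ''.join(out)

alphabet = ['<blank>', 'a', 'b']

# A healthy model: frames vote for real characters.
print(ctc_greedy_decode([1, 1, 0, 2], alphabet))  # -> 'ab'

# A collapsed model: blank wins every frame, so the transcript is ''.
print(ctc_greedy_decode([0, 0, 0, 0], alphabet))  # -> ''
```

If training targets are empty strings, always emitting blank is the optimal policy, so `predict` returning `['']` is consistent with the empty-transcript explanation above.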
```
Epoch 1/5
1/1 [==============================] - 3s 3s/step - loss: 303.5161
1/1 [==============================] - 19s 19s/step - loss: 610.6132 - val_loss: 303.5161
Epoch 2/5
1/1 [==============================] - 1s 1s/step - loss: 61.1079
1/1 [==============================] - 7s 7s/step - loss: 76.0996 - val_loss: 61.1079
Epoch 3/5
1/1 [==============================] - 1s 1s/step - loss: 9.9619
1/1 [==============================] - 7s 7s/step - loss: 4.4410 - val_loss: 9.9619
Epoch 4/5
1/1 [==============================] - 1s 1s/step - loss: 2.4088
1/1 [==============================] - 7s 7s/step - loss: 0.6229 - val_loss: 2.4088
Epoch 5/5
1/1 [==============================] - 1s 1s/step - loss: 0.5115
1/1 [==============================] - 7s 7s/step - loss: -0.3944 - val_loss: 0.5115
```
Hmm. I think the code in automatic_speech_recognition/pipeline/ctc_pipeline.py should be:

```python
dev_dataset = self.wrap_preprocess(dev_dataset, prepared_features, augmentation)
```

Right? Let me know if I am wrong. Thanks.
Hi,
I am trying to run the sample training on LibriSpeech clean 100 h. After a few hours of training with `batch_size=10`, the printed loss value becomes negative. It happens in the first epoch.

The only thing I changed is the `read_audio` function, to use `soundfile` for reading FLAC files instead of WAVs with `wavfile.read`. Both give the same output when reading files, so it shouldn't make a difference.

Are you familiar with this issue? The loss seems to decrease too fast. Any guess what is going wrong?