AIWintermuteAI / Speech-to-Intent-Micro

An open-source, easily accessible package for training and deploying Speech-to-Intent models on microcontrollers and SBCs
Apache License 2.0

Accuracy with Fluent Speech Commands dataset and training problems #2

Closed: wwyl2000 closed this issue 1 year ago

wwyl2000 commented 1 year ago

Hi,

After I trained the models, I got the following results:

```
Epoch 00010: val_loss did not improve from 1.00256
722/722 [==============================] - 350s 484ms/step - loss: 1.1604 - intent_output_loss: 0.6473 - slot_output_loss: 0.5131 - intent_output_accuracy: 0.7385 - slot_output_accuracy: 0.8250 - val_loss: 1.0248 - val_intent_output_loss: 0.5918 - val_slot_output_loss: 0.4330 - val_intent_output_accuracy: 0.7761 - val_slot_output_accuracy: 0.8586 - lr: 0.0010
```

There seems to be some distance between these results and the ones you reported:

> The reference plain Convolutional 2D model trained on the FLUENT Speech Commands dataset achieves 87.5 % intent and slot accuracy

Could you please tell me how I can get results similar to yours?

Also, at the end of training, I ran into the following problem:

```
{'sampling_rate': 16000, 'min_freq': 100, 'max_freq': 8000, 'win_size_ms': 0.02, 'win_increase_ms': 0.02, 'num_cepstral': 10}
2023-09-29 21:05:17.843966: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmpqgc5w035/assets
2023-09-29 21:05:20.793077: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:363] Ignored output_format.
2023-09-29 21:05:20.793106: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:366] Ignored drop_control_dependency.
2023-09-29 21:05:20.794313: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/tmpqgc5w035
2023-09-29 21:05:20.800583: I tensorflow/cc/saved_model/reader.cc:107] Reading meta graph with tags { serve }
2023-09-29 21:05:20.800610: I tensorflow/cc/saved_model/reader.cc:148] Reading SavedModel debug info (if present) from: /tmp/tmpqgc5w035
2023-09-29 21:05:20.822849: I tensorflow/cc/saved_model/loader.cc:228] Restoring SavedModel bundle.
2023-09-29 21:05:20.936250: I tensorflow/cc/saved_model/loader.cc:212] Running initialization op on SavedModel bundle at path: /tmp/tmpqgc5w035
2023-09-29 21:05:20.974319: I tensorflow/cc/saved_model/loader.cc:301] SavedModel load for tags { serve }; Status: success: OK. Took 180013 microseconds.
2023-09-29 21:05:21.047289: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:237] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2023-09-29 21:05:21.169845: I tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1962] Estimated count of arithmetic ops: 5.647 M ops, equivalently 2.823 M MACs
```

```
Traceback (most recent call last):
  File "./train.py", line 233, in <module>
    main(args)
  File "./train.py", line 145, in main
    tflite_filename = tflite_convert(model, model_name, calibration_generator)
  File "/home/xyz/Speech-to-Intent-Micro/models.py", line 202, in tflite_convert
    tflite_quant_model = converter.convert()
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 775, in wrapper
    return self._convert_and_export_metrics(convert_func, *args, **kwargs)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 761, in _convert_and_export_metrics
    result = convert_func(self, *args, **kwargs)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 1170, in convert
    saved_model_convert_result = self._convert_as_saved_model()
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 1153, in _convert_as_saved_model
    self).convert(graph_def, input_tensors, output_tensors)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 952, in convert
    result, self._quant_mode, quant_io=self.experimental_new_quantizer)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/convert_phase.py", line 226, in wrapper
    raise error from None  # Re-throws the exception.
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/convert_phase.py", line 216, in wrapper
    return func(*args, **kwargs)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 722, in _optimize_tflite_model
    model, q_in_type, q_out_type, q_activations_type, q_allow_float)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 530, in _quantize
    self.representative_dataset.input_gen)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/convert_phase.py", line 226, in wrapper
    raise error from None  # Re-throws the exception.
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/convert_phase.py", line 216, in wrapper
    return func(*args, **kwargs)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 228, in calibrate
    self._feed_tensors(dataset_gen, resize_input=True)
  File "/home/my_env/lib/python3.7/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 97, in _feed_tensors
    for sample in dataset_gen():
  File "/home/xyz/Speech-to-Intent-Micro/models.py", line 189, in representative_dataset
    yield [X.astype(np.float32)]
AttributeError: 'tuple' object has no attribute 'astype'
```

Could you please help with this issue?

Thanks,
Willy

AIWintermuteAI commented 1 year ago

Hi! I see that the default number of training epochs in the Python script is 10; however, if you look at the notebook (from which the accuracy numbers were quoted), the models there are trained for 30 epochs: https://github.com/AIWintermuteAI/Speech-to-Intent-Micro/blob/main/jupyter_notebooks/speech_to_intent_tf_keras_edited.ipynb

When running this project, consider the Jupyter Notebook the source of truth, since I did most of my experiments in it; the Python script was created later for convenience. Let me know if this solves the issue!
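For reference, the relevant knob is the `epochs` argument of `model.fit`. A minimal sketch of the training call, assuming the notebook's setup (the generator and callback names here are illustrative placeholders, not the exact ones from the notebook):

```python
# Illustrative sketch only; train_generator, val_generator and callbacks
# are placeholders for the objects built earlier in the notebook.
# The substantive difference from train.py's default is epochs=30.
history = model.fit(
    train_generator,
    validation_data=val_generator,
    epochs=30,  # the notebook trains for 30 epochs; train.py defaults to 10
    callbacks=callbacks,
)
```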

wwyl2000 commented 1 year ago

Hi,

Thank you very much for your reply!

Question 1: number of epochs and different results. I have been using train.py on a Linux server, where I need to use bsub to submit my jobs. Sorry, I am not familiar with Jupyter Notebook. Can I use it on the Linux server and submit the job using bsub?

I actually changed the number of epochs to 20 and obtained the following results:

```
Epoch 20/20
721/722 [============================>.] - ETA: 0s - loss: 0.8040 - intent_output_loss: 0.4505 - slot_output_loss: 0.3535 - intent_output_accuracy: 0.8282 - slot_output_accuracy: 0.8811
Epoch 00020: val_loss improved from 0.71828 to 0.68566, saving model to checkpoints/2023-09-30_01-58-05/slu_model.h5
722/722 [==============================] - 380s 525ms/step - loss: 0.8041 - intent_output_loss: 0.4507 - slot_output_loss: 0.3534 - intent_output_accuracy: 0.8280 - slot_output_accuracy: 0.8812 - val_loss: 0.6857 - val_intent_output_loss: 0.3991 - val_slot_output_loss: 0.2865 - val_intent_output_accuracy: 0.8566 - val_slot_output_accuracy: 0.9101 - lr: 0.0010
---------------------------------------Completed fit!
{'sampling_rate': 16000, 'min_freq': 100, 'max_freq': 8000, 'win_size_ms': 0.02, 'win_increase_ms': 0.02, 'num_cepstral': 10}
```

I will try setting the number of epochs to 30 and training the models again.

Question 2: error at the end of training. In my previous message, I reported another issue:

```
File "/home/xyz/Speech-to-Intent-Micro/models.py", line 189, in representative_dataset
    yield [X.astype(np.float32)]
AttributeError: 'tuple' object has no attribute 'astype'
```

Question 3: intent accuracy and slot accuracy. I saw that your package reports slot and intent accuracies separately. Generally, when we talk about intent recognition accuracy, should we consider both intent and slot? Should the two accuracy numbers be combined?

Thanks again for your help!

Willy

AIWintermuteAI commented 1 year ago

Questions 1 and 3: I see you are getting `val_intent_output_accuracy: 0.8566 - val_slot_output_accuracy: 0.9101`, which is pretty close to what I achieved ((0.86 + 0.91) / 2 = 0.885, or 88.5 %). In the test cell of the Jupyter Notebook I do calculate a combined accuracy; you can check it for reference, but it is simply the average of the two, just as I calculated manually above.

Question 2: Yes, there was indeed a small bug; this is not really production-tested code. Please pull the latest main branch, where it is fixed.
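The gist of the bug: the representative dataset generator was yielding tuples straight from the calibration generator, while the TFLite calibrator expects bare float32 input arrays. A minimal sketch of that kind of fix, assuming the calibration generator yields (features, labels) pairs (an illustration, not necessarily the exact code in the commit):

```python
import numpy as np

def make_representative_dataset(calibration_generator):
    """Wrap a (features, labels) generator for TFLite quantization calibration."""
    def representative_dataset():
        for X, _ in calibration_generator:
            # Calibration only needs the input features, cast to float32;
            # the labels are discarded.
            yield [X.astype(np.float32)]
    return representative_dataset

# Usage sketch, before calling converter.convert():
# converter.representative_dataset = make_representative_dataset(calibration_generator)
```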

AIWintermuteAI commented 1 year ago

I'm closing the issue as resolved.