espressif / esp-sr

How to realize Chinese recognition of ESP-SR on ArduinoIDE (AIS-1602) #100

Open Sail-211010 opened 4 months ago

Sail-211010 commented 4 months ago

If I want Chinese recognition, do I only need to modify the transliteration data? (I mean the speech commands, not the wake words.) I also need to know how this mapping is derived, for example AE1/OW1.

I will try changing the corresponding syllables in the transliteration to see the effect. Could you provide some information on this, and list the conversion results for one or two Chinese characters, so that I can check whether my modified results are correct?

feizi commented 4 months ago

Hi @Sail-211010 , Chinese uses pinyin, which is a separate vocabulary and does not need to be converted into phonemes like English. Please refer to documents and examples.

The default speech commands (for MultiNet7) are defined here. You can modify them directly.
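
As a quick sanity check for the pinyin point above, the host-side C sketch below tests whether a command string has the shape a Chinese command is assumed to have here: lowercase ASCII syllables separated by single spaces, as in "da kai kong tiao" (打开空调). The helper name `looks_like_pinyin_command` is hypothetical and not part of esp-sr, and the accepted format is an assumption based on the command examples in the esp-sr documentation. It does not check that each syllable is valid pinyin.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of esp-sr.
 * Assumption: a Chinese command is written as lowercase ASCII pinyin
 * syllables separated by single spaces, e.g. "da kai kong tiao".
 * Returns false for English phoneme strings such as "AE1 OW1" and for
 * raw Chinese characters (non-ASCII bytes). */
bool looks_like_pinyin_command(const char *cmd)
{
    if (cmd == NULL || *cmd == '\0') {
        return false;
    }

    /* Start as if a separator was just seen, so a leading space is rejected. */
    bool prev_was_space = true;

    for (const char *p = cmd; *p != '\0'; ++p) {
        if (*p == ' ') {
            if (prev_was_space) {
                return false;   /* leading or doubled space */
            }
            prev_was_space = true;
        } else if (*p >= 'a' && *p <= 'z') {
            prev_was_space = false;
        } else {
            return false;       /* digit, uppercase, or non-ASCII byte */
        }
    }

    return !prev_was_space;     /* reject a trailing space */
}
```

Running a command table through a check like this before flashing can catch the case where English phoneme strings or raw Chinese characters are passed to a Chinese model.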

Sail-211010 commented 4 months ago

Sorry for forgetting to reply

Another question: if I want to change the wake word, can I do it without modifying the library files, for example by calling a function?

feizi commented 4 months ago

Yes. First, select multiple wake word models in menuconfig, then load one of them by model name in the code. If you want to switch to another model later, change the model name and reinitialize the afe handle. For example:

    srmodel_list_t *models = esp_srmodel_init("model");
    char *wn_name = NULL;
    char *wn_name_2 = NULL;

    // If you do not know model name, you can filter by wake word name.
    char *alexa_model_name = esp_srmodel_filter(models, ESP_WN_PREFIX, "alexa");
    char *hilexin_model_name = esp_srmodel_filter(models, ESP_WN_PREFIX, "hilexin");

    afe_handle = (esp_afe_sr_iface_t *)&ESP_AFE_SR_HANDLE;
    afe_config_t afe_config = AFE_CONFIG_DEFAULT();
    afe_config.wakenet_model_name = alexa_model_name;  // or other wakenet models

    afe_data = afe_handle->create_from_config(&afe_config);

Sail-211010 commented 4 months ago

Okay, thanks. I'll test it later

Sail-211010 commented 4 months ago

Hello, I ran into a problem when testing Chinese recognition. I replaced the phonemes of the English text with Chinese pinyin, but the device could not recognize Chinese. English is still recognized normally.

Sail-211010 commented 4 months ago

Converting the Chinese characters to pinyin does not get recognized either.

feizi commented 4 months ago

The Chinese model and the English model are separate. Which model did you select?

Sail-211010 commented 4 months ago

I have not changed the model. How do I change it? Which models are currently offered, and what is the focus of each one?

feizi commented 4 months ago

You can select a different MultiNet model with idf.py menuconfig > ESP Speech Recognition. Please refer to the following documents for more details:
https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/benchmark/README.html#multinet
https://docs.espressif.com/projects/esp-sr/en/latest/esp32s3/speech_command_recognition/README.html

Sail-211010 commented 4 months ago

What you describe is done through IDF, but I want to implement Chinese recognition in ArduinoIDE.

feizi commented 4 months ago

I am not familiar with ArduinoIDE, but there are two points to note for correct Chinese recognition:

  1. You must load a Chinese MultiNet model into flash. In esp-idf this is selected through Kconfig.

    • Method 1: I think you need to modify "sdkconfig.h" in ArduinoIDE to make sure CONFIG_SR_MN_CN_MULTINET7_QUANT or CONFIG_SR_MN_CN_MULTINET6_QUANT is defined.
    • Method 2: If Kconfig options cannot be set in ArduinoIDE, you can use pack_model.py to pack any models into a binary file, and then flash that file to the model partition listed in partitions.csv. In fact, method 1 also uses pack_model.py, called from CMakeLists.txt, to pack and flash the models.
  2. Select the Chinese model in the code. If mn_name is not NULL, the Chinese model was loaded correctly.

    char *mn_name = esp_srmodel_filter(models, ESP_MN_PREFIX, ESP_MN_CHINESE);
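
To illustrate what a NULL result from the lookup means, here is a minimal host-side sketch of a name lookup over the models packed into flash. It is a stand-in for illustration, not the esp-sr implementation: `find_model` is a hypothetical function, and it assumes that a model matches when its name contains both keywords and that the prefix and language keywords are short substrings such as "wn", "mn" and "cn". The model name "wn9_hilexin" is only an example value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a model-name lookup, NOT the esp-sr code.
 * Assumption: a model matches when its name contains every non-NULL
 * keyword as a substring. Returns the first matching name, or NULL
 * when no packed model matches. */
const char *find_model(const char *const names[], size_t count,
                       const char *keyword1, const char *keyword2)
{
    for (size_t i = 0; i < count; ++i) {
        if (names[i] == NULL) {
            continue;
        }
        if (keyword1 != NULL && strstr(names[i], keyword1) == NULL) {
            continue;   /* e.g. looking for "mn" in a wake word model */
        }
        if (keyword2 != NULL && strstr(names[i], keyword2) == NULL) {
            continue;   /* e.g. looking for "cn" in an English model */
        }
        return names[i];
    }
    return NULL;        /* nothing suitable was packed into the partition */
}
```

Under these assumptions, a partition that contains only "mn5q8_cn" yields a Chinese command model but no wake word model, so any code path that also requires a wake word model would have nothing to load.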

Sail-211010 commented 4 months ago

Thank you very much. I will try according to your description. If there is any problem, I will trouble you again

Sail-211010 commented 4 months ago

Hello, I need to ask you again: how do I generate the bin file for the models? If I pack only the ./multinet_model/mn5q8_cn model, the device prints "Please select wake words!", and I don't know how to combine the wake word model with it.

Sail-211010 commented 4 months ago

I generated the model file with pack_model.py and replaced srmodels.bin with it. "Please select wake words!" appeared. I think the wakenet_model and the multinet_model need to be packed into one srmodels.bin and then flashed together.