RasaHQ / rasa

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
https://rasa.com/docs/rasa/
Apache License 2.0
18.59k stars 4.59k forks

Rasa 2.0.0rc2 did not report in UTF-8 format #6844

Closed duongkstn closed 2 years ago

duongkstn commented 3 years ago

Rasa version: 2.0.0rc2

Rasa SDK version (if used & relevant):

Rasa X version (if used & relevant):

Python version: 3.7.9

Operating system (windows, osx, ...): Ubuntu 18.04 LTS

Issue: Rasa 2.0.0 does not write its test reports as readable UTF-8: Vietnamese characters come out as \uXXXX escape sequences. This did not happen in Rasa 1.10.0.

Error (including full traceback): File intent_report.json returned (rasa 2.0.0):

    "Xin_ch\u00e0o": {
        "precision": 0.92,
        "recall": 0.9583333333333334,
        "f1-score": 0.9387755102040817,
        "support": 24,
        "confused_with": {
            "C\u1ea3m_\u01a1n_KH": 1
        }
    },

I expect "Xin_ch\u00e0o" to be written as "Xin_chào" and "C\u1ea3m_\u01a1n_KH" as "Cảm_ơn_KH" in Vietnamese. The same error appeared in intent_errors.json too 👎

Command or request that led to error:

rasa test nlu

Content of configuration file (config.yml) (if relevant):

Content of domain file (domain.yml) (if relevant):

sara-tagger commented 3 years ago

Thanks for the issue, @degiz will get back to you about it soon!

You may find help in the docs and the forum, too 🤗
degiz commented 3 years ago

Hey @duongkstn

Could you please attach the original file with NLU data you're using?

duongkstn commented 3 years ago

Hi @degiz, you can reproduce the bug by using these simple lines as both training_data.md and test.md:

    ## intent:Cảm_ơn
    ## intent:Giờ_làm_việc

And this one as config.yml file:

    language: vi
    pipeline:

Rasa 2.0.0rc2 writes the following intent_report.json:

    {
        "C\u1ea3m_\u01a1n": {
            "precision": 1.0,
            "recall": 1.0,
            "f1-score": 1.0,
            "support": 3,
            "confused_with": {}
        },
        "Gi\u1edd_l\u00e0m_vi\u1ec7c": {
            "precision": 1.0,
            "recall": 1.0,
            "f1-score": 1.0,
            "support": 1,
            "confused_with": {}
        },
        "accuracy": 1.0,
        "macro avg": {
            "precision": 1.0,
            "recall": 1.0,
            "f1-score": 1.0,
            "support": 4
        },
        "weighted avg": {
            "precision": 1.0,
            "recall": 1.0,
            "f1-score": 1.0,
            "support": 4
        }
    }
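For context, the escaped keys in this report match what Python's standard json module produces by default: json.dumps escapes every non-ASCII character unless ensure_ascii=False is passed. Whether Rasa 2.0.0rc2 serializes its reports this way is an assumption here, not something confirmed in this thread. The sketch below only shows the two behaviours side by side, using a made-up report dictionary:

```python
import json

# Made-up stand-in for one entry of intent_report.json (not real Rasa output).
report = {"Cảm_ơn": {"precision": 1.0, "support": 3}}

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(report, indent=4)

# With ensure_ascii=False the characters are kept as-is; the result must
# then be written to disk with an explicit UTF-8 encoding.
readable = json.dumps(report, indent=4, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to exactly the same data, so the escaped file is
# still valid JSON; it is just hard for a human to read.
assert json.loads(escaped) == json.loads(readable) == report
```

Either form loads back to the same intent names, so tools that parse the report are unaffected; only reading the raw file is harder.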

duongkstn commented 3 years ago

hey @degiz

degiz commented 3 years ago

I can reproduce that @duongkstn

Thanks a lot for finding that! Would you consider fixing this issue? Otherwise I'd add it to the backlog and we'll eventually fix it.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

longnguyenQB commented 3 years ago

I am also getting this error, do you have a fix?
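No fix is posted in this thread. One possible workaround, sketched here under assumptions, is to re-serialize the report files after `rasa test nlu` has finished. The helper name rewrite_unescaped and the results directory are hypothetical; adjust the path to wherever your reports were written. This only post-processes the files and does not change Rasa itself:

```python
import json
from pathlib import Path


def rewrite_unescaped(path):
    """Re-serialize a JSON file so non-ASCII characters are written literally."""
    path = Path(path)
    data = json.loads(path.read_text(encoding="utf-8"))
    # ensure_ascii=False keeps characters such as "ả" instead of "\u1ea3".
    path.write_text(
        json.dumps(data, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return data


if __name__ == "__main__":
    # Assumed output directory; change it if your reports live elsewhere.
    for name in ("intent_report.json", "intent_errors.json"):
        report_path = Path("results") / name
        if report_path.exists():
            rewrite_unescaped(report_path)
```

The rewritten files contain the same data as before, so anything that already parses them should keep working.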

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.