PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

[TTS] C++ frontend crash test cases #3035

Open SwimmingTiger opened 1 year ago

SwimmingTiger commented 1 year ago

Recording some sample texts that crash the C++ frontend:

  1. Contains Latin letters: assertion failure.
./run_front_demo.sh --sentence 'TTS语音合成服务'
tts_front_demo: /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:358: int ppspeech::FrontEngineInterface::GetInitialsFinals(const string&, std::vector<std::__cxx11::basic_string<char> >&, std::vector<std::__cxx11::basic_string<char> >&): Assertion `word_finals.size() == ppspeech::utf8string2wstring(word).length() && word_finals.size() == word_initials.size()' failed.
  2. Letter at the end: out-of-bounds memory access.
./run_front_demo.sh --sentence '语音合成服务a'

# Debugging
gdb ./build/tts_front_demo -ex "set args --sentence '语音合成服务a'"
Program received signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  __memcpy_generic () at ../sysdeps/aarch64/multiarch/../memcpy.S:100
#1  0x0000007ff7fa8db0 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) () from /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/third-party/build/lib/libgflags.so.2.2
#2  0x000000555558c500 in __gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (this=0x7fffffeb18, __p=0x555ac91430) at /usr/include/c++/9/ext/new_allocator.h:146
#3  0x0000005555584ecc in std::allocator_traits<std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (__a=..., __p=0x555ac91430) at /usr/include/c++/9/bits/alloc_traits.h:483
#4  0x000000555557f010 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::push_back (this=0x7fffffeb18, 
    __x=<error: Cannot access memory at address 0xff1f0000ff1b>) at /usr/include/c++/9/bits/stl_vector.h:1189
#5  0x000000555556dd48 in ppspeech::FrontEngineInterface::GetInitialsFinals (this=0x5555638350, word="a", word_initials=std::vector of length 1, capacity 1 = {...}, word_finals=std::vector of length 0, capacity 2)
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:349
#6  0x000000555556df58 in ppspeech::FrontEngineInterface::GetFinals (this=0x5555638350, word="a", word_finals=std::vector of length 0, capacity 2) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:366
#7  0x000000555556ef30 in ppspeech::FrontEngineInterface::MergeThreeTones (this=0x5555638350, seg_result=std::vector of length 4, capacity 4 = {...}) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:512
#8  0x000000555556ff34 in ppspeech::FrontEngineInterface::MergeforModify (this=0x5555638350, seg_word_type=std::vector of length 4, capacity 4 = {...}, modify_seg_word_type=std::vector of length 4, capacity 4 = {...})
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:634
#9  0x000000555556d4f8 in ppspeech::FrontEngineInterface::Cut (this=0x5555638350, sentence="语音合成服务a", cut_result=std::vector of length 4, capacity 4 = {...}) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:269
#10 0x000000555556cc70 in ppspeech::FrontEngineInterface::GetSentenceIds (this=0x5555638350, sentence="语音合成服务a", phoneids=std::vector of length 0, capacity 0, toneids=std::vector of length 0, capacity 0)
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:183
#11 0x00000055555603b4 in main (argc=1, argv=0x7ffffff288) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/front_demo/front_demo.cpp:53
  3. Contains a space: out-of-bounds memory access.
./run_front_demo.sh --sentence '语音合成 服务'
gdb ./build/tts_front_demo -ex "set args --sentence '语音合成 服务'"
Program received signal SIGSEGV, Segmentation fault.
__memcpy_generic () at ../sysdeps/aarch64/multiarch/../memcpy.S:100
(gdb) bt
#0  __memcpy_generic () at ../sysdeps/aarch64/multiarch/../memcpy.S:100
#1  0x0000007ff7fa8db0 in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) () from /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/third-party/build/lib/libgflags.so.2.2
#2  0x000000555558c500 in __gnu_cxx::new_allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (this=0x7fffffeb18, __p=0x555ac91430) at /usr/include/c++/9/ext/new_allocator.h:146
#3  0x0000005555584ecc in std::allocator_traits<std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::construct<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&> (__a=..., __p=0x555ac91430) at /usr/include/c++/9/bits/alloc_traits.h:483
#4  0x000000555557f010 in std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::push_back (this=0x7fffffeb18, 
    __x=<error: Cannot access memory at address 0xff1f0000ff1b>) at /usr/include/c++/9/bits/stl_vector.h:1189
#5  0x000000555556dd48 in ppspeech::FrontEngineInterface::GetInitialsFinals (this=0x5555638350, word=" ", word_initials=std::vector of length 1, capacity 1 = {...}, word_finals=std::vector of length 0, capacity 2)
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:349
#6  0x000000555556df58 in ppspeech::FrontEngineInterface::GetFinals (this=0x5555638350, word=" ", word_finals=std::vector of length 0, capacity 2) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:366
#7  0x000000555556ef30 in ppspeech::FrontEngineInterface::MergeThreeTones (this=0x5555638350, seg_result=std::vector of length 4, capacity 4 = {...}) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:512
#8  0x000000555556ff34 in ppspeech::FrontEngineInterface::MergeforModify (this=0x5555638350, seg_word_type=std::vector of length 4, capacity 4 = {...}, modify_seg_word_type=std::vector of length 4, capacity 4 = {...})
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:634
#9  0x000000555556d4f8 in ppspeech::FrontEngineInterface::Cut (this=0x5555638350, sentence="语音合成 服务", cut_result=std::vector of length 4, capacity 4 = {...}) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:269
#10 0x000000555556cc70 in ppspeech::FrontEngineInterface::GetSentenceIds (this=0x5555638350, sentence="语音合成 服务", phoneids=std::vector of length 0, capacity 0, toneids=std::vector of length 0, capacity 0)
    at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/src/front/front_interface.cpp:183
#11 0x00000055555603b4 in main (argc=1, argv=0x7ffffff288) at /home/firefly/work/tts/PaddleSpeech/demos/TTSCppFrontend/front_demo/front_demo.cpp:53
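
In both backtraces above, GetInitialsFinals is reached with a non-Chinese token (word="a" and word=" ") through MergeThreeTones and GetFinals. A possible workaround, not verified against the project source, is to split the sentence into Han and non-Han runs before it reaches the frontend and pass on only the Han runs. The sketch below is standalone C++. Run, DecodeOne, IsHan and SplitHanRuns are hypothetical names, not PaddleSpeech APIs. It assumes valid UTF-8 input and treats only the CJK Unified Ideographs block and Extension A as Han.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A maximal run of consecutive code points that are all Han or all non-Han.
struct Run {
    std::string text;  // UTF-8 bytes of the run
    bool is_han;       // true if every code point in the run is a Han ideograph
};

// Decode one code point starting at s[i] and advance i past it.
// Assumption: input is valid UTF-8. Continuation bytes are not validated;
// a bad lead byte or a truncated sequence yields U+FFFD.
char32_t DecodeOne(const std::string& s, std::size_t& i) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    char32_t cp = 0;
    if (c < 0x80)              { len = 1; cp = c; }
    else if ((c >> 5) == 0x06) { len = 2; cp = c & 0x1F; }
    else if ((c >> 4) == 0x0E) { len = 3; cp = c & 0x0F; }
    else if ((c >> 3) == 0x1E) { len = 4; cp = c & 0x07; }
    else { ++i; return 0xFFFD; }
    if (i + len > s.size()) { i = s.size(); return 0xFFFD; }
    for (std::size_t k = 1; k < len; ++k) {
        cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    }
    i += len;
    return cp;
}

// Assumption: only these two blocks count as Han for this sketch.
bool IsHan(char32_t cp) {
    return (cp >= 0x4E00 && cp <= 0x9FFF) ||  // CJK Unified Ideographs
           (cp >= 0x3400 && cp <= 0x4DBF);    // CJK Extension A
}

// Split a UTF-8 sentence into alternating Han / non-Han runs.
std::vector<Run> SplitHanRuns(const std::string& s) {
    std::vector<Run> runs;
    std::size_t i = 0;
    while (i < s.size()) {
        const std::size_t start = i;
        const bool han = IsHan(DecodeOne(s, i));
        if (!runs.empty() && runs.back().is_han == han) {
            runs.back().text.append(s, start, i - start);  // extend current run
        } else {
            runs.push_back({s.substr(start, i - start), han});  // start new run
        }
    }
    return runs;
}
```

For '语音合成 服务' this yields three runs: '语音合成' (Han), ' ' (non-Han) and '服务' (Han). A caller could skip the non-Han run or route it to separate handling instead of the pinyin lookup.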

I will append more cases as I encounter them.
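
The assertion in case 1 requires one initial and one final per character of the word. A minimal sketch of the same condition as a recoverable check, rather than an abort, is shown here. CountCodePoints and InitialsFinalsConsistent are hypothetical helpers, not part of PaddleSpeech. The code-point count stands in for utf8string2wstring(word).length(), assuming a 32-bit wchar_t and valid UTF-8 input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Count Unicode code points in a UTF-8 string by counting every byte that
// is not a continuation byte (bit pattern 10xxxxxx).
// Assumption: the input is valid UTF-8.
std::size_t CountCodePoints(const std::string& utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}

// Hypothetical helper: evaluates the condition from the failed assertion and
// returns it as a bool, so a caller can skip or log the word instead of
// terminating the process.
bool InitialsFinalsConsistent(const std::string& word,
                              const std::vector<std::string>& initials,
                              const std::vector<std::string>& finals) {
    return finals.size() == CountCodePoints(word) &&
           finals.size() == initials.size();
}
```

With this check, a word such as 'TTS' that produces no initials or finals is reported as inconsistent and the caller decides how to proceed.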

jonkxdd commented 1 year ago

./run_front_demo.sh --sentence '不光彩'

This word also triggers the problem on my side.

tuotuoshao commented 12 months ago

Could you take a look at https://github.com/PaddlePaddle/PaddleSpeech/issues/3598 ? It is also a frontend crash.