Sigil-Ebook / Sigil

Sigil is a multi-platform EPUB ebook editor
GNU General Public License v3.0

PCRE expression causes segmentation fault #627

0xf00fc7c8 closed 3 years ago

0xf00fc7c8 commented 3 years ago

When I use the find tool with the PCRE expression <ins.*class="adsbygoogle"(.|\n)*<\/ins>, and it finds a match, it causes a segmentation fault. After compiling with debug symbols and running a backtrace, I noticed there seems to be an infinite loop caused by the search and replace logic in 3rdparty/pcre/pcre_exec.c. I couldn't tell exactly what was wrong with the code, but I'm guessing the infinite loop caused a stack overflow, which caused a segfault? I added as much of the backtrace as I thought would be useful. I also uploaded an EPUB file I was editing (as a zip file). Disciple of Immortal.zip

#16360 0x0000555555a3a823 in match (eptr=<optimized out>, ecode=0x555557bc68a6, mstart=0x55555753479a, offset_top=4, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:2061
#16361 0x0000555555a323b9 in match (eptr=<optimized out>, ecode=0x555557bc6896, mstart=0x55555753479a, offset_top=4, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:983
#16362 0x0000555555a3a823 in match (eptr=<optimized out>, ecode=0x555557bc68a6, mstart=0x55555753479a, offset_top=4, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:2061
#16363 0x0000555555a323b9 in match (eptr=<optimized out>, ecode=0x555557bc6896, mstart=0x55555753479a, offset_top=2, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:983
#16364 0x0000555555a3349f in match (eptr=<optimized out>, ecode=0x555557bc6894, mstart=0x55555753479a, offset_top=2, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:1878
#16365 0x0000555555a362ea in match (eptr=<optimized out>, ecode=<optimized out>, mstart=0x55555753479a, offset_top=2, md=<optimized out>, eptrb=0x0, rdepth=<optimized out>)
    at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:5934
#16366 0x0000555555a3c83e in pcre16_exec (argument_re=0x555557bc67f0, extra_data=0x555558719750, subject=0x555557533f88, length=length@entry=18427, 
    start_offset=start_offset@entry=0, options=options@entry=0, offsets=0x5555586bea70, offsetcount=6) at /home/john/Code/Sigil/3rdparty/pcre/pcre_exec.c:6935
#16367 0x00005555559962b4 in SPCRE::getFirstMatchInfo (this=this@entry=0x55555748b460, text=...) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qstring.h:1027
#16368 0x00005555559a9721 in CodeViewEditor::FindNext (this=0x55555bd2c270, search_regex=..., search_direction=Searchable::Direction_Down, misspelled_words=<optimized out>, 
    ignore_selection_offset=<optimized out>, wrap=<optimized out>, marked_text=false) at /home/john/Code/Sigil/src/ViewEditors/CodeViewEditor.cpp:856
#16369 0x0000555555a02449 in FindReplace::FindText (this=0x5555565c60f0, direction=Searchable::Direction_Down) at /home/john/Code/Sigil/src/MainUI/FindReplace.cpp:605
#16370 0x0000555555a029c5 in FindReplace::FindClicked (this=0x5555565c60f0) at /home/john/Code/Sigil/src/MainUI/FindReplace.cpp:268
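
The backtrace shows match() calling itself thousands of times for one search. That is consistent with the repeated group (.|\n)* costing a nested call for each character it consumes in the interpreter. One possible workaround is to rewrite the pattern so that no group is repeated. The sketch below is only an illustration: it uses std::regex as a stand-in for PCRE, first_ad_block is a hypothetical helper that is not part of Sigil, and whether the rewritten pattern avoids the crash in Sigil itself is an assumption that has not been tested here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Sigil): returns the first ad block found
// in `html`, or an empty string if there is none.
std::string first_ad_block(const std::string& html) {
    // [^>]* keeps the attribute scan inside the opening tag, and [\s\S]*?
    // lazily spans newlines without repeating a capture group per character.
    static const std::regex ad_block(
        R"re(<ins[^>]*class="adsbygoogle"[\s\S]*?</ins>)re");
    std::smatch m;
    return std::regex_search(html, m, ad_block) ? m.str(0) : std::string();
}
```

Note that the lazy [\s\S]*? stops at the first closing </ins>, whereas the original greedy pattern runs to the last one in the file, so the two can select different text when a file contains several ad blocks.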

kevinhendricks commented 3 years ago

Either way, I have pushed that change to master.

The build will default to using pcre's JIT either via our 3rdparty or via a system pcre library unless PCRE_NO_JIT is defined during the build.

In other words, I used "#ifndef PCRE_NO_JIT" to guard all of the pcre JIT code snippets I added in the previous commit. I did a test build without PCRE_NO_JIT defined (the default), and it compiled and ran just fine on my macOS machine.

We can add that to the Linux/*nix build documentation.
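
For readers unfamiliar with this kind of guard, here is a minimal sketch of the pattern. It is not the actual Sigil code: regex_engine_mode is a hypothetical stand-in for the real JIT setup, and only the PCRE_NO_JIT macro name comes from the comment above.

```cpp
#include <string>

// Hypothetical stand-in for the guarded setup code: reports which matching
// path this translation unit was compiled to use.
std::string regex_engine_mode() {
#ifndef PCRE_NO_JIT
    // Default build: the JIT-specific calls would be compiled in here.
    return "jit";
#else
    // Built with PCRE_NO_JIT defined: only the interpreter path remains.
    return "interpreter";
#endif
}
```

With most compilers the macro can be defined on the command line with -DPCRE_NO_JIT, which makes the function above return "interpreter". How the flag should be passed through Sigil's own build system is not stated in this thread.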

kevinhendricks commented 3 years ago

In fact, if I ever get a newer macOS M1/M*/AArch64/ARM64 machine, I am not sure that the pcre JIT (especially in the version 1 pcre library, which is now at end of life) would ever support it, so that flag would be needed for that new architecture as well.

At some point I will look into everything that needs to be changed in Sigil in order to use the latest pcre2 version of the library.