hanickadot / compile-time-regular-expressions

Compile Time Regular Expression in C++

Crash on ctre::match #308

Closed: josephch closed this issue 4 weeks ago

josephch commented 4 weeks ago

A segmentation fault occurs during pattern matching. The code below always segfaults. GCC version used: 13.1.0. Reproduced on the ctre main branch at rev acb2f4d.

#include <iostream>
#include <string>
#include "ctre.hpp"

int main()
{
    std::string testLine = "[4/265] : && /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/x86_64-linux/usr/bin/arm-rdk-linux-gnueabi/arm-rdk-linux-gnueabi-g++ --sysroot=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000 -fPIC -march=armv7ve -mthumb -mfpu=neon  -mfloat-abi=hard -mcpu=cortex-a15 -fno-omit-frame-pointer -fno-optimize-sibling-calls  --sysroot=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000  -Os -pipe -g -feliminate-unused-debug-types -fdebug-prefix-map=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/work/cortexa15t2hf-neon-rdk-linux-gnueabi/ABCFramework-plugins/3.0+gitrnuuday-r1=/usr/src/debug/ABCFramework-plugins/3.0+gitrnuuday-r1 -fdebug-prefix-map=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/x86_64-linux= -fdebug-prefix-map=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000=  -D_TRACE_LEVEL=0  -DTELEMETRY -fvisibility-inlines-hidden  -march=armv7ve -mthumb -mfpu=neon  -mfloat-abi=hard -mcpu=cortex-a15 -fno-omit-frame-pointer -fno-optimize-sibling-calls  --sysroot=/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000 -DNDEBUG  -Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed -shared -Wl,-soname,libABCFrameworkToken.so -o Token/libABCFrameworkToken.so Token/CMakeFiles/ABCFrameworkToken.dir/Token.cpp.o Token/CMakeFiles/ABCFrameworkToken.dir/TokenJsonRpc.cpp.o Token/CMakeFiles/ABCFrameworkToken.dir/Module.cpp.o  /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libboost_program_options-mt.so /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libboost_system-mt.so /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libboost_filesystem-mt.so "
        "/home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libboost_system-mt.so /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libboost_filesystem-mt.so /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libABCMfrLib.so /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libABCFrameworkProtocols.so.1.0.0 /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libABCFrameworkCryptalgo.so.1.0.0 /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libABCFrameworkTracing.so.1.0.0 /home/user123/build_linear/build-abcd123456ef-xy1000/tmp/sysroots/abcd123456ef-xy1000/usr/lib/libABCFrameworkCore.so.1.0.0 -pthread && :";
    if(ctre::match<".*(ld.*):[[:blank:]](cannot find.*)">(testLine))
    {
        std::cout << "Match success!\n";
    }
    else
    {
        std::cout << "Match failed!\n";
    }
    return 0;
}

hanickadot commented 4 weeks ago

You are using greedy matching, so the engine tries to match as much as it can and stores every intermediate step so that it can backtrack. It uses the system stack for this. I recommend replacing the greedy matches .* with lazy ones .*? or, whenever possible, possessive ones .*+
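
To illustrate why a greedy .* can exhaust the stack on a long line, here is a toy model in standard C++. It is an assumption-laden sketch and not CTRE's actual implementation: the names `DepthProbe`, `greedy_dot_star` and `possessive_dot_star` are hypothetical, and "." is simplified to "any character". The greedy version keeps one recursion frame per consumed character so it can back up later; the possessive version consumes in a loop and keeps nothing.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Toy model only: NOT how CTRE is implemented. It matches ".*" followed by
// a literal `tail`, anchored at both ends, and records the recursion depth.
struct DepthProbe {
    std::size_t max_depth = 0;
};

// Greedy: first try to consume one more character (by recursing), and only
// fall back to matching `tail` at this position if that attempt fails.
inline bool greedy_dot_star(std::string_view text, std::size_t pos,
                            std::string_view tail, DepthProbe& probe,
                            std::size_t depth = 1) {
    probe.max_depth = std::max(probe.max_depth, depth);
    if (pos < text.size() &&
        greedy_dot_star(text, pos + 1, tail, probe, depth + 1)) {
        return true;
    }
    return text.substr(pos) == tail;
}

// Possessive: consume everything in a loop and never give anything back,
// so no per-character state has to be kept.
inline bool possessive_dot_star(std::string_view text, std::string_view tail) {
    std::size_t pos = 0;
    while (pos < text.size()) ++pos;  // ".*+" swallows the whole input
    return text.substr(pos) == tail;  // only an empty tail can still match
}
```

In this model the greedy call reaches a depth of `text.size() + 1` before it starts backing up, which is the growth that matters on a multi-kilobyte line. Note that a possessive quantifier changes what matches: a leading .*+ in front of (ld.*) would consume the whole line and leave nothing for ld, so for the pattern in this report the lazy form is the candidate for the leading part, while the trailing .* could be possessive.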

josephch commented 4 weeks ago

Thank you, Hana. The crash did not happen when I changed the regex to ".{0,1023}(ld.{0,1023}):[[:blank:]](cannot find.*)" to limit the stack usage. If possible, could you please provide a fix for this? Even an exception would be fine, as it would be defined behaviour and would avoid memory corruption.

hanickadot commented 4 weeks ago

Unfortunately this is impossible to fix, at least with the current design of CTRE as a whole. The size of the remaining stack is unknown (different platforms have different stack sizes, and the main thread and other threads can differ too).
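
As a rough illustration of that point, the sketch below queries the configured stack limit on a POSIX system. It assumes `getrlimit` with `RLIMIT_STACK` is available (it is not on Windows), and the helper name `main_stack_limit_bytes` is hypothetical. It reports the limit that applies to the main thread, not how much stack is still free at the point of a call, which is the number a library would actually need.

```cpp
#include <sys/resource.h>

#include <cstddef>
#include <optional>

// Soft stack limit configured for the process (what `ulimit -s` reports).
// Returns an empty optional if the call fails or the limit is unlimited.
// POSIX only; this says nothing about how much stack is currently unused.
inline std::optional<std::size_t> main_stack_limit_bytes() {
    struct rlimit lim {};
    if (getrlimit(RLIMIT_STACK, &lim) != 0 || lim.rlim_cur == RLIM_INFINITY) {
        return std::nullopt;
    }
    return static_cast<std::size_t>(lim.rlim_cur);
}
```

The value differs between systems and can be changed per process, and threads created later get their own, separately configured stacks, so a fixed recursion budget chosen at compile time cannot be correct everywhere.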

It's the same as with C++ in general: no one stops you from using up all of your stack; it's just code.

One thing that can be done is to use fast_match / fast_search from the DFA branch, which builds a DFA and trades compile-time speed and the ability to capture content for bounded memory use and better performance when matching.
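
To show why a DFA gives bounded memory, here is a hand-written toy automaton in standard C++ for the anchored pattern .*ld.* (does the input contain "ld"). It is only a sketch of the general technique: it does not use or reproduce the DFA branch's fast_match / fast_search, whose signatures are not shown in this thread, and the names `State`, `step` and `dfa_contains_ld` are hypothetical.

```cpp
#include <string_view>

// Toy DFA for ".*ld.*"; not CTRE code. The whole matcher state is one enum
// value, so memory use does not depend on the input length.
enum class State { Start, SeenL, Accept };

// Transition function: one step per input character, no backtracking.
inline State step(State s, char c) {
    switch (s) {
        case State::Start:
            return c == 'l' ? State::SeenL : State::Start;
        case State::SeenL:
            if (c == 'd') return State::Accept;
            return c == 'l' ? State::SeenL : State::Start;
        case State::Accept:
            return State::Accept;  // once "ld" was seen, stay accepting
    }
    return s;
}

// Runs the automaton over the input in a single left-to-right pass.
inline bool dfa_contains_ld(std::string_view text) {
    State s = State::Start;
    for (char c : text) s = step(s, c);
    return s == State::Accept;
}
```

The loop never recurses and never revisits a character, but it also only answers yes or no: it does not record where a group started or ended, which matches the trade-off described above of giving up captures.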

josephch commented 4 weeks ago

Thank you, Hana, for the explanation.

hanickadot commented 4 weeks ago

Thanks for filing the issue, and I hope I will be able to help you better with the next one.