owasp-modsecurity / ModSecurity

ModSecurity is an open source, cross platform web application firewall (WAF) engine for Apache, IIS and Nginx. It has a robust event-based programming language which provides protection from a range of attacks against web applications and allows for HTTP traffic monitoring, logging and real-time analysis.
https://www.modsecurity.org
Apache License 2.0
7.7k stars 1.54k forks source link

Encountering SIGSEGV when parsing multiple rule sets in parallel #3138

Open rkrishn7 opened 3 weeks ago

rkrishn7 commented 3 weeks ago

Describe the bug

Hello!

I'm encountering an error when creating multiple rule sets and adding rules to each of them in a multi-threaded environment.

Logs and dumps

I've traced it back to yylex in yy::seclang_parser::parse():

#0  0x00007bb94eea02d8 in yylex(modsecurity::Parser::Driver&) ()
   from /usr/lib/libmodsecurity.so.3
#1  0x00007bb94ee78c0e in yy::seclang_parser::parse() ()
   from /usr/lib/libmodsecurity.so.3
#2  0x00007bb94eeba170 in modsecurity::Parser::Driver::parse(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /usr/lib/libmodsecurity.so.3
#3  0x00007bb94eeba4a6 in modsecurity::Parser::Driver::parseFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /usr/lib/libmodsecurity.so.3
#4  0x00007bb94eed0a17 in modsecurity::RulesSet::loadFromUri(char const*) ()
   from /usr/lib/libmodsecurity.so.3
#5  0x00007bb94eed0b68 in msc_rules_add_file ()
   from /usr/lib/libmodsecurity.so.3

...(omitted for brevity)

Is this a known issue? Is Rules meant to be initialized and filled only once and in a single-threaded context?

airween commented 3 weeks ago

Hi @rkrishn7,

could you share your (minimal) config (at least to emulate this behavior)? And I assume you have your own application, so a minimal code example also would be fine.

rkrishn7 commented 3 weeks ago

Hey @airween,

Sure, I was able to create a minimal reproducible example in C (using libmodsecurity v3.0.12):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <modsecurity/modsecurity.h>
#include <modsecurity/rules_set.h>

void *create_rules(void *arg)
{
    const char *plain_rules = arg;

    const char *err = NULL;

    RulesSet *rules;

    rules = msc_create_rules_set();
    if (rules == NULL)
    {
        return NULL;
    }

    int retval = msc_rules_add(rules, plain_rules, &err);

    if (retval < 0)
    {
        fprintf(stderr, "Error - msc_rules_add(): %s\n", err);
        exit(EXIT_FAILURE);
    }

    return NULL;
}

int main()
{
    pthread_t thread1, thread2;
    int iret1, iret2;

    const char *plain_rules_1 = "SecRuleEngine On\n";
    const char *plain_rules_2 = "SecRuleEngine On\n";

    iret1 = pthread_create(&thread1, NULL, create_rules, (void *)plain_rules_2);
    if (iret1)
    {
        fprintf(stderr, "Error - pthread_create() return code: %d\n", iret1);
        exit(EXIT_FAILURE);
    }

    iret2 = pthread_create(&thread2, NULL, create_rules, (void *)plain_rules_1);
    if (iret2)
    {
        fprintf(stderr, "Error - pthread_create() return code: %d\n", iret2);
        exit(EXIT_FAILURE);
    }

    // Wait till threads are complete before main continues.
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    return 0;
}

which yields the following output (of course the boundary errors are sporadic):

Error - msc_rules_add(): Rules error. File: <<reference missing or not informed>>. Line: 1. Column: 17. Invalid input:  
fish: Job 1, './test' terminated by signal SIGSEGV (Address boundary error)
airween commented 2 weeks ago

Hi @rkrishn7,

thanks for the example.

Perhaps you already know that libmodsecurity3 uses Bison as configuration parser (lexer and parser are in the source tree).

Unfortunately Bison generates a non-reetrant parser, therefore you can't use that in multi-threading environment in this state. As you can see there is a (theoretical?) solution, but parser is one of the most sensitive part of the library, so I wouldn't touch it.

But you can try to use mutex locks - eg. this worked for me (based on your given example above):

pthread_mutex_t lock;

void *create_rules(void *arg)
{
    const char *plain_rules = arg;

    const char *err = NULL;

    RulesSet *rules;

    rules = msc_create_rules_set();
    if (rules == NULL)
    {
        return NULL;
    }

    pthread_mutex_lock(&lock);
    int retval = msc_rules_add(rules, plain_rules, &err);
    pthread_mutex_unlock(&lock);

    if (retval < 0)
    {
        fprintf(stderr, "Error - msc_rules_add(): %s\n", err);
        exit(EXIT_FAILURE);
    }
    else {
        fprintf(stdout, "Done - msc_rules_add(): %s\n", err);
    }

    return NULL;
}
rkrishn7 commented 2 weeks ago

Ah thank you @airween for the context. Yes, I thought it was related to parsing per the backtrace.

In regards to serializing parsing via a Mutex - definitely agreed. I landed on this approach for my rust crate that provides an interface to libmodsecurity. See: https://github.com/rkrishn7/rust-modsecurity/blob/55cb5e65df0c7771f74524f7e4144046cb0d323b/src/rules.rs#L20

However, it is probably worth mentioning the thread-safety of certain APIs (e.g. ModSecurity and RulesSet) in the documentation somewhere.