NLnetLabs / unbound

Unbound is a validating, recursive, and caching DNS resolver.
https://nlnetlabs.nl/unbound
BSD 3-Clause "New" or "Revised" License

regexps with libpcre-8.43 in Unbound #77

Open iruzanov opened 5 years ago

iruzanov commented 5 years ago

Hello, Wouter!

I remember my patch for the calc_hash() function, but right now I am concerned with regexps in the best and fastest resolver in the world ;) So, here is what I need:

  1. I would like to filter incoming DNS queries (answering them with NXDOMAIN) using a set of regular expressions like these (this example set has 6 rules):
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx2)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx3)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx4)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx5)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx6)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
       "^[a-z,0-9,-](.)?[a-z,0-9,-](xxx7)[a-z,0-9,-].(ripn)[a-z,0-9,-.](.)?$"
  2. And I want Unbound's performance to be unaffected by filtering incoming queries with these rules.
  3. All the rules need to be loaded/reloaded from unbound.conf
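
A minimal sketch of the filtering rule from item 1, written with Python's standard `re` module rather than libpcre. The patterns in `RULES` and the `should_refuse()` helper are hypothetical and made up for illustration; they are simplified stand-ins, not the rules quoted above, and the real rules would come from unbound.conf.

```python
import re

# Hypothetical rule set: one pattern per blocked label fragment.
# These are simplified stand-ins, not the patterns from the issue.
RULES = [
    r"^([a-z0-9-]+\.)*[a-z0-9-]*xxx2[a-z0-9-]*\.ripn\.?$",
    r"^([a-z0-9-]+\.)*[a-z0-9-]*xxx3[a-z0-9-]*\.ripn\.?$",
]

# Compile once at load time so that per-query work is matching only.
COMPILED = [re.compile(rule) for rule in RULES]


def should_refuse(qname):
    """Return True if qname matches any rule (the query would get NXDOMAIN)."""
    name = qname.lower()
    return any(rx.match(name) for rx in COMPILED)


if __name__ == "__main__":
    for qname in ("www.fooxxx2bar.ripn.", "example.ripn."):
        print(qname, should_refuse(qname))
```

Compiling the rules once and stopping at the first match mirrors the structure of the C code further down, where the compiled patterns are kept in a linked list.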

This feature could probably be implemented in Python (using Unbound's python module), but the performance in that case is too poor (15000 replies per second). The bottleneck is the invalidateQueryInCache() call made from the Python script.

Here is what I have done so far:

  1. I wrote a simple C file (with its own header file) that calls into libpcre-8.43 (PCRE1). I won't give a detailed description of the functions; I think you will see everything for yourself:
    • the header fastregexp/fastregexp.h:

      #include <pcre.h>

      struct my_regex {
          pcre *my_reCompiled;
          pcre_extra *my_pcreExtra;
          pcre_jit_stack *my_jit_stack;
          struct my_regex *next;
      };

      void cleanup_fast_regexp(struct my_regex *my_regex);
      int do_fast_regexp(struct my_regex *my_regex, char *testString);
      struct my_regex *study_fast_regexp(struct my_regex *my_regex);
      struct my_regex *compile_fast_regexp(struct my_regex *my_regex, char *aRegexStrV[], int num_aRegexStrV);

    • the source file fastregexp/fastregexp.c:

      #include <stdlib.h>
      #include <string.h>
      #include "util/log.h"
      #include "fastregexp/fastregexp.h"

      void cleanup_fast_regexp(struct my_regex *my_regex)
      {
          struct my_regex *my_regex_next;

          log_err("Cleaning up all regex structures");

          /* Test my_regex itself: my_regex_next is not set before the first iteration. */
          while(my_regex != NULL) {
              my_regex_next = my_regex->next;
              free(my_regex);
              my_regex = my_regex_next;
          }
      }

      int do_fast_regexp(struct my_regex *my_regex, char *testString)
      {
          int subStrVec[30];

          while(my_regex != NULL) {
              int pcreExecRet = pcre_jit_exec(my_regex->my_reCompiled, my_regex->my_pcreExtra,
                  testString, strlen(testString), 0, 0, subStrVec, 30, my_regex->my_jit_stack);

              if(pcreExecRet >= 0)
                  return 1;
              my_regex = my_regex->next;
          } /* end of while */

          return 0;
      }

      struct my_regex *study_fast_regexp(struct my_regex *my_regex)
      {
          pcre_extra *pcreExtra;
          const char *pcreErrorStr;
          struct my_regex *my_regex_start = my_regex;
          pcre_jit_stack *jit_stack;

          while(my_regex != NULL) {
              pcreExtra = pcre_study(my_regex->my_reCompiled, PCRE_STUDY_JIT_COMPILE, &pcreErrorStr);
              /* pcre_study() returns NULL for both errors and when it can not
                 optimize the regex. The last argument is how one checks for
                 errors (it is NULL if everything works, and points to an error
                 string otherwise). */
              if(pcreErrorStr != NULL) {
                  log_err("fastregexp: JIT optimization error: %s. Cleaning up all regex structures", pcreErrorStr);
                  cleanup_fast_regexp(my_regex_start);
                  return NULL;
              }

              jit_stack = pcre_jit_stack_alloc(32*1024, 1024*1024);
              pcre_assign_jit_stack(pcreExtra, NULL, jit_stack);

              my_regex->my_pcreExtra = pcreExtra;
              my_regex->my_jit_stack = jit_stack;
              my_regex = my_regex->next;
          } /* end of while */

          return my_regex_start;
      }

      struct my_regex *compile_fast_regexp(struct my_regex *my_regex, char *aRegexStrV[], int num_aRegexStrV)
      {
          pcre *reCompiled;
          const char *pcreErrorStr;
          int pcreErrorOffset;
          char **aStrRegex;

          struct my_regex *my_regex_prev = NULL;
          struct my_regex *my_regex_start = NULL;

          for(int i=0; i<num_aRegexStrV; i++) {
              //log_err("the regex is: %s", aRegexStrV[i]);
              if((my_regex = (struct my_regex*) malloc(sizeof(struct my_regex))) == NULL) {
                  log_err("fastregexp: general memory allocation error");
                  return NULL;
              }
              /* Terminate the list right away so that cleanup on a compile
                 error never follows an uninitialized pointer. */
              my_regex->next = NULL;

              if(my_regex_prev != NULL) {
                  my_regex_prev->next = my_regex;
              } else {
                  my_regex_start = my_regex;
              }

              reCompiled = pcre_compile(aRegexStrV[i], 0, &pcreErrorStr, &pcreErrorOffset, NULL);
              if(reCompiled == NULL) {
                  log_err("fastregexp: error allocating memory for PCRE stack: regex is %s: the reason: %s. Cleaning up all regex structures", aRegexStrV[i], pcreErrorStr);
                  cleanup_fast_regexp(my_regex_start);
                  return NULL;
              }

              my_regex->my_reCompiled = reCompiled;
              my_regex_prev = my_regex;
          } /* end of for */

          return my_regex_start;

          //pcre_free_substring(psubStrMatchStr); pcre_free(reCompiled);

          // Free up the EXTRA PCRE value (may be NULL at this point)
          // if(pcreExtra != NULL) {
          //#ifdef PCRE_CONFIG_JIT
          //     pcre_free_study(pcreExtra);
          //#else
          //     pcre_free(pcreExtra);
          //#endif
          // }
      }

Next, I patched the following source files of yours:

--- unbound-1.9.2.orig/util/module.h 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/util/module.h 2019-09-16 11:54:20.302813000 +0300
@@ -156,6 +156,8 @@
 #include "util/storage/lruhash.h"
 #include "util/data/msgreply.h"
 #include "util/data/msgparse.h"
+//igorr
+#include "fastregexp/fastregexp.h"
 struct sldns_buffer;
 struct alloc_cache;
 struct rrset_cache;
@@ -512,6 +514,10 @@
     /* Make every mesh state unique, do not aggregate mesh states. */
     int unique_mesh;
+

--- unbound-1.9.2.orig/daemon/worker.c 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/daemon/worker.c 2019-09-17 13:00:20.176700000 +0300
@@ -1892,6 +1892,11 @@
 worker->env.cfg->stat_interval);
 worker_restart_timer(worker);
 }
+
@@ -1933,6 +1938,8 @@
 alloc_clear(&worker->alloc);
 regional_destroy(worker->env.scratch);
 regional_destroy(worker->scratchpad);

--- unbound-1.9.2.orig/iterator/iterator.c 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/iterator/iterator.c 2019-09-16 12:34:32.062665000 +0300
@@ -160,6 +160,7 @@
 outbound_list_init(&iq->outlist);
 iq->minimise_count = 0;
 iq->minimise_timeout_count = 0;
+
 if (qstate->env->cfg->qname_minimisation)
 iq->minimisation_state = INIT_MINIMISE_STATE;
 else
@@ -2576,6 +2577,23 @@
 enum response_type type;
 iq->num_current_queries--;

--- unbound-1.9.2.orig/util/config_file.h 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/util/config_file.h 2019-09-16 13:07:10.312655000 +0300
@@ -575,6 +575,10 @@
 int redis_timeout;
 #endif
 #endif

--- unbound-1.9.2.orig/util/config_file.c 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/util/config_file.c 2019-09-16 17:28:00.678244000 +0300
@@ -327,6 +327,9 @@
 cfg->cachedb_backend = NULL;
 cfg->cachedb_secret = NULL;
 #endif

--- unbound-1.9.2.orig/util/configparser.y 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/util/configparser.y 2019-09-16 17:27:35.678485000 +0300
@@ -158,6 +158,7 @@
 %token VAR_IPSECMOD_MAX_TTL VAR_IPSECMOD_WHITELIST VAR_IPSECMOD_STRICT
 %token VAR_CACHEDB VAR_CACHEDB_BACKEND VAR_CACHEDB_SECRETSEED
 %token VAR_CACHEDB_REDISHOST VAR_CACHEDB_REDISPORT VAR_CACHEDB_REDISTIMEOUT
+%token VAR_REGEXP VAR_REGEXP_PATTERN
 %token VAR_UDP_UPSTREAM_WITHOUT_DOWNSTREAM VAR_FOR_UPSTREAM
 %token VAR_AUTH_ZONE VAR_ZONEFILE VAR_MASTER VAR_URL VAR_FOR_DOWNSTREAM
 %token VAR_FALLBACK_ENABLED VAR_TLS_ADDITIONAL_PORT VAR_LOW_RTT VAR_LOW_RTT_PERMIL
@@ -174,7 +175,7 @@
 forwardstart contents_forward | pythonstart contents_py |
 rcstart contents_rc | dtstart contents_dt | viewstart contents_view |
 dnscstart contents_dnsc | cachedbstart contents_cachedb |

--- unbound-1.9.2.orig/util/configlexer.lex 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/util/configlexer.lex 2019-09-16 15:04:30.764354000 +0300
@@ -483,6 +483,8 @@
 redis-server-host{COLON} { YDVAR(1, VAR_CACHEDB_REDISHOST) }
 redis-server-port{COLON} { YDVAR(1, VAR_CACHEDB_REDISPORT) }
 redis-timeout{COLON} { YDVAR(1, VAR_CACHEDB_REDISTIMEOUT) }
+regexp{COLON} { YDVAR(0, VAR_REGEXP) }
+pattern{COLON} { YDVAR(1, VAR_REGEXP_PATTERN) }
 udp-upstream-without-downstream{COLON} { YDVAR(1, VAR_UDP_UPSTREAM_WITHOUT_DOWNSTREAM) }
 tcp-connection-limit{COLON} { YDVAR(2, VAR_TCP_CONNECTION_LIMIT) }
 <INITIAL,val>{NEWLINE} { LEXOUT(("NL\n")); cfg_parser->line++; }

--- unbound-1.9.2.orig/Makefile 2019-09-17 13:38:35.414726000 +0300
+++ unbound-1.9.2/Makefile 2019-09-16 12:31:51.334154000 +0300
@@ -59,14 +59,14 @@
 PYTHON_CPPFLAGS=-I. -I/usr/local/include/python2.7
 CFLAGS=-DSRCDIR=$(srcdir) -g -O2 -D_THREAD_SAFE -pthread
 LDFLAGS=-L/usr/local/lib -L/usr/local/lib -L/usr/local/lib
-LIBS=-lutil -levent -L/usr/local/lib -L/usr/local/lib/python2.7 -L. -lpython2.7 -lcrypto -lhiredis
+LIBS=-lutil -levent -L/usr/local/lib -L/usr/local/lib/python2.7 -L. -lpython2.7 -lcrypto -lhiredis -lpcre
 LIBOBJS= ${LIBOBJDIR}explicit_bzero$U.o ${LIBOBJDIR}reallocarray$U.o
 # filter out ctime_r from compat obj.
 LIBOBJ_WITHOUT_CTIME= explicit_bzero.o reallocarray.o
 LIBOBJ_WITHOUT_CTIMEARC4= explicit_bzero.o
 RUNTIME_PATH= -R/usr/local/lib
 DEPFLAG=-MM
-DATE=20190917
+DATE=20190912
 LIBTOOL=$(libtool)
 BUILD=build/
 UBSYMS=-export-symbols $(srcdir)/libunbound/ubsyms.def
@@ -126,7 +126,8 @@
 edns-subnet/edns-subnet.c edns-subnet/subnetmod.c \
 edns-subnet/addrtree.c edns-subnet/subnet-whitelist.c \
 cachedb/cachedb.c cachedb/redis.c respip/respip.c $(CHECKLOCK_SRC) \
-$(DNSTAP_SRC) $(DNSCRYPT_SRC) $(IPSECMOD_SRC)
+$(DNSTAP_SRC) $(DNSCRYPT_SRC) $(IPSECMOD_SRC) \
+fastregexp/fastregexp.c
 COMMON_OBJ_WITHOUT_NETCALL=dns.lo infra.lo rrset.lo dname.lo msgencode.lo \
 as112.lo msgparse.lo msgreply.lo packed_rrset.lo iterator.lo iter_delegpt.lo \
 iter_donotq.lo iter_fwd.lo iter_hints.lo iter_priv.lo iter_resptype.lo \
@@ -139,7 +140,7 @@
 validator.lo val_kcache.lo val_kentry.lo val_neg.lo val_nsec3.lo val_nsec.lo \
 val_secalgo.lo val_sigcrypt.lo val_utils.lo dns64.lo cachedb.lo redis.lo authzone.lo \
 $(SUBNET_OBJ) $(PYTHONMOD_OBJ) $(CHECKLOCK_OBJ) $(DNSTAP_OBJ) $(DNSCRYPT_OBJ) \
-$(IPSECMOD_OBJ) respip.lo
+$(IPSECMOD_OBJ) respip.lo fastregexp.lo
 COMMON_OBJ_WITHOUT_UB_EVENT=$(COMMON_OBJ_WITHOUT_NETCALL) netevent.lo listen_dnsport.lo \
 outside_network.lo
 COMMON_OBJ=$(COMMON_OBJ_WITHOUT_UB_EVENT) ub_event.lo
@@ -692,7 +693,7 @@
 $(srcdir)/services/modstack.h $(srcdir)/util/net_help.h $(srcdir)/util/regional.h $(srcdir)/util/data/dname.h \
 $(srcdir)/util/data/msgencode.h $(srcdir)/util/fptr_wlist.h $(srcdir)/util/tube.h $(srcdir)/util/config_file.h \
 $(srcdir)/util/random.h $(srcdir)/sldns/wire2str.h $(srcdir)/sldns/str2wire.h $(srcdir)/sldns/parseutil.h \

That's all, if I didn't forget anything. About the Makefile: I know that the right way is to patch Makefile.in, but for now I am interested in the final result in terms of stability and performance. As for the yacc/lex sources, I tried to add my two options (regexp: and pattern:) by following the existing declarations of config options. And it was too hard for me ;)
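
To make the intended configuration concrete, here is a rough Python sketch that pulls `pattern:` values out of a `regexp:` section. The section layout in `SAMPLE` is an assumption inferred from the two lexer rules in the patch (`regexp:` and `pattern:`); the grammar changes themselves are not visible above, so the real syntax may differ. `read_patterns()` is a hypothetical helper, not part of Unbound.

```python
def read_patterns(conf_text):
    """Collect quoted pattern: values that appear inside a regexp: section."""
    patterns = []
    in_regexp = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if not value:
            # A bare "name:" line opens a new section.
            in_regexp = (key == "regexp")
        elif in_regexp and key == "pattern":
            patterns.append(value.strip('"'))
    return patterns


# Assumed layout, for illustration only.
SAMPLE = r'''
server:
    verbosity: 1
regexp:
    pattern: "^a.*\.ripn\.?$"
    pattern: "^b.*\.ripn\.?$"
'''

if __name__ == "__main__":
    for pattern in read_patterns(SAMPLE):
        print(pattern)
```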

Now what I have:

But I have several issues:

What I would like now is your authoritative opinion on whether my approach is right, or whether (and this is most likely) I have got something wrong in my code. Could you please review my patches and tell me what else I have to do?

Big thank you in advance!

wcawijngaards commented 5 years ago

Hi! Have you tried using the python module? That means you could write python for this and regex should be simple in there and also replacing answers with other ones from the python module? That could be an easy method to get what you wanted? Best regards, Wouter

iruzanov commented 5 years ago

Yes, I have. I tried the python module, and the script loaded at Unbound start is:

    from ctypes import *
    cregexp = cdll.LoadLibrary("/srvs/i.ruzanov/install/unbound-1.9.2/pythonmod/fastregexp.so")

    class pcre_extra(Structure):
        _fields_ = [
            ("flags", c_long),
            ("data", c_void_p),
            ("callout", c_void_p),
            ("tables", c_char_p),
            ("match_limit_recursion", c_ulong)
        ]
    pcre_extra_p = POINTER(pcre_extra)
    pcre_p = c_void_p

    class my_regex(Structure):
        _fields_ = [
            ("my_reCompiled", pcre_p),
            ("my_pcreExtra", pcre_extra_p),
            ("my_jit_stack", POINTER(c_void_p)),
            ("next", POINTER(c_void_p))
        ]
    my_regex_p = POINTER(my_regex)

    # ctypes reads the attribute "argtypes" (a list of argument types)
    pcre_compile = cregexp.compile_fast_regexp
    pcre_compile.restype = my_regex_p
    pcre_compile.argtypes = []
    re = pcre_compile()

    pcre_study = cregexp.study_fast_regexp
    pcre_study.restype = my_regex_p
    pcre_study.argtypes = [my_regex_p]
    s = pcre_study(re)

    pcre_exec = cregexp.do_fast_regexp
    pcre_exec.restype = c_int
    pcre_exec.argtypes = [my_regex_p, c_char_p]

    def init(id, cfg):
        log_info("my-pythonmod: init called, module id is %d port: %d script: %s" % (id, cfg.port, cfg.python_script))
        return True

    def deinit(id):
        log_info("my-pythonmod: deinit called, module id is %d" % id)
        return True

    def inform_super(id, qstate, superqstate, qdata):
        return True

    def operate(id, event, qstate, qdata):
        log_info("my-pythonmod: operate called, id: %d, event:%s" % (id, strmodulevent(event)))

        if qstate.return_msg:
            if pcre_exec(s, qstate.qinfo.qname_str) == 2:
                invalidateQueryInCache(qstate, qstate.return_msg.qinfo)
                log_info("my-pythonmod: ok, i've done: %s is filtered" % qstate.qinfo.qname_str)
                qstate.return_rcode = RCODE_NXDOMAIN
                qstate.ext_state[id] = MODULE_ERROR
                return False
            #log_info("my-pythonmod: done with %s" % qstate.qinfo.qname_str)

        if event == MODULE_EVENT_NEW:
            qstate.ext_state[id] = MODULE_WAIT_MODULE
            return True

        if event == MODULE_EVENT_MODDONE:
            log_info("my-pythonmod: previous module done")
            qstate.ext_state[id] = MODULE_FINISHED
            return True

        if event == MODULE_EVENT_PASS:
            log_info("my-pythonmod: event_pass")
            qstate.ext_state[id] = MODULE_WAIT_MODULE
            return True

        return True

    log_info("my-pythonmod: script loaded.")

I compiled a fastregexp.so library with the same calls as in fastregexp.c and loaded it via Python ctypes. But the performance was only 15000 rps. That is too low for me.
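
A general note on declaring foreign function signatures with ctypes: the attribute that ctypes actually consults is `argtypes`, and any other attribute name is stored on the function object without effect. The sketch below shows this with `strlen` from the C library instead of fastregexp.so; loading it through `ctypes.CDLL(None)` assumes a Unix-like system.

```python
import ctypes

# Assumption: a Unix-like system, where CDLL(None) exposes the C library.
libc = ctypes.CDLL(None)
strlen = libc.strlen
strlen.restype = ctypes.c_size_t

# A misspelled attribute is accepted silently and changes nothing.
strlen.argstype = [ctypes.c_char_p]
ignored = strlen.argtypes is None

# The correctly spelled attribute makes ctypes check each argument.
strlen.argtypes = [ctypes.c_char_p]
length = strlen(b"example.ripn.")

try:
    strlen("example.ripn.")  # a str, not bytes, so ctypes refuses the call
    rejected = False
except ctypes.ArgumentError:
    rejected = True

if __name__ == "__main__":
    print(ignored, length, rejected)
```

With `argtypes` set, a wrong argument type fails with an exception on the Python side instead of reaching the C function.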

wcawijngaards commented 5 years ago

Is that performance because of log_info, or because of python? Perhaps if you comment out the log_info calls in the operate() function, it would be a lot faster. Logging could be too slow for, e.g., 100k qps. Also, unbound caches the responses that python makes in the operate callback, and after that they are served at the normal unbound response speed. But nice to hear you tried it and the python module worked for that!
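
One way to tell these costs apart is to time the regex matching on its own, outside Unbound. Below is a rough sketch with Python's `timeit` and `re`; the patterns, the query names, and the `match_any()` and `matches_per_second()` helpers are hypothetical, and the result depends on the machine, so no numbers are quoted here.

```python
import re
import timeit

# Hypothetical stand-ins for the six rules; compiled once up front.
PATTERNS = [re.compile(r"^.*xxx%d.*\.ripn\.?$" % n) for n in range(2, 8)]


def match_any(qname):
    """Return True as soon as one pattern matches qname."""
    for rx in PATTERNS:
        if rx.match(qname):
            return True
    return False


def matches_per_second(qname, number=20000):
    """Estimate how many match_any() calls per second this machine sustains."""
    elapsed = timeit.timeit(lambda: match_any(qname), number=number)
    return number / elapsed


if __name__ == "__main__":
    # For this rule set, only the last pattern matches this name.
    print("%.0f matches/s" % matches_per_second("host.fooxxx7bar.ripn."))
```

If matching alone sustains far more than 15000 calls per second, the limit is more likely somewhere else in the path, for example in logging or in cache invalidation.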

iruzanov commented 5 years ago

Yes, Wouter, I even commented out all of the log_info() calls in the operate function within the "if qstate.return_msg:" block, but it did not improve the performance. I even commented out the condition check "if pcre_exec() == 1:" so that invalidateQueryInCache() is called in every case; it gave me the same 15000 responses per second. PS: in my tests, every test query matches some pattern from the regexp set.

iruzanov commented 5 years ago

Anyway, yes, the python module works fine. At the very least, I am planning to use it to fetch parameters from an SQL frontend and load them into the redis cachedb ;) I would also like to modify the unbound.conf config using the python module.

iruzanov commented 5 years ago

And two more things about compiling Unbound with libpcre:

  1. In fastregexp/fastregexp.c, I added these lines to the while block of the cleanup_fast_regexp() function:
       pcre_free(my_regex->my_reCompiled);
       pcre_free_study(my_regex->my_pcreExtra);
       pcre_jit_stack_free(my_regex->my_jit_stack);
     so there is far less memory leakage now after each unbound-control reload.
  2. I remembered one more source file that I patched:

--- unbound-1.9.2.orig/smallapp/unbound-checkconf.c 2019-06-17 11:50:16.000000000 +0300
+++ unbound-1.9.2/smallapp/unbound-checkconf.c 2019-09-16 17:29:18.653649000 +0300
@@ -439,6 +439,12 @@
 aclchecks(cfg);
 tcpconnlimitchecks(cfg);