rsyslog / liblognorm

a fast samples-based log normalization library
http://www.liblognorm.com
GNU Lesser General Public License v2.1
99 stars 64 forks source link

Rulebase parse failure on AIX causing a segfault #353

Open StrongestNumber9 opened 2 years ago

StrongestNumber9 commented 2 years ago

Running on AIX 7.1 and newest liblognorm release

# lognormalizer -V
lognormalizer version: 2.0.6.master
liblognorm version: 2.0.6.master
        advanced stats: not available

Test file, nothing interesting.

# cat test_ruleset.rb
version=2
rule=:%this:number% %that:number% %-:rest%

Try it normally

# echo "123 234 irrelevant" | lognormalizer -v -r test_ruleset.rb
liblognorm: loading rulebase file 'test_ruleset.rb'
liblognorm: rulebase version is 2

liblognorm: read rulebase line[~2]: 'rule=:%this:number% %that:number% %-:rest%'
liblognorm: rule line to add: ':%this:number% %that:number% %-:rest%'
liblognorm: addSampToTree 0 of 36
liblognorm: parsed literal: ''
liblognorm: parsed field: 'this'
liblognorm: field type 'number', i 12
liblognorm: ln_pdagAddParserInternal: { "name": "this", "type": "number" }
liblognorm: ln_pdagAddParserInstance: { "name": "this", "type": "number" }, nextnode 0
liblognorm: assigned priority is 30000
liblognorm: pdag: 20000f08, parser 200021c8
liblognorm: addSampToTree 13 of 36
liblognorm: parsed literal: ' '
liblognorm: ln_pdagAddParserInternal: { "type": "literal", "text": " " }
liblognorm: ln_pdagAddParserInstance: { "type": "literal", "text": " " }, nextnode 0
liblognorm: assigned priority is 30000
liblognorm: pdag: 20002248, parser 200022b8
liblognorm: parsed field: 'that'
liblognorm: field type 'number', i 26
liblognorm: ln_pdagAddParserInternal: { "name": "that", "type": "number" }
liblognorm: ln_pdagAddParserInstance: { "name": "that", "type": "number" }, nextnode 0
liblognorm: assigned priority is 30000
liblognorm: pdag: 20002488, parser 20002348
liblognorm: addSampToTree 27 of 36
liblognorm: parsed literal: ' '
liblognorm: ln_pdagAddParserInternal: { "type": "literal", "text": " " }
liblognorm: ln_pdagAddParserInstance: { "type": "literal", "text": " " }, nextnode 0
liblognorm: assigned priority is 30000
liblognorm: pdag: 20002448, parser 200022b8
liblognorm: parsed field: '-'
liblognorm: field type 'rest', i 35
liblognorm: ln_pdagAddParserInternal: { "name": "-", "type": "rest" }
liblognorm: ln_pdagAddParserInstance: { "name": "-", "type": "rest" }, nextnode 0
liblognorm: assigned priority is 30000
liblognorm: pdag: 200025a8, parser 200022b8
liblognorm: parsed literal: ''
liblognorm: end addSampToTree 36 of 36
liblognorm: optimizing main pdag component
liblognorm: pre sort, parser 0:this[7680016]
liblognorm: post sort, parser 0:this[7680016]
liblognorm: optimizing 20002248: field 0 type 'number', name 'this': 'UNKNOWN':
liblognorm: pre sort, parser 0:[7680004]
liblognorm: post sort, parser 0:[7680004]
liblognorm: optimizing 20002488: field 0 type 'literal', name '': ' ':
liblognorm: pre sort, parser 0:that[7680016]
liblognorm: post sort, parser 0:that[7680016]
liblognorm: optimizing 20002448: field 0 type 'number', name 'that': 'UNKNOWN':
liblognorm: pre sort, parser 0:[7680004]
liblognorm: post sort, parser 0:[7680004]
liblognorm: optimizing 200025a8: field 0 type 'literal', name '': ' ':
liblognorm: pre sort, parser 0:[7680255]
liblognorm: post sort, parser 0:[7680255]
liblognorm: optimizing 200026c8: field 0 type 'rest', name '': 'UNKNOWN':
liblognorm: finished optimizing main pdag component
Segmentation fault (core dumped)

Change newlines to dos style

# unix2dos test_ruleset.rb
unix2dos: converting file test_ruleset.rb to DOS format...

And try again..

# echo "123 234 irrelevant" | lognormalizer -v -r test_ruleset.rb
liblognorm: loading rulebase file 'test_ruleset.rb'
liblognorm: rulebase version is 1

'iblognorm: read rulebase line[~1]: 'version=2
liblognorm: invalid record type detected: 'version'
liblognorm: ---------------------------------------
liblognorm: subtree 20000f78 (prefix: '', children: 0 literals, 0 fields) [visited 0 backtracked 0 terminated 0]
liblognorm: =======================================
'iblognorm: read rulebase line[~2]: 'rule=:%this:number% %that:number% %-:rest%
'iblognorm: sample line to add: ':%this:number% %that:number% %-:rest%

liblognorm: addSampToTree 0 of 37
liblognorm: parsed literal: ''
liblognorm: buildPTree: begin at 20000f78, offs 0
liblognorm: case 1.1
liblognorm: parsed field: 'this'
liblognorm: got new subtree 200025b8
liblognorm: prev subtree 20000f78
liblognorm: new subtree 200025b8
liblognorm: addSampToTree 13 of 37
liblognorm: parsed literal: ' '
liblognorm: buildPTree: begin at 200025b8, offs 0
liblognorm: case 3.1
liblognorm: addPTree: offs 0
liblognorm: setPrefix lenBuf 1, offs 0
liblognorm: parsed field: 'that'
liblognorm: got new subtree 20002a48
liblognorm: prev subtree 200025b8
liblognorm: new subtree 20002a48
liblognorm: addSampToTree 27 of 37
liblognorm: parsed literal: ' '
liblognorm: buildPTree: begin at 20002a48, offs 0
liblognorm: case 3.1
liblognorm: addPTree: offs 0
liblognorm: setPrefix lenBuf 1, offs 0
liblognorm: parsed field: '-'
liblognorm: got new subtree 20002ed8
liblognorm: prev subtree 20002a48
liblognorm: new subtree 20002ed8
liblognorm: addSampToTree 36 of 37
'iblognorm: parsed literal: '
liblognorm: buildPTree: begin at 20002ed8, offs 0
liblognorm: case 3.1
liblognorm: addPTree: offs 0
liblognorm: setPrefix lenBuf 1, offs 0
liblognorm: end addSampToTree 37 of 37
liblognorm: ---------------------------------------
liblognorm: subtree 20000f78 (prefix: '', children: 0 literals, 1 fields) [visited 0 backtracked 0 terminated 0]
liblognorm: field this:
liblognorm:   subtree 200025b8 (prefix: ' ', children: 0 literals, 1 fields) [visited 0 backtracked 0 terminated 0]
liblognorm:   field that:
liblognorm:     subtree 20002a48 (prefix: ' ', children: 0 literals, 1 fields) [visited 0 backtracked 0 terminated 0]
liblognorm:     field -:
', children: 0 literals, 0 fields) [visited 0 backtracked 0 terminated 0]
liblognorm: =======================================
number of tree nodes: 5
To normalize: '123 234 irrelevant'
liblognorm: 0: enter parser, tree 20000f78
liblognorm: 0: prefix compare succeeded, still valid
liblognorm: 0:trying parser for field 'this': f15caf58
liblognorm: parser returns 0, parsed 3
liblognorm: 0: potential hit, trying subtree 200025b8
liblognorm: 3: enter parser, tree 200025b8
liblognorm: 3: prefix compare ' ', ' '
liblognorm: 4: prefix compare succeeded, still valid
liblognorm: 4:trying parser for field 'that': f15caf58
liblognorm: parser returns 0, parsed 3
liblognorm: 4: potential hit, trying subtree 20002a48
liblognorm: 7: enter parser, tree 20002a48
liblognorm: 7: prefix compare ' ', ' '
liblognorm: 8: prefix compare succeeded, still valid
liblognorm: 8:trying parser for field '-': f15caf34
liblognorm: 8 no field, trying subtree char 'i': 0
liblognorm: rule has rest motif, forcing match via it
liblognorm: 18 returns -1
liblognorm: 8: parser matches at 8
liblognorm: 8 returns 0
liblognorm: 4: subtree returns 14
liblognorm: 4: parser matches at 4
liblognorm: 4 returns 0
liblognorm: 0: subtree returns 18
liblognorm: 0: parser matches at 0
liblognorm: 0 returns 0
liblognorm: final result for normalizer: left 0, endNode 20002ed8, isTerminal 1, tagbucket 0
normalized: '{ "that": "234", "this": "123" }'
{ "that": "234", "this": "123" }
liblognorm: exitCtx 20000eb8
liblognorm: delete 20000f08[1]: 

Truss outputs as well, if any use.

bash-5.1# echo "123 234 irrelevant" | truss lognormalizer -r test_ruleset.rb
execve("lognormalizer", 0x2FF22C34, 0x20015668)  argc: 3
read_sysconfig(0xF0785888, 0x00000010, 0x00000005, 0xD07C6500, 0xF02269A8, 0x00000008, 0x06010000, 0xF07F83C8) = 0x00000000
__loadx(0x0A040000, 0xD045E470, 0x00000008, 0x20000D64, 0x00000041) = 0x00000000
sbrk(0x00000000)                                = 0x20000EB0
vmgetinfo(0x2FF225F0, 7, 16)                    = 0
sbrk(0x00000000)                                = 0x20000EB0
__libc_sbrk(0x00000000)                         = 0x20000EB0
kopen("test_ruleset.rb", O_RDONLY)              = 3
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(3, " v e r s i o n = 2\r\n r".., 4096)    = 55
close(3)                                        = 0
kopen("test_ruleset.rb", O_RDONLY)              = 3
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(3, " v e r s i o n = 2\r\n r".., 4096)    = 55
kread(3, " v e r s i o n = 2\r\n r".., 4096)    = 0
close(3)                                        = 0
kioctl(0, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(0, " 1 2 3   2 3 4   i r r e".., 4096)    = 19
kioctl(1, 22528, 0x00000000, 0x00000000)        = 0
{ "that": "234", "this": "123" }
kwrite(1, " {   " t h a t " :   " 2".., 33)     = 33
kread(0, " 1 2 3   2 3 4   i r r e".., 4096)    = 0
kfcntl(1, F_GETFL, 0x00000000)                  = 67110914
kfcntl(2, F_GETFL, 0x2FF22FFC)                  = 67110914
_exit(0)

# dos2unix test_ruleset.rb
dos2unix: converting file test_ruleset.rb to Unix format...

# echo "123 234 irrelevant" | truss lognormalizer -r test_ruleset.rb
execve("lognormalizer", 0x2FF22C34, 0x20015668)  argc: 3
read_sysconfig(0xF0785888, 0x00000010, 0x00000005, 0xD07C6500, 0xF02269A8, 0x00000008, 0x06010000, 0xF07F83C8) = 0x00000000
__loadx(0x0A040000, 0xD045E470, 0x00000008, 0x20000D64, 0x00000041) = 0x00000000
sbrk(0x00000000)                                = 0x20000EB0
vmgetinfo(0x2FF225F0, 7, 16)                    = 0
sbrk(0x00000000)                                = 0x20000EB0
__libc_sbrk(0x00000000)                         = 0x20000EB0
kopen("test_ruleset.rb", O_RDONLY)              = 3
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(3, " v e r s i o n = 2\n r u".., 4096)    = 53
kread(3, " v e r s i o n = 2\n r u".., 4096)    = 0
close(3)                                        = 0
    Received signal #11, SIGSEGV [default]
*** process killed ***

dbx output from the core:

# dbx lognormalizer
Type 'help' for help.
[using memory image in core]
reading symbolic information ...

Segmentation fault in asprintf at 0xd5b3ba20
0xd5b3ba20 (asprintf+0x8) 800c0000            lwz   r0,0x0(r12)

(dbx) where
asprintf() at 0xd5b3ba20
unnamed block in ln_pdagComponentSetIDs(ctx = 0x20000eb8, dag = 0x20000f08, prefix = ""), line 501 in "pdag.c"
unnamed block in ln_pdagComponentSetIDs(ctx = 0x20000eb8, dag = 0x20000f08, prefix = ""), line 501 in "pdag.c"
ln_pdagComponentSetIDs(ctx = 0x20000eb8, dag = 0x20000f08, prefix = ""), line 501 in "pdag.c"
ln_pdagOptimize(ctx = 0x20000eb8), line 529 in "pdag.c"
ln_sampLoad(ctx = 0x20000eb8, file = "test_ruleset.rb"), line 1165 in "samp.c"
ln_loadSamples(ctx = 0x20000eb8, file = "test_ruleset.rb"), line 167 in "liblognorm.c"
main(argc = 3, argv = 0x2ff22c28), line 493 in "lognormalizer.c"

(dbx) print prs
0x20002288 

(dbx) print prs->name
"this" 

 (dbx) print &id
0x2ff22a00 

(dbx) print prefix
"" 

(dbx) print id
(nil) 

(dbx) print prs->prsid
'^D' 
StrongestNumber9 commented 2 years ago

Note: Using the compat asprintf as the OS doesn't provide one.

# dump -Tv liblognorm.so.5 | grep -i asprintf
[389]   0x00000000    undef      IMP     DS EXTref        [noIMid] asprintf
StrongestNumber9 commented 2 years ago

For whatever reason lognormalizer binary seems to work if I do extra null check but I have no idea what are the long term effects when used in rsyslog for example:

# git diff
diff --git a/src/pdag.c b/src/pdag.c
index 44a3847..9fb2cc2 100644
--- a/src/pdag.c
+++ b/src/pdag.c
@@ -497,10 +497,12 @@ ln_pdagComponentSetIDs(ln_ctx ctx, struct ln_pdag *const dag, const char *prefix
                                        goto done;
                        }
                } else {
+                 if(prs->name == NULL) {
                        if(asprintf(&id, "%s%%%s:%s%%", prefix,
                                prs->name ? prs->name : "-",
                                parserName(prs->prsid)) == -1)
                                        goto done;
+                 }
                }
                ln_pdagComponentSetIDs(ctx, prs->node, id);
                free(id);