ThomasDickey / original-mawk

bug-reports for mawk (originally on GoogleCode)
http://invisible-island.net/mawk/mawk.html

Bug in regex compilation when regex looks like a character class and it has a non-ASCII byte #29

Status: Closed (GoogleCodeExporter closed this issue 7 years ago)

GoogleCodeExporter commented 8 years ago
dualbus@hp ~/x/awk-bugs/mawk-FS
 % ~/x/awk-build/mawk/build/awk/mawk -Wversion
mawk 1.3.4 20141206
Copyright 2008-2013,2014, Thomas E. Dickey
Copyright 1991-1996,2014, Michael D. Brennan

random-funcs:       srandom/random
regex-funcs:        internal
compiled limits:
sprintf buffer      8192
maximum-integer     2147483647

dualbus@hp ~/x/awk-bugs/mawk-FS
 % ~/x/awk-build/mawk/build/awk/mawk -F$'0[[:\303]'
zsh: segmentation fault  ~/x/awk-build/mawk/build/awk/mawk -F$'0[[:\303]'

dualbus@hp ~/x/awk-bugs/mawk-FS
 % /usr/bin/mawk -Wversion
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF             32767
sprintf buffer      2040

dualbus@hp ~/x/awk-bugs/mawk-FS
 % /usr/bin/mawk -F$'0[[:\303]'

dualbus@hp ~/x/awk-bugs/mawk-FS
 % 

Depending on what byte is used, the error message is different:
bash: line 1: 13379 Segmentation fault      ~/x/awk-build/mawk/build/awk/mawk -F"000[[:$bt]"
*** Error in `/home/dualbus/x/awk-build/mawk/build/awk/mawk': realloc(): invalid next size: 0x0000000001d10a70 ***
bash: line 1: 13380 Aborted                 ~/x/awk-build/mawk/build/awk/mawk -F"000[[:$bt]"

The bytes that cause the issue are (decimal byte values, all outside the ASCII range):
195,200-255

The template is -F"0[[:$byte]"
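
A sweep over byte values using this template could be scripted roughly as follows. This is only a sketch: the `MAWK` variable and the helper names (`make_fs`, `classify`, `probe`, `probe_range`) are made up for illustration, and the trivial `BEGIN` program is an assumption (the commands in this report passed no program text). An exit status above 128 is treated as death by signal, which covers both the segmentation fault and the abort shown above.

```shell
#!/bin/sh
# Sketch only: probe which byte values crash mawk with -F"0[[:$byte]".
# MAWK is a hypothetical knob; point it at the binary under test.
MAWK=${MAWK:-mawk}

# Print the field-separator regex for one decimal byte value.
# The inner printf renders the value as 3-digit octal; the outer
# printf turns the resulting \ooo escape into the raw byte.
make_fs() {
    printf "0[[:\\$(printf '%03o' "$1")]"
}

# Map an exit status to a label: statuses above 128 mean the shell
# saw the process die from a signal (e.g. 139 = SIGSEGV, 134 = SIGABRT).
classify() {
    if [ "$1" -gt 128 ]; then
        echo crash
    else
        echo no-crash
    fi
}

# Run mawk once with the separator built from byte value $1.
probe() {
    "$MAWK" -F"$(make_fs "$1")" 'BEGIN { exit 0 }' </dev/null >/dev/null 2>&1
    classify "$?"
}

# Probe every byte value from $1 to $2 inclusive, one result per line.
# The body runs in a subshell so the loop counter does not leak.
probe_range() (
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '%d %s\n' "$i" "$(probe "$i")"
        i=$((i + 1))
    done
)
```

With the functions loaded and `MAWK` set to the build being tested, `probe_range 128 255` prints one line per byte value, which makes it easy to compare against the list above.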

Original issue reported on code.google.com by dual...@gmail.com on 16 Dec 2014 at 7:37

GoogleCodeExporter commented 8 years ago
Oh, I forgot to add that this doesn't happen in the ancient mawk distributed with Debian.

Original comment by dual...@gmail.com on 16 Dec 2014 at 7:39

GoogleCodeExporter commented 8 years ago
The ancient mawk doesn't try to do character classes.

Original comment by dic...@his.com on 17 Dec 2014 at 2:06

ThomasDickey commented 7 years ago

This was fixed earlier this year:

$  mawk -F$0[[:\303]
mawk: line 0: regular expression compile failed (bad class -- [], [^] or [)
$0[[:\?C3]