Closed balki closed 1 year ago
Yes, this is slightly tricky, isn't it? I'd rather not introduce new syntax and range pattern types above and beyond POSIX here, so I'd suggest not using a range pattern for this, but two patterns with a flag. Similar to your endcond solution but a bit simpler (and one line :-).
You have to be careful with the order of the patterns, putting the /^ts:/ { e=0 } pattern-action first, so that /Got Excep/ { e=1 } sets e to 1 for that first line before the e { print } pattern is evaluated, and the "Got Exception" line is printed:
$ goawk '/^ts:/ { e=0 } /Got Excep/ { e=1 } e { print }' data.txt
ts:Jan 11 12:16:33 ERROR Got Exception in module foo
1. Traceback (most recent call last):
1. File "/tmp/teste.py", line 9, in <module>
1. run_my_stuff()
1. NameError: name 'run_my_stufff' is not defined
ts:Jan 11 12:17:33 ERROR Got Exception in module foo
2. Traceback (most recent call last):
2. File "/tmp/teste.py", line 9, in <module>
2. run_my_stuff()
2. NameError: name 'run_my_stufff' is not defined
You can even shorten it slightly more by dropping the { print } on the last pattern, as that's the default:
$ goawk '/^ts:/ { e=0 } /Got Excep/ { e=1 } e' data.txt
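To see why the order of the patterns matters, here is a minimal sketch that runs both orderings against a small log. The log contents are hypothetical (only the shape matters), and plain awk is used on the assumption that any POSIX awk behaves the same as goawk for this program:

```shell
# Hypothetical sample log: ts: lines, with a traceback after the error line.
log='ts:Jan 11 12:15:33 INFO starting
ts:Jan 11 12:16:33 ERROR Got Exception in module foo
Traceback (most recent call last):
NameError: name x is not defined
ts:Jan 11 12:16:40 INFO done'

# Reset first, then set: the error line and its traceback are printed.
good=$(printf '%s\n' "$log" | awk '/^ts:/ { e=0 } /Got Excep/ { e=1 } e')

# Set first, then reset: the error line matches both patterns, so e is
# back to 0 before the final pattern is tested, and nothing is printed.
bad=$(printf '%s\n' "$log" | awk '/Got Excep/ { e=1 } /^ts:/ { e=0 } e')

printf 'good:\n%s\n\nbad:\n%s\n' "$good" "$bad"
```

With the reset first, the output is the error line plus the two traceback lines. With the order swapped, the output is empty, because the flag never survives past the error line itself.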
The Gawk manual also has a couple of examples for range patterns that might be useful (though they don't quite fit what you're doing here).
Hope that helps!
Thanks! It's not obvious at first glance, but it's concise and clear.
$ goawk '/^ts:/ { e=0 } /Got Excep/ { e=1 } e' data.txt
Example log file
I am trying to extract the error line Got Exception in module foo along with the following traceback.
First attempt:
This does not work because the end range expression /^ts:/ also matches the error line, so the range begins and ends on that single line. There is no easy way to match the last line of the exception or the next log line. I finally found a working solution, but it is no longer a one-liner and is not straightforward to understand.
Solution:
Can we have a command line flag or special syntax such that the end pattern is not checked if it is the first line in the range? e.g.
or use a double comma (,,) to enable this behavior. This is currently a syntax error, so it should be backwards compatible.
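For reference, the single-line range described in this issue can be reproduced with a small sketch. The exact first attempt is not shown here, so the range pattern below is an assumed form of it, the log contents are hypothetical, and plain awk stands in for goawk:

```shell
# Hypothetical sample log: ts: lines, with a traceback after the error line.
log='ts:Jan 11 12:15:33 INFO starting
ts:Jan 11 12:16:33 ERROR Got Exception in module foo
Traceback (most recent call last):
NameError: name x is not defined
ts:Jan 11 12:16:40 INFO done'

# Assumed form of the range attempt: begin at the error line, end at a ts: line.
# The error line itself starts with ts:, so the range opens and closes on it.
range=$(printf '%s\n' "$log" | awk '/Got Excep/,/^ts:/')

printf '%s\n' "$range"
# prints: ts:Jan 11 12:16:33 ERROR Got Exception in module foo
```

Only the error line comes out, because awk tests the end pattern on the same record that opened the range.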