Open saadkadhi opened 11 years ago
Saad,
Thanks for catching these bugs! There’s definitely some ugly parsing code to handle all of the edge-case XML schema produced by some of the audits, and I really appreciate you taking the time to provide these fixes. I’ll roll your patches into an upcoming update – and I also have some longer-term plans to do more extensive refactoring. Hope you’ve found it useful now that it’s running without crashes!
-Ryan
From: Saad Kadhi [mailto:notifications@github.com]
Sent: Wednesday, November 21, 2012 5:50 AM
To: mandiant/AuditParser
Subject: [AuditParser] Crash due to UTF-8 issues while processing Redline audit files generated on non-US Windows operating systems (#1)
Hi Ryan,
I've tried AuditParser against audit files produced by Redline 1.7 (comprehensive collector, default settings) on a non-US Windows operating system.
The program crashed with the following error while processing mir.w32apifiles:
Traceback (most recent call last):
  File "/tools/AuditParser/AuditParser.py", line 482, in <module>
    main()
  File "/tools/AuditParser/AuditParser.py", line 472, in main
    else: parseXML(inFile,outFile)
  File "/tools/AuditParser/AuditParser.py", line 217, in parseXML
    writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 68-77: ordinal not in range(128)
There are a few row.append() instances that should use encode("utf-8"). Once fixed, the program runs smoothly until it hits mir.w32scripting-persistence. It dies with the following error:
Parsing input file: 20121109064423/mir.w32scripting-persistence.60254e2b.xml
    main()
  File "/tools/AuditParser/AuditParser.py.new", line 471, in main
    if (filename.find("persistence") > 0): parsePersistence(inFile, outFile)
  File "/tools/AuditParser/AuditParser.py.new", line 297, in parsePersistence
    row[i] = rowValue.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 8: ordinal not in range(128)
This second issue is due to an extraneous encode("utf-8") in line 250:
row.append(rowData.encode("utf-8"))
Once removed, AuditParser.py processes all files without a hiccup.
I've made a diff to fix the issues: http://pastebin.com/SmRKR6sR
Best Regards, Saad Kadhi (@_saadk)
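For readers following along, the two crashes above are mirror images of each other. The first happens when text containing non-ASCII characters is pushed through the ASCII codec on output. The second happens when a value that has already been encoded to UTF-8 bytes is encoded again: in Python 2, calling encode() on a byte string first decodes it implicitly as ASCII, which fails on the first byte above 0x7f. The sketch below reproduces both error types in Python 3 and shows one way to avoid them by encoding exactly once, at the output boundary. It is not AuditParser's code; the sample value and the variable names are made up for illustration.

```python
import csv
import io

value = "Программы"  # made-up non-ASCII cell value, for illustration only

# First crash: non-ASCII text forced through the ASCII codec.
try:
    value.encode("ascii")
except UnicodeEncodeError as exc:
    first_error = type(exc).__name__

# Second crash: bytes that are already UTF-8 get treated as ASCII text.
# (Python 2 did this implicitly when encode() was called on a byte string.)
encoded = value.encode("utf-8")
try:
    encoded.decode("ascii")
except UnicodeDecodeError as exc:
    second_error = type(exc).__name__

# One way around both: keep values as text internally and let the
# output stream encode them, once, when the row is written.
buffer = io.BytesIO()
wrapper = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
csv.writer(wrapper).writerow(["path", value])
wrapper.flush()
output_bytes = buffer.getvalue()

print(first_error, second_error)
print(output_bytes.decode("utf-8"), end="")
```

The point of the design is that there is a single place where text becomes bytes, so a value can be neither skipped nor encoded twice.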
You are very welcome. And thanks to you for putting time and energy into releasing AuditParser. I am really glad to hear that you are going to continue improving & maintaining the code.
I am still encountering crashes, but these come from lxml ("char out of range" errors raised when it encounters non-printable characters). If you have already been bitten by this kind of edge case, I'd be glad to hear how you solved it.
Cheers, Saad Kadhi (@_saadk)
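One common workaround for this class of error, sketched here under assumptions and not taken from AuditParser, is to strip characters that XML 1.0 does not allow before handing the text to the parser. The example uses the standard library's xml.etree.ElementTree so that it runs without lxml installed; the same pre-filtering step would apply before an lxml parse. The helper name strip_invalid_xml_chars and the FileItem/Path element names are hypothetical. Note that this only handles raw characters, not numeric character references such as `&#x1;`, and that silently dropping bytes may not be acceptable for forensic data where the original value matters.

```python
import re
import xml.etree.ElementTree as ET

# Matches any character outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Return text with characters that XML 1.0 forbids removed."""
    return _INVALID_XML_CHARS.sub("", text)


# Hypothetical audit fragment containing a NUL and an ESC character.
raw = "<FileItem><Path>C:\\temp\\bad\x00name\x1b.txt</Path></FileItem>"

clean = strip_invalid_xml_chars(raw)
root = ET.fromstring(clean)
print(root.findtext("Path"))
```

If the offending characters need to be preserved for analysis, replacing them with a visible placeholder instead of an empty string is a small change to the sub() call.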