NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

C parser choking with vague error explanation #4440

Closed Schala closed 2 years ago

Schala commented 2 years ago

Describe the bug

I'm trying to parse the headers of Apple's old Universal Interfaces version 3.4. With Ghidra's supplied objc_mac_carbon.prf, the parser chokes on MacTypes.h and writes the following to application.log. The same error occurs when I parse the headers with a custom profile set up for Mac OS 9's Metrowerks CodeWarrior.

2022-07-13 11:05:37 WARN  (CParser) ghidra.app.util.cparser.C.TokenMgrError: Lexical error at line 98, column 23.  Encountered: "=" (61), after : "" ghidra.app.util.cparser.C.TokenMgrError: Lexical error at line 98, column 23.  Encountered: "=" (61), after : ""
    at ghidra.app.util.cparser.C.CParserTokenManager.getNextToken(CParserTokenManager.java:3894)
    at ghidra.app.util.cparser.C.CParser.jj_ntk(CParser.java:7869)
    at ghidra.app.util.cparser.C.CParser.PragmaSpecifier(CParser.java:1705)
    at ghidra.app.util.cparser.C.CParser.PragmaSpec(CParser.java:1689)
    at ghidra.app.util.cparser.C.CParser.ExternalDeclaration(CParser.java:768)
    at ghidra.app.util.cparser.C.CParser.TranslationUnit(CParser.java:750)
    at ghidra.app.util.cparser.C.CParser.parse(CParser.java:630)
    at ghidra.app.plugin.core.cparser.CParserPlugin.parse(CParserPlugin.java:384)
    at ghidra.app.plugin.core.cparser.CParserTask.run(CParserTask.java:70)
    at ghidra.util.task.Task.monitoredRun(Task.java:134)
    at ghidra.util.task.TaskRunner.lambda$startTaskThread$0(TaskRunner.java:106)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:830)

2022-07-13 11:05:37 INFO  (CParserTask$3) Parse Errors: C Parser:  Problem Parsing.
          in C:\Users\cmmeu\Downloads\Interfaces&Libraries\Interfaces\CIncludes\MacTypes.h near line 21
         Last Valid Datatype: - none -
         Check around CParserPlugin.out around line: 98

2022-07-13 11:05:37 ERROR (PackedDatabase) Failed to dispose PackedDatabase - it may still be in use!
C:\Users\cmmeu\AppData\Local\Temp\univint.gdt java.lang.Exception
    at ghidra.framework.store.db.PackedDatabase.dispose(PackedDatabase.java:331)
    at ghidra.program.model.data.FileDataTypeManager.close(FileDataTypeManager.java:233)
    at ghidra.app.plugin.core.cparser.CParserTask.run(CParserTask.java:140)
    at ghidra.util.task.Task.monitoredRun(Task.java:134)
    at ghidra.util.task.TaskRunner.lambda$startTaskThread$0(TaskRunner.java:106)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:830)

To Reproduce

Steps to reproduce the behavior: With a copy of Apple Universal Interfaces version 3.4 (terms of use prohibit redistribution, but a copy came with my old CodeWarrior purchase), add the include path to objc_mac_carbon.prf and save it.

Expected behavior

See the application.log output above.

Attachments

License of the offending file prohibits redistribution.

Schala commented 2 years ago

I forgot to mention, my custom profile has these compiler switches. The value of __MWERKS__ corresponds to CodeWarrior version 8.0.

-D__MWERKS__=0x3000
-Dmacintosh
-Dpowerc
-DTARGET_CARBON

emteere commented 2 years ago

What is in the MacTypes.h file around line 21? Does it look like normal preprocessor macros, or C? If the file is C++, the current CParser cannot parse it.

Can you post the CParserPlugin.out lines up to about 150?

Schala commented 2 years ago

It's all C code, aside from a few files that have C++ class wrappers behind #ifdef __cplusplus clauses. I did take a look at ConditionalMacros.h though, which is used by MacTypes.h, and it has a few weird custom directives such as #system and #cpu, as well as various occurrences of #pragma import or #pragma align(mac68k). The parser doesn't seem to ignore code inside #ifdefs when the symbol is absent. I think that's the biggest issue.

From CParserPlugin.out, lines 329-335:

/// #if 1 ===true

 #pragma options align=mac68k

/// #else if 0 ===false
/// #else if 0 ===false
/// #endif ===false

emteere commented 2 years ago

A simple quick fix is to make a copy of the header files and edit out the offending lines, probably the "#pragma" line above and other compiler-specific lines. You might be able to -D define things in the parser options that will null out some lines. Sometimes even defining certain keywords can help, but not always.
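
The copy-and-edit workaround can be scripted. The following is a minimal sketch, not part of Ghidra: the function names `strip_pragmas` and `copy_headers_without_pragmas` are hypothetical, and it assumes that only single-line `#pragma` directives need to go (backslash-continued pragmas and custom directives such as `#system` or `#cpu` are not handled). It works on raw bytes so that the classic Mac OS line endings (CR) in the original headers survive, and it replaces each pragma with a comment so line numbers stay the same.

```python
import re
from pathlib import Path

# Matches a line that begins with "#pragma", allowing spaces or tabs around '#'.
PRAGMA_RE = re.compile(rb"^[ \t]*#[ \t]*pragma\b")


def strip_pragmas(data: bytes) -> bytes:
    """Replace every single-line #pragma with a comment, keeping line endings."""
    out = []
    # bytes.splitlines splits on \n, \r and \r\n, so CR-only Mac files work too.
    for line in data.splitlines(keepends=True):
        if PRAGMA_RE.match(line):
            body = line.rstrip(b"\r\n")
            ending = line[len(body):]  # original line terminator, if any
            out.append(b"/* pragma removed */" + ending)
        else:
            out.append(line)
    return b"".join(out)


def copy_headers_without_pragmas(src_dir, dst_dir) -> int:
    """Copy every .h file under src_dir into dst_dir with pragmas stripped.

    Returns the number of headers written. The originals are never modified.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    count = 0
    for header in src.rglob("*.h"):
        target = dst / header.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(strip_pragmas(header.read_bytes()))
        count += 1
    return count
```

Calling `copy_headers_without_pragmas` with the CIncludes folder as the source and a new, separate folder as the destination, then pointing the parse profile's include path at that destination, leaves the licensed originals untouched.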

The ifdefs are evaluated. The code in between would be ignored if the macro surrounding that line evaluated to 0, or was not defined. Any value other than zero in a macro evaluates to true.
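
The rule above can be modelled in a few lines. This is a toy illustration only, not Ghidra's preprocessor: `parse_defines` and `if_branch_is_kept` are hypothetical helpers, a bare `-DNAME` is assumed to define the name as 1 (the usual compiler convention), and only plain integer values are handled.

```python
def parse_defines(switches):
    """Turn -D switches into a name -> value mapping (values kept as strings)."""
    defines = {}
    for switch in switches:
        if not switch.startswith("-D"):
            continue
        name, sep, value = switch[2:].partition("=")
        # Assumption: "-DNAME" with no "=" means NAME is defined as 1.
        defines[name] = value if sep else "1"
    return defines


def if_branch_is_kept(name, defines):
    """Model `#if NAME`: kept only when NAME is defined and non-zero."""
    value = defines.get(name)
    if value is None:
        return False  # an undefined name evaluates to 0 inside #if
    # int(..., 0) accepts decimal, hex (0x...) and octal (0o...) literals.
    return int(value, 0) != 0


switches = ["-D__MWERKS__=0x3000", "-Dmacintosh", "-Dpowerc", "-DTARGET_CARBON"]
defines = parse_defines(switches)
kept = {name: if_branch_is_kept(name, defines)
        for name in ("__MWERKS__", "macintosh", "__cplusplus")}
```

With the switches from the custom profile earlier in the thread, `__MWERKS__` and `macintosh` count as true, while `__cplusplus`, which is never defined, counts as false, so code guarded by it would be skipped.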

Schala commented 2 years ago

I'll give that a try and edit this as I see fit