NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
52.16k stars 5.91k forks source link

DumpFileLoader: IndexOutOfBoundsException #4353

Closed astrelsky closed 2 years ago

astrelsky commented 2 years ago

Describe the bug I hope it's not too soon to open an issue for this as it was recently added. Anyway, most of the time loading a dump file from procdump results in an IndexOutOfBoundsException.

To Reproduce Steps to reproduce the behavior:

  1. Try to import attached dump of demangler_gnu_v2_33_1 (couldn't find the decompiler process and it wouldn't let me do Ghidra)

Attachments

stacktrace ``` 2022-06-16 19:33:50 ERROR (ImporterUtilities) Error Importing File: Error importing file: demangler_gnu_v2_33_1.exe_220616_193343.dmp java.lang.IndexOutOfBoundsException: Specified length extends beyond file bytes length at ghidra.program.database.mem.MemoryMapDB.checkFileBytesRange(MemoryMapDB.java:697) at ghidra.program.database.mem.MemoryMapDB.createInitializedBlock(MemoryMapDB.java:660) at ghidra.app.util.MemoryBlockUtils.createInitializedBlock(MemoryBlockUtils.java:227) at ghidra.file.formats.dump.DumpFileLoader.loadRanges(DumpFileLoader.java:183) at ghidra.file.formats.dump.DumpFileLoader.parseDumpFile(DumpFileLoader.java:150) at ghidra.file.formats.dump.DumpFileLoader.load(DumpFileLoader.java:122) at ghidra.app.util.opinion.AbstractLibrarySupportLoader.doLoad(AbstractLibrarySupportLoader.java:356) at ghidra.app.util.opinion.AbstractLibrarySupportLoader.loadProgram(AbstractLibrarySupportLoader.java:84) at ghidra.app.util.opinion.AbstractProgramLoader.load(AbstractProgramLoader.java:115) at ghidra.plugin.importer.ImporterUtilities.importSingleFile(ImporterUtilities.java:368) at ghidra.plugin.importer.ImporterDialog.lambda$okCallback$7(ImporterDialog.java:351) at ghidra.util.task.TaskBuilder$TaskBuilderTask.run(TaskBuilder.java:306) at ghidra.util.task.Task.monitoredRun(Task.java:134) at ghidra.util.task.TaskRunner.lambda$startTaskThread$0(TaskRunner.java:106) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) ```

demangler_gnu_v2_33_1.exe_220616_193343.zip

Environment (please complete the following information):

Additional context I do have more, mostly enhancements I think, related to the new loaders. I don't want to open a ton of things that might just be making their way through the pipeline though.

d-millar commented 2 years ago

@astrelsky Am pretty sure that’s the same bug @ghidra1 had me fix today - will check tomorrow to be sure, but the fix is on its way. :)

astrelsky commented 2 years ago

@astrelsky Am pretty sure that’s the same bug @ghidra1 had me fix today - will check tomorrow to be sure, but the fix is on its way. :)

Ok great ty.

d-millar commented 2 years ago

OK, I lied - this was a new bug. Am putting in the fix now. If you need it right away, add IndexOutOfBoundsException to the catch on line 197 of DumpFileLoader::loadRanges.

astrelsky commented 2 years ago

OK, I lied - this was a new bug. Am putting in the fix now. If you need it right away, add IndexOutOfBoundsException to the catch on line 197 of DumpFileLoader::loadRanges.

Without looking this is probably what I did. I just added it to all the caught and logged exceptions for now.

d-millar commented 2 years ago

Yep - and that's exactly what the fix will be. If it's asking to add an image that exceeds the file length, we're just going to punt on that image.

astrelsky commented 2 years ago

Yep - and that's exactly what the fix will be. If it's asking to add an image that exceeds the file length, we're just going to punt on that image.

I have only had this not occur once. Then again I've only loaded about 4 or 5 dumps. I'm not saying something is necessarily wrong, it's just usual. (Yes usual, because it's a Microsoft format...)

d-millar commented 2 years ago

I think that might be OK - after your fix, did the remaining images load? I.e. I think, for anything less than a full dump, it’s likely some fraction of the libraries in the process tree will not be represented completely in the dump. So, if we punt on those, as long as the remainder are present, we’re doing the best we can. That said, I wouldn’t rule out a mistake on my part. Will re-check.

astrelsky commented 2 years ago

I think that might be OK - after your fix, did the remaining images load? I.e. I think, for anything less than a full dump, it’s likely some fraction of the libraries in the process tree will not be represented completely in the dump. So, if we punt on those, as long as the remainder are present, we’re doing the best we can. That said, I wouldn’t rule out a mistake on my part. Will re-check.

It did load and things seemed ok I think. I have encountered this on full dumps as well. I don't actually think anything is wrong, it's just strange.

d-millar commented 2 years ago

Revisited and (surprise) definitely an error on my part. Not doing the right thing for the CommentStreams (which were missing in my test examples), resulting in bad blocks that exceeded the file bounds. Fix in the queue....

astrelsky commented 2 years ago

@d-millar it turns out that the things I wanted to open issues for (as enhancements) were already there just hidden away in the loader options. Perhaps the Analyze Embedded Executables (interactive) option should be enabled by default? With it disabled you're pretty much left flying blind for all imported function calls.

ghidra1 commented 2 years ago

@astrelsky Making a change that will enable Analyze Embedded Executables (interactive) option by default for headed use. This option currently requires user interaction to select modules of interest via a pop-up dialog. If we were to support this option for headless use it would have to analyze ALL modules - although I will defer to @d-millar to determine if that would be reasonable to allow.

astrelsky commented 2 years ago

@astrelsky Making a change that will enable Analyze Embedded Executables (interactive) option by default for headed use. This option currently requires user interaction to select modules of interest via a pop-up dialog. If we were to support this option for headless use it would have to analyze ALL modules - although I will defer to @d-millar to determine if that would be reasonable to allow.

It made sense to me to select everything except the dump headers and memory sections. I do foresee situations where that wouldn't be the case but I'd expect those to be the edge cases and not the other way around.

d-millar commented 2 years ago

@ghidra1 has made the changes you suggested, i.e made the analyze option default to on for the non-headless mode. For full kernel dumps, you almost certainly do not want to select everything. You’ll be waiting for hours for processing to complete.

astrelsky commented 2 years ago

@ghidra1 has made the changes you suggested, i.e made the analyze option default to on for the non-headless mode. For full kernel dumps, you almost certainly do not want to select everything. You’ll be waiting for hours for processing to complete.

I'm sure it would take a while. I don't know anything about the format but would it be possible and/or reasonable to have what is enabled by default for analysis different for a mini user mode dump vs a full dump? Just a thought, I'm certainly happy with what I've got.

d-millar commented 2 years ago

Entirely possible, but not exactly sure what would be different. Are you thinking the analysis option would be on for all modules in headless mode for some variants?

astrelsky commented 2 years ago

Entirely possible, but not exactly sure what would be different. Are you thinking the analysis option would be on for all modules in headless mode for some variants?

Considering how the option api is laid out it might be problematic if the default settings are different depending on the file. I haven't taken a look yet and am assuming that it is the usual options class being used that works like a map. From both the command line and from a headless script how would the user know what they have to enable/disable or will this always require some user input at runtime?

My thoughts are beginning to mash together. I should shut up and call it a night. :sweat_smile:

d-millar commented 2 years ago

@ghidra1 has tweaked the options so they CAN be set per-dump-type now, so I feel like we can do whatever we feel is best. I just don’t have a good sense of what actually would be best. I don’t want to set the default to something most users won’t tolerate, and making a decision based on some arbitrary size cutoff feels a bit dodgy.