NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Loaders and FileSystems could be able to state which file extensions they support and the file dialog would include those in its filters #7128

Open hippietrail opened 6 days ago

hippietrail commented 6 days ago

Is your feature request related to a problem? Please describe.
More of an enhancement, but sometimes when investigating poorly understood files or systems you don't know all the file types that Ghidra understands. Looking through directories or archives for executables or libraries would be easier if the file extension filter covered all known file types, including those added by Ghidra extensions.

Describe the solution you'd like
Loaders and FileSystems would each have a new function they can optionally override to return an array of file extensions. It would be optional because not all file types on all platforms make consistent use of file extensions.
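A rough sketch of what such an optional hook might look like. All names here (`FileExtensionHint`, `getSupportedFileExtensions`, and the two example classes) are hypothetical and do not exist in Ghidra; the real Loader and GFileSystem types are not used, so this only illustrates the "optional override" idea.

```java
// Hypothetical interface, not part of Ghidra: an optional hint that a
// Loader or FileSystem could provide about the extensions it recognises.
interface FileExtensionHint {
    // Default is "no hint", for formats with no consistent extension.
    default String[] getSupportedFileExtensions() {
        return new String[0];
    }
}

// Stand-in for a loader whose files usually carry an extension.
class ExamplePeLikeLoader implements FileExtensionHint {
    @Override
    public String[] getSupportedFileExtensions() {
        return new String[] { "exe", "dll" };
    }
}

// Stand-in for a loader whose files often have no extension at all;
// it simply keeps the default and contributes nothing to the filter.
class ExampleElfLikeLoader implements FileExtensionHint {
}
```

Because the method has a default body, existing implementations would not need to change unless they want to contribute extensions.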

Describe alternatives you've considered
Trying to remember all the file extensions, or double-clicking at random on files that look like they might have code in them.

Additional context
I'm willing to work on this. I would have to ask for guidance, as usual.

It would depend on the order of construction of the components of Ghidra. If the Loaders and FileSystems are loaded/enumerated before the file dialog is prepared then this feature will be a lot easier to implement.

ryanmkurtz commented 6 days ago

I'm not sure how useful these file extension filters are, since so many files we work with have no extension, and we rely on magic bytes to determine what they actually are.

hippietrail commented 6 days ago

> I'm not sure how useful these file extension filters are, since so many files we work with have no extension, and we rely on magic bytes to determine what they actually are.

Well yeah, replacing them with code that calls the probe() method of the FileSystems and the findSupportedLoadSpecs() method of the Loaders would be better, but it also seems like a lot of work and more CPU load. Then again, the Amiga did that 30 years ago, on far less powerful hardware, instead of relying on file extensions.
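For a sense of what byte probing involves, here is a minimal sketch that checks a file's leading bytes against the ELF magic number. `MagicProbe` and `looksLikeElf` are made-up names for illustration only; this is not Ghidra's probe() or findSupportedLoadSpecs(), and it assumes Java 11 or later for `readNBytes`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not Ghidra code: recognises ELF files by content.
class MagicProbe {
    // ELF files begin with the bytes 0x7F 'E' 'L' 'F'.
    private static final byte[] ELF_MAGIC = { 0x7F, 'E', 'L', 'F' };

    // Compares the start of an already-read header with the magic number.
    static boolean looksLikeElf(byte[] header) {
        return header.length >= ELF_MAGIC.length
            && Arrays.equals(Arrays.copyOf(header, ELF_MAGIC.length), ELF_MAGIC);
    }

    // Opens the file and reads only as many bytes as the magic number needs.
    static boolean looksLikeElf(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return looksLikeElf(in.readNBytes(ELF_MAGIC.length));
        }
    }
}
```

The catch is that a file dialog using this approach would have to open every file in a directory just to decide whether to list it, which is where the extra work comes from.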

My use case is that I'm developing a bunch of loaders and filesystems for old platforms, many of which I don't know that well. When I go looking for m68k binaries for those platforms to test the a-line and f-line trap support I'm working on in the Processor Sleigh, I don't always remember which files have code in them. Extension filtering hides the irrelevant files, with the caveat that file extensions are not a silver bullet.

The other option is just to ditch the file extension filtering since it doesn't always work?

ryanmkurtz commented 6 days ago

I will bring these options up to the team.

hippietrail commented 6 days ago

> I will bring these options up to the team.

I'd happily tackle the file extension easy version. The hard version requires too much knowledge I don't have so would be up to the team. Not very high priority obviously. Thanks for the feedback.

ryanmkurtz commented 6 days ago

What's the easy version?

hippietrail commented 6 days ago

> What's the easy version?

The one that only goes by filename extensions and leverages the existing code.

ryanmkurtz commented 5 days ago

We can rule out the possibility of the file dialog probing the bytes; that would indeed be too slow, especially on machines with aggressive antivirus products. My recommendation to the team will be to drop the filters, but if they want to keep them, I do like your idea of dynamically creating the set of extensions from every discovered Loader and GFileSystem.
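As a sketch of that last idea, the set of extensions could be built as a union over whatever components were discovered. `ExtensionSource` and `ExtensionFilterBuilder` are hypothetical names standing in for Ghidra's real Loader and GFileSystem discovery, which this example does not touch.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for any discovered component that reports extensions.
interface ExtensionSource {
    List<String> extensions();
}

class ExtensionFilterBuilder {
    // Union of every reported extension, lower-cased and sorted.
    static Set<String> collect(List<ExtensionSource> sources) {
        Set<String> all = new TreeSet<>();
        for (ExtensionSource source : sources) {
            for (String ext : source.extensions()) {
                all.add(ext.toLowerCase(Locale.ROOT));
            }
        }
        return all;
    }

    // True if the file name ends in one of the collected extensions.
    static boolean accepts(Set<String> extensions, String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension, so an extension filter cannot match
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return extensions.contains(ext);
    }
}
```

Files with no extension never match, which is the limitation raised earlier in the thread, so the dialog would still need an "all files" option.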