NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Merging a backup *.keep file into a versioned buffer file #4979

Closed Kevinn1109 closed 1 year ago

Kevinn1109 commented 1 year ago

I am currently decompiling a large binary file as a team effort, for which we use the built-in version control. However, one major disadvantage of the version control so far is that it is very inflexible: anything that does not follow the main path (open > edit > save) cannot be included in it, as far as I know. Normally that is not a major issue, but unfortunately that has now changed drastically.

Today the version-controlled file had to be hijacked and checked out again for an update, but when I took the steps to do so, Ghidra understandably reset the versioned file back to the latest one on the server. It also created a *.keep file with all my changes, which I thought was fine. Because I already have experience with the inflexibility of the version control, I started having second thoughts, but by then it was too late, and in the hour that followed I could not find any way to get my uncommitted changes back into the version-controlled buffer. It's as if git did not have the possibility to pop a stash.

Is there any method to diff the original checked-out version with the latest local version so that all changes in this local version are added to the version control? So far my best results came from the built-in Version Tracking tool, but that tool seems to be a bit too limited for this specific purpose, since I also need struct and function variable edits to come through. I have also experimented with altering the contents of the idata folder in the repository, but so far it seems that the map change and data change files are required for the version control to work and cannot be generated manually.

If there is no such thing as a manual merge tool, I'd like to propose one. A modern version control system should not be limited to just editing the file itself; it should also support external changes. This would make working in teams a lot simpler.

ghidra1 commented 1 year ago

There is a Listing Diff tool, which is part of the CodeBrowser, that can be used to merge markup differences into a target program. Hopefully you are not contending with other changes to the versioned program, because there is no "conflict" handling. In addition, only data types referenced by diff-merged functions and data code units will be copied, although I don't recall if it will produce .conflict types in the process. The Diff can be initiated with the target program active by using the Listing toolbar button (see Help content for more details).
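
To make the "no conflict handling" caveat concrete, here is a minimal conceptual sketch in Python. It does not use the Ghidra API: the dictionaries, the `Difference` type, and the functions `find_differences` and `apply_all` are hypothetical names invented for illustration, and each program is reduced to a simple address-to-label map. It only models the idea of taking every difference from one program and applying it to another.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Difference:
    """One address where the two toy programs disagree (hypothetical type)."""
    address: int
    target_value: Optional[str]  # markup currently in the fresh checkout
    source_value: Optional[str]  # markup found in the .keep copy


def find_differences(target: dict[int, str], source: dict[int, str]) -> list[Difference]:
    """List every address whose markup differs between the two programs."""
    addresses = sorted(set(target) | set(source))
    return [
        Difference(a, target.get(a), source.get(a))
        for a in addresses
        if target.get(a) != source.get(a)
    ]


def apply_all(target: dict[int, str], diffs: list[Difference]) -> dict[int, str]:
    """Copy the source side of every difference into a copy of the target.

    There is no conflict handling in this model: whatever the target held
    at a differing address is overwritten or removed.
    """
    merged = dict(target)
    for d in diffs:
        if d.source_value is None:
            merged.pop(d.address, None)
        else:
            merged[d.address] = d.source_value
    return merged


# Hypothetical data: address -> label.
fresh_checkout = {0x1000: "FUN_00001000", 0x2000: "parse_header", 0x3000: "teammate_label"}
keep_copy = {0x1000: "init_device", 0x2000: "parse_header"}

diffs = find_differences(fresh_checkout, keep_copy)
merged = apply_all(fresh_checkout, diffs)
```

In this toy run the renamed label at 0x1000 comes across, but the label a teammate added at 0x3000 is silently dropped because the source copy never had it. That is the kind of loss to watch for when other people have changed the versioned program in the meantime.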

Kevinn1109 commented 1 year ago

> Backup files are always being produced as you make changes to allow recovery from a hard failure (i.e., termination) of the Ghidra application. When such a file is present, re-opening the file will ask if you would like to restore.

I do not recall being asked about restoring a backup file when opening the main file. I was asked if I wanted to create a backup file when I performed the hijack, and that's the file I have been trying to merge back into the main file. This could be explained by the fact that I switched to a new version of Ghidra in the process though, since the entire application was reset by the update. The repository itself was left the same before opening it in the new version. So if the repository is supposed to trigger the prompt, it did not work or it's overruled by the prompt asking to update the repository.

ghidra1 commented 1 year ago

What do you mean by "performed hijack"? We only have an "undo hijack" when one occurs, since the private hijack file is hiding the repository file with the same name.

A hijack state simply occurs from one of these actions:

  1. A local project, while not connected to the Ghidra Server, creates a local project file that happens to conflict with one in the server repository. When you connect to the repository, the private file would then appear as a hijacked file.
  2. A checkout is forcibly terminated from the View Checkouts display by a repository Admin.

A hijack condition should never occur when a file is checked-out within your project. If a checked-out file transitions to a hijack state, that is unexpected unless the checkout was forcibly terminated by a repository admin or some other bad state was induced.
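
The two causes above can be summarized in a small conceptual model. This is only a sketch in Python under simplifying assumptions, not Ghidra code: `ProjectFile`, its three boolean fields, and `file_state` are hypothetical names, and real project state involves more than three flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectFile:
    """Simplified, hypothetical view of one file name in a shared project."""
    name: str
    has_local_copy: bool  # a file with this name exists in the local project
    in_repository: bool   # the server repository has a file with this name
    has_checkout: bool    # the server still records a checkout for this user


def file_state(f: ProjectFile) -> str:
    """Classify a file using the rules described in the comment above."""
    if f.has_local_copy and f.in_repository:
        # A local copy backed by a checkout is the normal case. Without a
        # checkout, the private file hides the repository file: a hijack.
        return "checked-out" if f.has_checkout else "hijacked"
    if f.has_local_copy:
        return "private"
    if f.in_repository:
        return "versioned"
    return "missing"


# Cause 1: a file created while offline collides with a repository name.
created_offline = ProjectFile("firmware.bin", True, True, False)
# Cause 2: an admin terminated the checkout, so the record is gone.
terminated = ProjectFile("firmware.bin", True, True, False)
# Normal case: the checkout record still exists.
normal = ProjectFile("firmware.bin", True, True, True)

states = [file_state(created_offline), file_state(terminated), file_state(normal)]
```

In this model both causes end in the same observable condition, a local file with no checkout behind it, which is why a hijack should not appear while a checkout is still intact.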

ghidra1 commented 1 year ago

I removed my comment regarding .keep files. I mixed this up with the restore files. A .keep file is created at the request of the user when performing an Undo Checkout - the user intentionally terminates their checkout and is asked if they would like to keep a copy of their current private file. There is no way to incorporate changes from an abandoned checkout other than using Diff between a new checkout and the .keep file, if one was retained.

Kevinn1109 commented 1 year ago

> What do you mean by "performed hijack"? We only have an "undo hijack" when one occurs, since the private hijack file is hiding the repository file with the same name.

By that I meant the steps to obtain an exclusive checkout. Others will see "hijacked", so I assumed that was also the term for it.

The listing merge seems to have done the job, thanks for the help with that. Would some kind of automated merger be possible? My initial idea was to drag the keep file onto the main file in the main window, which did nothing. This action could then potentially open a dialog where some settings can be modified before a full merge happens behind the scenes. An alternative could also be to have the option to generate restore files, though I'm not sure if that would convert to a new Ghidra version.

ghidra1 commented 1 year ago

You cannot get an exclusive checkout if others have the same file checked-out. Other users would only get a hijacked file if you forcibly terminated their checkout, which should be avoided if at all possible. You do not want to induce a hijack condition for another user since they could lose their work.

There is no automatic merge for Diff since it is lossy and potentially error-prone. It is easy enough in Diff to select all differences and apply them if that is what a user wants.

There is no way to safely recover versioned changes that are lost due to a terminated checkout and the induced hijack that results. Exclusive checkouts exist because merge cannot support a class of changes that require an exclusive checkout (e.g., memory changes, upgrades, etc.).