DGA-MI-SSI / YaCo

YaCo is a Hex-Rays IDA plugin. When enabled, multiple users can work simultaneously on the same binary. Any modification made by any user is synchronized through git version control.
GNU General Public License v3.0

Can we have several idbs in one git repo? #44

Open saidelike opened 6 years ago

saidelike commented 6 years ago

Hi,

Sorry for the dumb question. From my testing, the answer is no and afaict we have to create one git repo per idb? Is that correct?

I tried to add a 2nd idb in the same folder and it looks like it imported the comments from the first idb.

I also tried to create a subfolder in the original git repo and to use an "empty" path when it asks for the git repo, and it looks like it created another .git folder in the subfolder, as if it wasn't in a git repo already.

Thanks,

bamiaux commented 6 years ago

No, only one idb per repository is currently supported. However, adding support for multiple idbs is not hard to do: we just need to split the cache directory. The unfortunate side effect is that we will break backwards compatibility. As for adding a subfolder to an existing repo, it's not supported yet, but could be added too.

saidelike commented 6 years ago

OK thanks for confirming.

Ya the subfolder thing was a test I tried but it may not be the most urgent. It would allow YaCo to work in a git repo used for other stuff too, but right now we could just have the main git repo and a git submodule for IDBs with all of them in that git submodule. It would still be better than having one git repo per IDB and it would also avoid having the YaCo commits mixed in with the ones of the main repo.

I agree with the backwards compatibility problem but I feel it would really be a useful feature.

I am not sure if that would work or how it is handled atm, but let's say for now the path of the IDB is not stored because there is no need to (since there is only one and the idb name is the same as the original with "_local" appended). We could then have a special case in the YaCo plugin code: if the IDB path is not stored, we know it is the old format; otherwise it is the new format.

The new format could be a hierarchy of files like:

where the path.txt contains what idb path links to what 1,2,3 folder, etc.

instead of

bamiaux commented 6 years ago

I believe there is a simpler way

Backwards compatibility should be preserved. And maybe it's time to add a small json file containing the version number inside the cache directory.
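A minimal sketch of how such a version file could preserve backwards compatibility, assuming a cache without the file is treated as the old single-IDB format. The file name `version.json`, the `version` key, the numeric values and both helper functions are hypothetical and not part of YaCo.

```python
import json
import os
import tempfile

# Hypothetical names and values, chosen only for illustration.
VERSION_FILE = "version.json"
LEGACY_VERSION = 0   # assumed: cache created before the version file existed
CURRENT_VERSION = 1  # assumed: cache directory split per IDB


def read_cache_version(cache_dir):
    """Return the cache format version, treating a missing file as legacy."""
    path = os.path.join(cache_dir, VERSION_FILE)
    if not os.path.isfile(path):
        return LEGACY_VERSION
    with open(path, "r", encoding="utf-8") as fh:
        return int(json.load(fh)["version"])


def write_cache_version(cache_dir, version=CURRENT_VERSION):
    """Record the cache format version in a small JSON file."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, VERSION_FILE), "w", encoding="utf-8") as fh:
        json.dump({"version": version}, fh)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(read_cache_version(demo_dir))  # no version file yet: legacy cache
    write_cache_version(demo_dir)
    print(read_cache_version(demo_dir))  # version file now present
```

With this approach an existing repository keeps working unchanged, and the plugin only switches to the new layout once the file says so.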

bamiaux commented 6 years ago

Actually, sad news: using multiple idbs in one directory is very complex, because you would not be able to do the whole commit/fetch/rebase/push independently per base. When editing one IDB, you would be forced to fetch commits from the other IDB without being able to apply the changes. Maybe something is possible using git tags, but there are a lot of gotchas everywhere.

saidelike commented 6 years ago

Oh right good point.

Afaict branches would be more appropriate than tags. See explanation here.

We would need to create a branch (e.g using the same name as the idb name) for each idb.

git checkout -b <branch_name>

One problem I can think of already: since each idb would only exist in one branch, we would not be able to analyze two idbs at the same time (the files of every idb other than the one we want to analyze/open will "disappear" each time we switch branches). If that is not a constraint, then it would be quite simple as we would just need to switch to the branch before the commit

git checkout <branch_name>

and then afaict we can commit, fetch, rebase and push and it will be done for this specific branch only.
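The per-branch sequence described above could be sketched as follows. This only builds the git command lines and does not run them. The naming scheme (IDB file name without extension, unsafe characters replaced by `_`), the `origin` remote and both function names are assumptions made for illustration, not YaCo behaviour.

```python
import re


def branch_name_for(idb_name):
    """Map an IDB file name to a conservative git branch name (hypothetical scheme)."""
    stem = idb_name.rsplit(".", 1)[0]
    # Keep only characters that are always valid in a git ref name.
    safe = re.sub(r"[^A-Za-z0-9_-]+", "_", stem).strip("_-")
    if not safe:
        raise ValueError("cannot derive a branch name from %r" % idb_name)
    return safe


def sync_commands(idb_name, remote="origin"):
    """Return the git invocations for one IDB, in the order they would run."""
    branch = branch_name_for(idb_name)
    return [
        ["git", "checkout", branch],
        ["git", "fetch", remote],
        ["git", "rebase", remote + "/" + branch],
        ["git", "push", remote, branch],
    ]


if __name__ == "__main__":
    for cmd in sync_commands("my target.i64"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run` as is; the local commit itself would happen after the checkout and before the fetch.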

But ya, since files "disappear", we would still need to commit every idb at least once in the master branch so new users can see them by default and open them to create their own copy of the _local.idb file. After that they don't need to checkout master anymore, except for every new idb for which they haven't created a _local.idb yet.

Annoying :(

bamiaux commented 6 years ago

You cannot have multiple branches at the same time, but you can have multiple tags tracking the latest known state of each IDB cache. Instead of reading files directly, maybe something could be done by asking for the diff between the tag and the HEAD instead. That's why I mentioned tags. When you pull changes from the server, you may get changes on the other database. Its tag doesn't move, so we still know we will have to replay it.
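A rough sketch of the tag idea, assuming one tag per IDB and a cache directory split per IDB. The tag name, the cache sub-directory and all function names are hypothetical; only the `git diff --name-only` and `git tag --force` invocations are standard git.

```python
import subprocess


def pending_changes_cmd(tag, cache_subdir):
    """git command listing cache files changed between the tag and HEAD."""
    return ["git", "diff", "--name-only", tag, "HEAD", "--", cache_subdir]


def advance_tag_cmd(tag):
    """git command moving the tag to HEAD once the changes have been replayed."""
    return ["git", "tag", "--force", tag, "HEAD"]


def pending_changes(repo_dir, tag, cache_subdir):
    """Run the diff and return the changed paths (needs git and an existing tag)."""
    out = subprocess.run(
        pending_changes_cmd(tag, cache_subdir),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]
```

When an IDB is opened, the paths returned by `pending_changes` would be replayed into it and its tag then moved with `advance_tag_cmd`. The tags of the other IDBs stay where they are, so their changes remain pending until those IDBs are opened.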