d3v1l401 / FindCrypt-Ghidra

IDA Pro's FindCrypt ported to Ghidra, with an updated and customizable signature database
https://d3vsite.org/
GNU General Public License v3.0
520 stars 51 forks source link

Issue with detecting crypto algorithms #7

Closed pawlos closed 4 years ago

pawlos commented 4 years ago

I was playing a bit with the plugin and found another case/issue. I did poked around in the plugin code and database and I'm not sure how it supposed to work. I have a binary that utilized SALSA20 and Blake2. I'm seeing the IV constants in the code image and although FindCrypt has them defined in the DB the plugin doesn't report anything.

What I noticed that the BLAKE2 (or any other) constants are passed to findBytes as one 64-byte long array and of course since they are not in one place they are not correctly discovered. I get that it detects the crypto if those constants are stored in one locations as a continuous bytes.

Was that done on purpose? I think it would be better to search for separate const values but that would probably report same algo in multiple - closely located - addresses. To mitigate that those could be compacted if for example they are withing one function scope.

Alternatively, each array could be also represented as a separate consts values but that would bloat the DB.

Would that be ok, if I try to come up with updated DB and script to cover those those cases and send PR? Any preferred solution?

d3v1l401 commented 4 years ago

I will look into it too whenever I have some time, but if you'd feel like contributing, you're more than welcome to help me :)

Looking forward for your solution.

It was actually a suspect of mine that in some cases scattered byte patterns would not be found, it would be a good idea as well to fragment patterns by a certain amount of bytes and search single ranges and combining sequential results.

pawlos commented 4 years ago

Thanks for replay. I was actually started working on that but taking a bit more engineering approach so starting from some design and not jumping into the code too soon :).

Firstly, I wanted to have ability to easily test so I've modified the script to be able to run it in headless mode so that I can verify if my modification of the script did not break the existing functionality.

So to sum up, I'm working on this ticket (you can assign me if you want) it but also not sure how much time I will have due to the current situation.

d3v1l401 commented 4 years ago

I understand, we're all in a tough spot these last weeks, take your time and most importantly stay safe :)

pawlos commented 4 years ago

Just as a heads ups. I've got something working (I've have a dedicated branch if you want to have a look) but with the new mode it hits a lot of false positives so I'm trying to limit those.

Looking closer into the project I've few questions and maybe you could answer some of those:

  1. I've noticed another project in the sln - FCETreeExporter. Is it ok to get rid of this, or is this something useful and it was just not committed to the repo with the other project?
  2. Compression - does it work? Was it used? I think currently the DB was not using it - even before adding TEA_ALTERNATIVE consts.
d3v1l401 commented 4 years ago
  1. The project can be removed, it was a different using technique used to store onto the database, but would've made customization to complex, so I just removed it.
  2. Compression is there but I noticed sometimes it fails with certain constants, so I wanted to disable it by default.