Vector35 / warp

Common format for transferring and applying function information across binary analysis tools

Add versioning to signatures #23

Open williballenthin opened 4 days ago

williballenthin commented 4 days ago

As I read the FB file format, I saw that there are still some areas for improvement (naturally!) yet the format doesn't seem to have a version field. Has this been considered yet? I'm afraid if it hasn't then it will be challenging to know which signatures can be used with which Warp implementation (of course today there is just one, but...).

emesare commented 4 days ago

Originally we had a version tag for each entry; I am still deciding where to actually put the version. Storing it on each individual Function is probably where it should go, which is also where it was originally.

https://github.com/Vector35/warp/blob/bae737907cc618569480d82cbd0c705eb8a2486b/signature.fbs#L33-L39

If we want to version just the file, then it would be enough to add it here:

https://github.com/Vector35/warp/blob/bae737907cc618569480d82cbd0c705eb8a2486b/signature.fbs#L41-L44

It should be noted that the current files are compressed and the same idea could be applied there (i.e. to use the version id to denote compressed vs uncompressed):

https://github.com/Vector35/warp/blob/bae737907cc618569480d82cbd0c705eb8a2486b/rust/signature.rs#L120-L121

With that being said, the easiest thing to do would be to add a version field to the Function table. If we want to version whole files, at that point I think it would also be a good idea to nest the Data table in another table that describes the transforms on the raw bytes (i.e. whether it is compressed and, if so, with which compression algorithm).

table File {
    version:FileVersion;
    compress:CompressionType;
    data:[ubyte];
}
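To make the wrapper idea concrete, here is a minimal Python sketch of how a reader might handle such an envelope. It is only an illustration under assumptions: it uses a plain three-byte header instead of FlatBuffers, the names `pack_file`, `unpack_file`, and `SUPPORTED_VERSIONS` are hypothetical, and zlib is a stand-in for whatever compression the real files use.

```python
import struct
import zlib
from enum import IntEnum


class CompressionType(IntEnum):
    # Hypothetical values; zlib is only a stand-in algorithm.
    NONE = 0
    ZLIB = 1


FILE_VERSION = 1            # version written by this sketch
SUPPORTED_VERSIONS = {1}    # versions this sketch knows how to read

# Assumed header layout: version (u16, little endian), compression (u8).
_HEADER = struct.Struct("<HB")


def pack_file(data: bytes, compress: CompressionType) -> bytes:
    """Prefix the (optionally compressed) payload with version and compression."""
    payload = zlib.compress(data) if compress is CompressionType.ZLIB else data
    return _HEADER.pack(FILE_VERSION, int(compress)) + payload


def unpack_file(blob: bytes) -> bytes:
    """Check the header, then undo the transform it describes."""
    if len(blob) < _HEADER.size:
        raise ValueError("file too short to contain a header")
    version, compress = _HEADER.unpack_from(blob)
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported file version {version}")
    kind = CompressionType(compress)  # raises ValueError for unknown types
    payload = blob[_HEADER.size:]
    if kind is CompressionType.ZLIB:
        return zlib.decompress(payload)
    return payload
```

The point of the design is that a consumer can reject a file it does not understand before touching the payload, instead of failing somewhere inside decompression or parsing.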

emesare commented 4 days ago

> I'm afraid if it hasn't then it will be challenging to know which signatures can be used with which Warp implementation

That is an interesting view on the use of WARP. Currently our goal is to align the signatures for use in all tools, but as more things come up it seems as though that is possibly a pipe dream (at least for a subset of use cases).

Following that train of thought, it might be beneficial to not only label a simple version, but also provide some way to say "What did we mask". For example:

Say that a tool cannot identify effective NOPs (i.e. it cannot mask an instruction that sets a register to itself). It would be unable to use any signature that applied that mask operation at least once, but it would be totally fine using any other signature that never made use of that operation.

If we had a bitflag field where each distinct mask operation sets a bit, we could quickly filter out signatures that are unsupported by the consumer integration. I am not exactly sure how this would work when we start networking signatures; I would expect some handshake to occur that would give the supported operations to the server as a filter.
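A rough Python sketch of that filter, to show the bit arithmetic involved. Everything here is hypothetical: the `MaskOp` flags other than the effective-NOP case are made up for illustration, and `Signature`, `is_usable`, and `filter_signatures` are not part of WARP.

```python
from dataclasses import dataclass
from enum import IntFlag


class MaskOp(IntFlag):
    # Hypothetical flag assignments; only EFFECTIVE_NOPS comes from the
    # example discussed above.
    RELOCATABLE_OPERANDS = 1 << 0
    EFFECTIVE_NOPS = 1 << 1


@dataclass(frozen=True)
class Signature:
    name: str
    mask_ops: MaskOp  # every mask operation applied at least once


def is_usable(sig: Signature, supported: MaskOp) -> bool:
    """True if the signature used no mask operation the consumer lacks."""
    return (int(sig.mask_ops) & ~int(supported)) == 0


def filter_signatures(sigs: list[Signature], supported: MaskOp) -> list[Signature]:
    """Keep only the signatures the consumer can reproduce."""
    return [sig for sig in sigs if is_usable(sig, supported)]


# A consumer that cannot identify effective NOPs.
consumer = MaskOp.RELOCATABLE_OPERANDS
candidates = [
    Signature("plain", MaskOp(0)),
    Signature("nop_masked", MaskOp.EFFECTIVE_NOPS),
    Signature("reloc_masked", MaskOp.RELOCATABLE_OPERANDS),
]
usable = filter_signatures(candidates, consumer)
```

In a networked setup, the `supported` value is what the client would presumably send during the handshake, so that the server can apply the same check before returning anything.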

Anyway, this is just a thought separate from adding a version field.