Closed criminosis closed 1 year ago
@bojand Put up my PR for your consideration https://github.com/bojand/infer/pull/87
Again I'm not sure what the original Java matcher was for, but happy to put it back as an additional check if that was known to match a particular case.
If you're good with the change and merge it, would you mind cutting a new release of Infer with it?
I was running Infer over some Java class files and wasn't getting a hit as Java, but instead as `application/x-mach-binary`. Setting aside the magic byte collision with Mach-O's definition for the moment, I'm not sure where Infer's Java magic byte is coming from. According to the spec, Java class files start with `0xCAFEBABE`. I tried looking into the history of where Infer's current Java magic byte came from, but it looks like it has been there since the initial commit to this repo, and there didn't seem to be any more context or any other open issues.

This does mean a possible corrected Java matcher would collide with Mach-O's matcher.
@bojand I'd be happy to put up a PR to fix both issues if there's appetite for it. This comment in the Mach-O matcher is already referencing a post detailing how `libmagic` gets around this magic byte collision.

After the common magic byte, Java devotes 2 bytes to a minor version and then 2 bytes to a major version of the class file. Major versions start at 45; versions below 45 predate Java 1.1 and are presumed to be from its prehistoric Oak period.
Mach-O devotes the full 4 bytes to specifying the number of multi-arch entries in the "fat" file. There are 18 defined architectures, AFAIK, that a Mach-O archive could contain at this time.
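To make the two layouts concrete, here is a minimal sketch that reads the four bytes after the shared magic both ways. The names `AfterMagic` and `read_after_magic` are hypothetical and are not part of Infer's API; the sketch assumes both formats store these fields big-endian.

```rust
/// The four bytes that follow the shared magic, interpreted both ways.
/// Hypothetical helper type for illustration only.
#[derive(Debug, PartialEq)]
struct AfterMagic {
    /// Java class file: bytes 4..6 are the minor version.
    java_minor: u16,
    /// Java class file: bytes 6..8 are the major version.
    java_major: u16,
    /// Mach-O "fat" file: bytes 4..8 are the number of architecture entries.
    macho_nfat_arch: u32,
}

/// Returns `None` unless `buf` holds at least 8 bytes and starts with 0xCAFEBABE.
fn read_after_magic(buf: &[u8]) -> Option<AfterMagic> {
    if buf.len() < 8 || !buf.starts_with(&[0xCA, 0xFE, 0xBA, 0xBE]) {
        return None;
    }
    Some(AfterMagic {
        java_minor: u16::from_be_bytes([buf[4], buf[5]]),
        java_major: u16::from_be_bytes([buf[6], buf[7]]),
        macho_nfat_arch: u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]),
    })
}
```

For a class file with minor version 0, the 32-bit reading equals the major version, so it can never be smaller than 45 for a Java 1.1 or later class file.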
Given that new widespread CPU architectures are few and far between nowadays, the comment from `libmagic` seems like a reasonable "hack" here to discriminate between the two formats that share `0xCAFEBABE`.

FWIW, it seemed like Infer was also lacking a Java class file test case, so I'd add one for extra confirmation in my PR. It looks like it already has some Mach-O samples for testing.
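A rough sketch of how such a discriminator could look, assuming the four bytes after the magic are compared against a cutoff. The `CafeBabe` enum, the `classify` function, and the cutoff value of 30 are all assumptions for illustration: 30 is simply a number between the 18 known Mach-O architectures and the lowest Java major version of 45, and it is not taken from Infer or from the PR.

```rust
/// Which of the two formats sharing 0xCAFEBABE a buffer appears to be.
/// Hypothetical type, not part of Infer's API.
#[derive(Debug, PartialEq)]
enum CafeBabe {
    JavaClass,
    MachOFat,
}

const MAGIC: [u8; 4] = [0xCA, 0xFE, 0xBA, 0xBE];

/// Assumed cutoff: above the 18 known Mach-O architectures, below Java major 45.
const FAT_ARCH_CUTOFF: u32 = 30;

fn classify(buf: &[u8]) -> Option<CafeBabe> {
    if buf.len() < 8 || !buf.starts_with(&MAGIC) {
        return None;
    }
    // Mach-O reads these bytes as an architecture count; Java reads them as
    // a 2-byte minor version followed by a 2-byte major version.
    let word = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
    match word {
        // A fat file with zero architectures is treated as no match here.
        0 => None,
        n if n < FAT_ARCH_CUTOFF => Some(CafeBabe::MachOFat),
        _ => Some(CafeBabe::JavaClass),
    }
}
```

The trade-off is that a Mach-O fat file with 30 or more architecture entries would be misread as a Java class file, which is why the cutoff only works as long as the architecture count stays small.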