I agree, and it is documented: this architecture is named #6502#cc65 instead of 6502. In cpu_rec.py you can read:
# 6502 binary compiled with https://github.com/cc65/cc65
# This appears to be more compiler-dependent than CPU-dependent, the
# statistics are very different from an AppleII ROM, for example.
and the paper published at SSTIC says that the 6502 code produced by https://github.com/cc65/cc65 is characteristic of the compiler more than of the CPU.
If you can provide a sufficiently large amount of 6502 code that would be characteristic of this CPU, I can add it to the corpus.
The code from the Atredis challenge could be a good starting point: according to http://www.msreverseengineering.com/blog/2018/7/24/the-atredis-blackhat-2018-ctf-challenge it has some characteristic sequences of instructions, e.g. LDA #0 followed by RTS.
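To get a rough idea of how often such a sequence shows up in a candidate file, one can simply count the byte pattern. The sketch below relies only on the 6502 opcode encoding (LDA immediate is 0xA9, RTS is 0x60); the function name `count_pattern` is made up for this illustration and is not part of cpu_rec.py.

```python
# "LDA #0" assembles to A9 00 and "RTS" to 60, so the pair is A9 00 60.
LDA_ZERO_RTS = bytes([0xA9, 0x00, 0x60])


def count_pattern(data: bytes, pattern: bytes = LDA_ZERO_RTS) -> int:
    """Count occurrences of pattern in data, overlapping ones included."""
    count = 0
    pos = data.find(pattern)
    while pos != -1:
        count += 1
        # Restart one byte further so that overlapping matches are counted too.
        pos = data.find(pattern, pos + 1)
    return count


if __name__ == "__main__":
    sample = bytes([0xA9, 0x00, 0x60, 0xEA, 0xA9, 0x00, 0x60])
    print(count_pattern(sample))
```

A raw byte match cannot tell code from data, so a high count is only a hint that the region may contain 6502 code.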
But if I add it (e.g. by copying https://raw.githubusercontent.com/RolfRolles/Atredis2018/master/MemoryDump/data-4000-efff.bin to cpu_rec_corpus/#6502#Atredis.corpus), it does not recognize osi_bas.bin as being 6502.
This is because the Atredis file contains too much non-code data: the text at its start and large chunks of zeroes.
Therefore you should extract from data-4000-efff.bin the chunks containing 6502 code. But the resulting 6502 corpus is small, and probably not sufficient to characterize this CPU.
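A first pass at that extraction could drop the blocks that are mostly zeroes. This is only a sketch: the block size, the threshold and the name `non_zero_blocks` are arbitrary choices for the illustration, and it does nothing about the text at the start of the file, which still has to be removed by hand.

```python
def non_zero_blocks(data: bytes, block_size: int = 0x100,
                    max_zero_ratio: float = 0.5):
    """Yield (offset, block) for blocks that are not dominated by zero bytes.

    block_size and max_zero_ratio are illustrative defaults, not tuned values.
    """
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # block is never empty here because offset < len(data).
        if block.count(0) / len(block) <= max_zero_ratio:
            yield offset, block


if __name__ == "__main__":
    # 256 zero bytes followed by 256 non-zero bytes.
    data = bytes(0x100) + bytes([0xA9]) * 0x100
    for offset, block in non_zero_blocks(data):
        print(hex(offset), len(block))
```

The kept blocks can then be concatenated and written to a corpus file, after checking by eye that they really are code.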
My criterion for adding a new architecture to the corpus is that I can learn the architecture from one file and then recognize it in another file from a completely different source. The issue with 6502 is that my biggest source (the Apple II ROMs) is not free and therefore cannot be included in the published corpus.
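To make the "learn on one file, recognize in another" criterion concrete, here is a toy version of such a test. It is not the code of cpu_rec.py: it assumes a smoothed byte-bigram model and picks the corpus with the lowest cross-entropy, and every name in it (`bigram_model`, `cross_entropy`, `closest_architecture`) is invented for the example.

```python
import math
from collections import Counter


def bigram_model(data: bytes, smoothing: float = 0.01):
    """Return a function giving the smoothed probability of a byte pair."""
    counts = Counter(zip(data, data[1:]))
    total = sum(counts.values()) + smoothing * 256 * 256
    # Counter returns 0 for pairs never seen in the training data.
    return lambda pair: (counts[pair] + smoothing) / total


def cross_entropy(sample: bytes, model) -> float:
    """Average number of bits needed to encode the sample's bigrams."""
    pairs = list(zip(sample, sample[1:]))
    if not pairs:
        raise ValueError("sample must contain at least two bytes")
    return -sum(math.log2(model(pair)) for pair in pairs) / len(pairs)


def closest_architecture(sample: bytes, corpus: dict) -> str:
    """Name of the corpus entry whose model fits the sample best."""
    models = {name: bigram_model(data) for name, data in corpus.items()}
    return min(models, key=lambda name: cross_entropy(sample, models[name]))


if __name__ == "__main__":
    corpus = {
        "6502": bytes([0xA9, 0x00, 0x60] * 50),
        "other": bytes([0x55, 0x48, 0x89, 0xE5] * 50),
    }
    print(closest_architecture(bytes([0xA9, 0x00, 0x60] * 5), corpus))
```

With this kind of test, the training file and the test file must come from different sources, otherwise the model only shows that it recognizes its own training data.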
It seems that https://raw.githubusercontent.com/RolfRolles/Atredis2018/master/MemoryDump/data-4000-efff.bin contains only slightly more than 1300 bytes of 6502 code (starting at position 0x4000 in this file, which is the memory address 0x8000).
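The offset arithmetic can be written down as follows. The load address 0x4000 is derived from the statement above (0x8000 minus 0x4000); the helper names are hypothetical, and the size passed to `extract_region` has to be chosen by whoever inspects the file.

```python
# File offset 0x4000 holds memory address 0x8000, so the dump is assumed
# to start at memory address 0x4000.
LOAD_ADDRESS = 0x4000


def offset_to_address(offset: int, load_address: int = LOAD_ADDRESS) -> int:
    """Memory address of the byte stored at the given file offset."""
    return load_address + offset


def extract_region(data: bytes, start: int, size: int) -> bytes:
    """Return size bytes of data starting at file offset start."""
    if start < 0 or size < 0 or start + size > len(data):
        raise ValueError("region lies outside the file")
    return data[start:start + size]


if __name__ == "__main__":
    print(hex(offset_to_address(0x4000)))
    print(extract_region(bytes(range(16)), 4, 3).hex())
```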
Nevertheless, I have added this data to the default corpus, under the name 6502, because it is sufficient to recognise osi_bas.bin and APPLE.ROM as being 6502.
The result is:
Target File:   corpus/6502/data-4000-efff.bin
MD5 Checksum:  827998bbc4a941b52b8e19b1f2724bd7
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             None (size=0x4000, entropy=0.043476)
16384         0x4000          6502 (size=0x400, entropy=0.691858)
17408         0x4400          None (size=0x6c00, entropy=0.018228)

Target File:   corpus/6502/osi_bas/osi_bas.bin
MD5 Checksum:  b331075b878624bfa65757677f01ea87
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             6502 (size=0x1c00, entropy=0.877842)
7168          0x1C00          None (size=0x2400, entropy=0.131216)

Target File:   corpus/6502/APPLE.ROM
MD5 Checksum:  58ddc617555e2fc242b20e7f86165ab2
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             None (size=0x800, entropy=0.801389)
2048          0x800           6502 (size=0x3600, entropy=0.902971)
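The entropy column helps to spot the regions that are mostly zeroes or text. The output does not say how the value is computed; the sketch below assumes it is the Shannon entropy of the byte distribution divided by 8 bits, which gives a number between 0 and 1. The name `normalized_entropy` is made up for this example.

```python
import math
from collections import Counter


def normalized_entropy(data: bytes) -> float:
    """Shannon entropy of the byte values, scaled to the range 0..1.

    Assumption: 8 bits per byte is the maximum, hence the division by 8.
    """
    if not data:
        return 0.0
    total = len(data)
    entropy_bits = -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )
    return entropy_bits / 8


if __name__ == "__main__":
    print(normalized_entropy(bytes(1024)))            # constant data
    print(normalized_entropy(bytes(range(256)) * 4))  # uniform byte values
```

Under that assumption, a region full of zeroes scores close to 0, which matches the very low values reported for the None regions of data-4000-efff.bin.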
More 6502 samples would be useful, but at least with what you have provided, some 6502 recognition is now available.
https://www.von-bassewitz.de/cgi-bin/ftp-portal.pl?url=ftp://ftp.musoftware.de/pub/uz/cbm610/kernal610-orig.zip could be used as a sample. It's the original code running on a CBM610.
Indeed, with the new data coming from the Atredis challenge, it is recognised as 6502.
Target File:   corpus/6502/kernal610-orig/kernal.bin
MD5 Checksum:  5d6f6428ff1c2a58225a04092621c7b6
DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             6502 (size=0xa00, entropy=0.861012)
2560          0xA00           None (size=0x400, entropy=0.704522)
3584          0xE00           6502 (size=0xc00, entropy=0.845523)
6656          0x1A00          None (size=0x600, entropy=0.808449)
Despite its small size, the Atredis code works surprisingly well to recognise 6502. My tests did not find any non-6502 code that is recognised as 6502, but with such a small corpus the risk of false positives remains.
It fails to recognize the following files as 6502 code:
osi_bas.bin from http://searle.hostei.com/grant/6502/osi_bas.zip