The current, simple YAFFS signature has several flaws, as well as a regression: since https://github.com/ReFirmLabs/binwalk/commit/46d8a3231d47002cc6d02837d207110858b70cd3 it can no longer detect yaffs2 images (read: images created with the mkyaffs2image tool).
As a result, the current signature failed to find valid YAFFS images during tests, both with generated test images and in a real-world scientific use case examining an Android device NAND dump.
Unfortunately, the YAFFS on-disk format is poorly documented and is mainly defined by the memory layout of the reference implementation, found here: http://www.aleph1.co.uk/gitweb/?p=yaffs2.git;a=blob;f=yaffs_guts.h;h=74ded0be526f1f44c91ce90a6d54cc52bb338cf0;hb=HEAD#l329
I propose the signatures in the attached commit, which recognize the start of an object header as defined by yaffs_obj_hdr, with the values encoded according to platform endianness:
    u32 type; /* enum yaffs_obj_type, valid 1-5 */
    u32 parent_obj_id; /* 1 for the root objects we recognize */
    u16 sum_no_longer_used; /* checksum of name. Not used by YAFFS and memset to 0xFF */
    YCHAR name[YAFFS_MAX_NAME_LENGTH + 1];
Notes:
- mkyaffsimage always writes a root directory with an empty name first, then processes the contents of the target directory.
- mkyaffs2image proceeds directly to writing entries with the appropriate u32 YAFFS_OBJECT_TYPE (1-5 valid), each with its parent id.
From a test set of 9 images generated with different contents and versions of the reference implementation, the old signature recognized 5, while the improved signature recognized all 9 and displayed additional data where appropriate (root file name). Attached for reference are the test images, as well as the old and new logs generated by running binwalk directly on these files.
The remaining parameters (NAND layout, ECC, etc.) do not seem to affect the object header examined here. Correct execution could also be verified with the device dump in question.
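For readers who want to try the idea outside binwalk, here is a minimal Python sketch of the check on the header fields listed above. It is not the signature syntax used in the commit, only an illustration of the same test in both byte orders. The function name match_obj_hdr is hypothetical, and the sketch assumes that YCHAR is an 8-bit character, that the name is NUL-terminated, and that YAFFS_MAX_NAME_LENGTH + 1 is 256.

```python
import struct

# Leading fields of yaffs_obj_hdr as listed above:
# u32 type, u32 parent_obj_id, u16 sum_no_longer_used.
# The "<" / ">" prefixes select byte order and disable padding.
PREFIX_FORMATS = {"little": "<IIH", "big": ">IIH"}
PREFIX_SIZE = struct.calcsize("<IIH")  # 10 bytes

# Assumption: YAFFS_MAX_NAME_LENGTH + 1 == 256 and YCHAR is one byte wide.
NAME_FIELD_SIZE = 256


def match_obj_hdr(buf, offset=0):
    """Return (endianness, obj_type, name) if buf[offset:] starts like an
    object header, otherwise None. Hypothetical helper for illustration."""
    if len(buf) - offset < PREFIX_SIZE:
        return None
    for endianness, fmt in PREFIX_FORMATS.items():
        obj_type, parent_id, name_sum = struct.unpack_from(fmt, buf, offset)
        # type must be 1-5, parent id 1, and the unused checksum 0xFFFF.
        if 1 <= obj_type <= 5 and parent_id == 1 and name_sum == 0xFFFF:
            start = offset + PREFIX_SIZE
            raw = buf[start:start + NAME_FIELD_SIZE]
            # Assumption: the name ends at the first NUL byte.
            name = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
            return endianness, obj_type, name
    return None


# Synthetic headers built for this example only, not taken from the test images.
le_entry = struct.pack("<IIH", 1, 1, 0xFFFF) + b"init.rc\x00"
be_empty = struct.pack(">IIH", 3, 1, 0xFFFF) + b"\x00"

print(match_obj_hdr(le_entry))  # ('little', 1, 'init.rc')
print(match_obj_hdr(be_empty))  # ('big', 3, '')
```

Because a type value of 1-5 occupies opposite ends of the 32-bit field in the two byte orders, a given offset can match at most one endianness. A check this short can still produce false positives on arbitrary data, so the result should be treated as a candidate rather than proof of a YAFFS image.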
binwalk_old.log binwalk_new.log testimages.tar.gz