Closed PeterPawn closed 3 years ago
> I'm a bit unsure, whether this is the correct repository for this issue
This is indeed the correct repository: FACT uses the "dockerized" version of the extractor (this was done to make the installation process quicker and easier).
> I'm not providing a patch for your file(s), because the if-then construct in avm_kernel_image.py looks odd to me, too. I can't see, why you want to unpack the contained kernel only, if the present file does not contain a SquashFS image - this makes only sense, if you're calling this function recursively and the splitted kernel image from the first call is unpacked with a second call later.
I did not write this and needed to work through the code myself, but it indeed seems to be intended to work recursively (the MIME signature matches the kernel part again which is then unpacked in the next round of recursive unpacking). I agree that this is a bit obscure and could be done more clearly.
I could confirm that the -scan parameter works with some of the images that the extractor cannot unpack at the moment. I will try to integrate it into the extractor in a sensible way.
Thank you for your input!
I was able to unpack the images of the 4020 and 7390 successfully in FACT with the changes in #66. Changes to avm_kernel_image.py don't seem to be necessary, because the "SquashFS part" also gets unpacked with squash_fs.py after being split up. (Mind that FACT uses a stable version of the extractor by default and this change only takes effect when a new stable version of the extractor is released, but we wanted to do that soon anyway.)
I'm a bit unsure whether this is the correct repository for this issue - while the description states:
it seems at the same time to be the only repository that contains code dedicated to unpacking the various firmware formats.
Nonetheless, I'll try to show/explain here why your attempts to unpack/analyze the firmware for AVM's model 4020 failed.
If a device model by AVM uses a "combined image" for its firmware, it consists of a kernel image, immediately followed by a filesystem image in SquashFS format. For version 4 of SquashFS, AVM has changed the official format (which only uses "little endian" byte order) into its own variant, where some data is still stored in "big endian" byte order if the platform uses BE storage order.
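To illustrate the byte-order point, here is a minimal sketch that looks for a SquashFS superblock magic inside a combined image and reports which byte order it was stored in. The function name `find_squashfs_magic` is made up for this example, and the sketch only checks the four magic bytes; a real tool would validate more of the superblock, because those four bytes may also occur by chance inside the kernel part.

```python
from typing import Optional, Tuple

# SquashFS magic 0x73717368 as it appears on disk for each byte order.
MAGIC_LE = b"hsqs"  # little endian, the only order in official SquashFS 4
MAGIC_BE = b"sqsh"  # big endian


def find_squashfs_magic(data: bytes) -> Optional[Tuple[int, str]]:
    """Return (offset, byte order) of the first magic found, or None.

    Hypothetical helper: it does not validate the rest of the superblock.
    """
    hits = []
    for magic, order in ((MAGIC_LE, "little"), (MAGIC_BE, "big")):
        pos = data.find(magic)
        if pos != -1:
            hits.append((pos, order))
    # The smallest offset wins if both magics occur somewhere in the data.
    return min(hits) if hits else None


if __name__ == "__main__":
    # Synthetic stand-in for "kernel part followed by filesystem part".
    image = b"\x00" * 64 + MAGIC_BE + b"\x00" * 28
    print(find_squashfs_magic(image))
```

The result only tells you where a candidate superblock starts and which family of unpacking tools is worth trying first.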
If the platform of the device is a MIPS processor, the SquashFS image isn't stored as one continuous data stream - it contains a gap at an offset that will be mapped to physical memory address 0xC00000, where the NMI vectors will be looked up in a "very basic state" of processor initialization. The size of this gap varies with the processor and its architecture. If the loader size in flash memory is 0x20000 (and the kernel/filesystem partition starts after the loader partition), this gap will be found in the (single) file kernel.image at offset 0x00BE0000. (EDIT: The load address doesn't really matter, see my earlier post here: https://www.ip-phone-forum.de/threads/%C3%9Cbersicht-von-fritz-boxen-mit-junk-bytes-im-squashfs-image.286318/)

According to your file avm_kernel_image.py (https://github.com/fkie-cad/fact_extractor/blob/master/fact_extractor/plugins/unpacking/avm_kernel_image/code/avm_kernel_image.py#L26) you're trying to split these images into the kernel part and take the whole rest as the filesystem image (that's how find-squashfs works). If the filesystem part doesn't contain the NMI vector gap, everything works as expected - but if the SquashFS image contains this gap, the (later) extraction process for the SquashFS image will fail.

There are two options to handle this case correctly ... either you remove the NMI vector gap from the extracted filesystem image (see this shell script from the Freetz project: https://github.com/Freetz/freetz/blob/master/tools/remove-nmi-vector) or you use an extension to the unsquashfs binary (you're using the proper sources already and the files copied from your Freetz container during installation support these extensions) and unpack the SquashFS data directly from the kernel.image file:

Because the new option -scan does not affect any "pure" SquashFS image, it doesn't matter whether it's always used to search for the SquashFS superblock - you may unpack "plain" SquashFS images, too, while using this option.

Using the option -scan, the superblock offset is determined first and then the existence of the NMI vector gap is checked. If the NMI vector gap is present, it will be skipped while reading/unpacking files from this image. But this is only checked/done if the new option was specified while calling the tool.

Even if the Freetz implementation (https://github.com/Freetz/freetz/commit/ba45d885189f3284c0eeb2ff13215bfbda2650c2) differs slightly from my own (https://github.com/PeterPawn/YourFreetz/commit/9f89c498caef84fa8cc7c64730c670a71637b43f), both produce the same result - and you should now be able to unpack the 4020 firmware, too.
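The offset arithmetic and the first option (cutting the gap out of the data) can be sketched roughly as follows. This is not the Freetz script and not what unsquashfs does internally; `gap_file_offset` and `remove_gap` are hypothetical helpers, and the gap size is left as a parameter because, as said above, it varies with the processor.

```python
def gap_file_offset(flash_address: int, loader_size: int) -> int:
    """Offset of the gap inside kernel.image, assuming the image starts
    directly after the loader partition (assumption taken from the post)."""
    return flash_address - loader_size


def remove_gap(image: bytes, offset: int, size: int) -> bytes:
    """Return a copy of image without the bytes in [offset, offset + size).

    offset is relative to the start of the buffer passed in, so it has to be
    adjusted if only the filesystem part is passed instead of the whole file.
    """
    if offset < 0 or size < 0 or offset + size > len(image):
        raise ValueError("gap does not fit into the image")
    return image[:offset] + image[offset + size:]


if __name__ == "__main__":
    # 0xC00000 minus a loader of 0x20000 gives the offset named in the post.
    print(hex(gap_file_offset(0xC00000, 0x20000)))
    # Tiny synthetic buffer: two "gap" bytes between two data runs.
    print(remove_gap(b"AAAA" + b"\x00\x00" + b"BBBB", 4, 2))
```

Whether a gap is present at all still has to be detected first; the sketch only covers the arithmetic and the cut.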
And by the way ... this is the same procedure for all FRITZ!Box models that use a MIPS architecture and this "single image format" for their software - usually these devices have NOR or SPI flash only, because with NAND flash the firmware structure is a different one.
The 4020 was the only model with this structure in your portfolio - otherwise you would have problems unpacking the firmware for other AVM models, too. Try the 7390 firmware (it's still using SquashFS3 format) or 7360v2 (this is a SquashFS4 image with BE byte order) as other examples of these MIPS-devices with NMI vector gap ... if you want to enhance/verify/test your unpacker.
I'm not providing a patch for your file(s), because the if-then construct in avm_kernel_image.py looks odd to me, too. I can't see why you want to unpack the contained kernel only if the present file does not contain a SquashFS image - this only makes sense if you're calling this function recursively and the split kernel image from the first call is unpacked with a second call later. This makes the logic a bit obscure to me - so I'd better let you decide which changes are needed.

And looking into squash_fs.py (https://github.com/fkie-cad/fact_extractor/blob/master/fact_extractor/plugins/unpacking/squashFS/code/squash_fs.py), the needed changes seem to be more extensive ... currently there aren't different command line options (per tool) while "probing" the right tool to unpack data.

As long as the oldest SquashFS image to process uses SquashFS3 format (and not an earlier one), the tools for SquashFS4 format will be able to unpack this, too, and another try with unsquashfs3-multi should not change the results anymore, if the v4 tools were unable to unpack a file.
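To make the "different command line options (per tool)" idea more concrete, here is a rough sketch of a probing loop where each candidate tool carries its own extra options. It is not the code from squash_fs.py: the candidate list, `build_command`, `run_tool` and `probe` are hypothetical, the binary names must match whatever is actually installed, and I'm assuming the tools accept -d for the target directory.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple

Candidate = Tuple[str, Tuple[str, ...]]

# (tool, extra options) pairs, tried in order. Assumed names and options.
CANDIDATES: List[Candidate] = [
    ("unsquashfs", ("-scan",)),  # v4 tool with the extension described above
    ("unsquashfs3-multi", ()),   # fallback without extra options
]


def build_command(tool: str, options: Sequence[str],
                  image: str, target_dir: str) -> List[str]:
    """Assemble the argument list for one candidate tool."""
    return [tool, *options, "-d", target_dir, image]


def run_tool(command: Sequence[str]) -> int:
    """Run one command and return its exit code (127 if the tool is missing)."""
    try:
        return subprocess.run(list(command), capture_output=True).returncode
    except FileNotFoundError:
        return 127


def probe(image: str, target_dir: str,
          candidates: Sequence[Candidate] = CANDIDATES,
          runner: Callable[[Sequence[str]], int] = run_tool) -> Optional[str]:
    """Return the name of the first tool that exits with 0, else None."""
    for tool, options in candidates:
        if runner(build_command(tool, options, image, target_dir)) == 0:
            return tool
    return None
```

Keeping the options next to the tool name means a flag like -scan can be added for one unpacker without affecting the others, and the injectable runner makes the loop testable without the binaries being present.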