fkie-cad / fact_extractor

Standalone Utility for FACT-like extraction
GNU General Public License v3.0

Support more HP printer updates #37

Open dorpvom opened 4 years ago

dorpvom commented 4 years ago

There are some HP printers that use PJL for updates but function differently from the supported ones. One example is [1]. The compression seems to differ from the truncated LZMA streams that are already supported. See the discussion in #7 for more information.

[1] ftp://ftp.hp.com/pub/softlib/software13/printers/ojp6970/1910a/OJP6970_1910A.exe

pabx06 commented 4 years ago

I was able to dump some data from the printer, but was not able to figure out the encoding. The online payloads seem to have a section where a prefix replaces 0x000000. Then a block of 4 to 5 bytes is inserted every few bytes. It looks like some kind of compression such as LZSS/LZ77.

This is the block 0 out of 1024 blocks:

Block 0 = block 1 = block 2 = block 3: for safety reasons the first block is duplicated. The other blocks include assets like images (PNG/BMP/GIF/JPEG), HTML, JSON, XML, i18n text, TLS root certificates...

Ghidra seems able to handle it: loading it as ARM v8 big endian at 0x0 gives a reset vector which looks fine. I didn't find a filesystem nor any clue about the block layout, except some corrupted FAT12/FAT16 partitions which looked like the remains of an older firmware...

The QuickBMS project seems to be able to handle a lot of proprietary formats. However, I don't have the know-how nor the experience...

Extracting the binaries from the online payload seems more reliable than hardware extraction, which could also pull in deleted artefacts.

There is no datasheet for this SoC, which has custom markings, nor any idea of how the memory is mapped. Could someone give a hint?

weidenba commented 4 years ago

You did a good job so far. I would like to give you some hints that might help you.

pabx06 commented 4 years ago

Thanks to the pandemic, I had some spare time. I was able to decode the payload and reverse the two-stage flashing process, the boot loading stages, and the flash => RAM decompression/loading process. I located the entry point and hit decompile on it. 30 minutes later Ghidra has not finished yet: there are about 8k strings in this firmware and Ghidra is still processing.

I can see FileX, ThreadX and some VxWorks strings so far...

I think it is a pretty fat firmware...

dorpvom commented 3 years ago

Hi, if the payload decoding process could be automated it would be cool if we'd integrate it into our extractor. Can you tell if there is some kind of file magic or other distinct pattern in the binary that can identify it as hp firmware?

pabx06 commented 3 years ago

I'm working on a tool to automate this; it is written in Java.

Yup, there are three magics that are searched for to locate the table that defines segment loading/decoding. However, PCL (Printer Command Language) parsing is needed first, followed by raster language decoding and decompression.

I was able to code something for PCL, but it is still hackish and uses hard-coded offsets, so it might not work for all payload versions. I have also made a Ghidra script to set up the memory segments.
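As a rough illustration of the PCL step, here is a minimal Python sketch that pulls out raster payloads. It assumes the update uses the standard PCL "transfer raster data" command (`ESC * b <count> W`, followed by `<count>` raw bytes), which matches the "*bXW" data mentioned in this thread but is not confirmed for every payload. The function name `extract_raster_rows` is made up for this example; combined escape sequences and the compression step are not handled.

```python
import re

# ESC * b <count> W : standard PCL "transfer raster data" command.
# Assumption: the firmware payload carries its data in this form.
RASTER_CMD = re.compile(rb"\x1b\*b(\d+)W")


def extract_raster_rows(data: bytes) -> list[bytes]:
    """Collect the raw byte blocks that follow each ESC*b<count>W command."""
    rows = []
    pos = 0
    while True:
        match = RASTER_CMD.search(data, pos)
        if match is None:
            break
        count = int(match.group(1))
        start = match.end()
        end = start + count
        if end > len(data):
            break  # truncated input, stop instead of returning a short row
        rows.append(data[start:end])
        # Continue after the payload so bytes inside it are never parsed
        # as commands.
        pos = end
    return rows
```

The rows still have to be decompressed afterwards; this only covers the container level.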

I will publish the Java source in a GitHub repo when I finish, hopefully soon.

Just wonder if HP would unlish h That payload is the same for the entire HP printer product line: ENVY, Photosmart and DeskJet. It also has some very long 2048-bit prime numbers; I wonder if they are used for payload authentication, ink subscription, cloud storage or the web server...

It even has URLs and credentials for their internal LAN SCM repo... and a lot of other awful stuff.

magics in the RAM:


           47ffc004 3c a5 5a 3c     uint      3CA55A3Ch               magic_1
           47ffc008 00 00 00 6c     uint      6Ch                     size          Total size of the 
           47ffc00c 04 67 04 09     uint      4670409h                magic_2
           47ffc010 fb 98 fb f6     uint      FB98FBF6h               magic_3
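A minimal Python sketch of scanning a RAM dump for this table could look like the following. It assumes the four words sit back to back as in the listing and are stored big endian (the image was loaded as big-endian ARM in Ghidra). The name `find_segment_tables` is hypothetical, and the `size` field is returned raw because its exact meaning is cut off in the listing.

```python
import struct

# Values taken from the listing above.
MAGIC_1 = 0x3CA55A3C
MAGIC_2 = 0x04670409
MAGIC_3 = 0xFB98FBF6


def find_segment_tables(dump: bytes) -> list[tuple[int, int]]:
    """Return (offset, size) for each magic_1 / size / magic_2 / magic_3 hit."""
    needle = struct.pack(">I", MAGIC_1)  # assumption: big-endian words
    hits = []
    pos = dump.find(needle)
    while pos != -1:
        if pos + 16 <= len(dump):
            _, size, m2, m3 = struct.unpack_from(">IIII", dump, pos)
            if m2 == MAGIC_2 and m3 == MAGIC_3:
                hits.append((pos, size))
        pos = dump.find(needle, pos + 1)
    return hits
```

Note that in the listed values magic_3 is the bitwise complement of magic_2, which could serve as an extra sanity check if other firmware versions use different constants.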

magic_2 & magic_3 are not compressed on my version of the update payload :

image

It is more reliable to scan for the following, though, since other payloads use this header:

1B 25 2D 31 32 33 34 35 58 40 50 4A 4C 20 43 4F
4D 4D 45 4E 54 20 28 6E 75 6C 6C 29 0A 40 50 4A
4C 20 45 4E 54 45 52 20 4C 41 4E 47 55 41 47 45
3D 46 57 55 50 44 41 54 45 0A 1B 45 54 68 69 73
20 64 65 76 69 63 65 20 64 6F 65 73 20 6E 6F 74
20 73 75 70 70 6F 72 74 20 46 57 55 50 44 41 54
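For identification in an extractor, a check for this header could be sketched in Python as below. The bytes are copied from the dump above; the names `PJL_FWUPDATE_HEADER` and `find_fwupdate_header` are made up for this example. Searching anywhere in the file, rather than only at offset 0, is an assumption made because the update may be wrapped in another container such as the .exe.

```python
# Header bytes copied from the hex dump above. They decode to:
#   ESC%-12345X@PJL COMMENT (null)\n
#   @PJL ENTER LANGUAGE=FWUPDATE\n
#   ESC E "This device does not support FWUPDAT"
PJL_FWUPDATE_HEADER = bytes.fromhex(
    "1B 25 2D 31 32 33 34 35 58 40 50 4A 4C 20 43 4F"
    "4D 4D 45 4E 54 20 28 6E 75 6C 6C 29 0A 40 50 4A"
    "4C 20 45 4E 54 45 52 20 4C 41 4E 47 55 41 47 45"
    "3D 46 57 55 50 44 41 54 45 0A 1B 45 54 68 69 73"
    "20 64 65 76 69 63 65 20 64 6F 65 73 20 6E 6F 74"
    "20 73 75 70 70 6F 72 74 20 46 57 55 50 44 41 54"
)


def find_fwupdate_header(data: bytes) -> int:
    """Return the offset of the PJL FWUPDATE header, or -1 if absent."""
    return data.find(PJL_FWUPDATE_HEADER)
```

A shorter marker such as `@PJL ENTER LANGUAGE=FWUPDATE` might match more payload versions, at the cost of more false positives.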


nhvswxh commented 1 year ago

Can this firmware be decrypted? I have tried some methods, but the first step, extracting the * bXW data, failed. Can you provide some help?