ZhangJun2017 / QQChatHistoryExporter

Export mobile QQ chat history as web pages
MIT License
35 stars 4 forks

Decoding msgtype -2011 (share messages) #4

Open lqzhgood opened 2 years ago

lqzhgood commented 2 years ago

I am working on decoding msgtype -2011 share messages: https://github.com/ZhangJun2017/QQChatHistoryExporter/blob/f97eb64581229a30514d55aa0a8423b138b09437/src/RawMessage.java#L69

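After decoding with decryptProtobuf, the binary data starts with ac ed 00 05. A search showed that this is the format produced by Java Serializable serialization. (I spent two days trying to brute-force the raw binary before I learned it was Java's serialization format.)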
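
As a small illustration of that check, here is a minimal Java sketch that tests whether a decoded payload starts with the ac ed 00 05 header. The class and method names (`JavaSerialMagic`, `looksLikeJavaSerialized`) are made up for this example and are not part of QQChatHistoryExporter.

```java
final class JavaSerialMagic {
    // STREAM_MAGIC (0xACED) followed by STREAM_VERSION (0x0005).
    private static final byte[] HEADER = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};

    /** Returns true if the data begins with the Java serialization stream header. */
    static boolean looksLikeJavaSerialized(byte[] data) {
        if (data == null || data.length < HEADER.length) {
            return false;
        }
        for (int i = 0; i < HEADER.length; i++) {
            if (data[i] != HEADER[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // First five bytes of the sample in this issue: header plus the 0x7a tag.
        byte[] sample = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05, 0x7A};
        System.out.println(looksLikeJavaSerialized(sample));
    }
}
```

A check like this could run right after the protobuf decoding step, so that payloads in some other format are not handed to a Java deserializer.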

It can be decoded with https://github.com/NickstaDB/SerializationDumper.

The SerializationDumper output shows that the data structure is very simple, probably just something like { key: BigString }. I don't know how to write the Java code that would deserialize it.

java -jar SerializationDumper-v1.13.jar ACED00057A000004000000000100000014000000010003776562000000000000006B68747470733A2F2F6D2E776569626F2E636E2F7374617475732F343232353439323339333337383532343F736F75726365547970653D71712666726F6D3D3130383431393530313026776D3D31343031305F303031332666656174757265636F64653D6E65777469746C6501795BE58886E4BAAB5D20E98791E6AF9BE78B97E5AD90E59392E6AF9BE6AF9BE7BABAE68890E7BABFEFBC8CE5868DE7BB87E68890E6898BE5A5975BE4BA8CE593885DE7ACACE4B880E6ACA1E7BABAE6B1AAE6989FE4BABAE79A84E6AF9BE6AF9BEFBC8CE6AF94E683B3E8B1A1E4B8ADE79A84E8A681E69F94E8BDAFE5A5BDE7BABA5BE686A7E686AC5DE4B8BBE4BABAE5AF84E69DA5E59392E6AF9BE6AF9BE69C89E6B585E889B2E5928CE6B7B1E889B2E4B8A4E983A8E58886EFBC8CE899BDE784B6E6B2A1E585BBE8BF87E78B97E5AD90EFBC8CE4B88DE6B885E6A59AE4B88DE5908CE9A29CE889B2E79A84E6AF9BE6AF9BE698AFE587BAE887AAE593AAE4B8AAE983A8E4BD8DEFBC8CE4BD86E6B7B1E889B2E983A8E58886E79A84E8A681E6AF94E6B585E889B2E79A84E5A5BDE7BABAE5BE88E5A49AEFBC8CE7BB99E683B3E887AAE5B7B1E7BABAE4B8BBE5AD90E6AF9BE79A84E993B2E5B18EE5AE98E4BBACE58F82E88083E4B880E4B88BE3808220E2808BE2808BE2808B0000000100046974656D0000000200000009000000000000000000000000000000000000000000000000000000030007706963747572650000000A0015687474703A2F2F75726C2E636E2F356242316F3975000000000000000000000000000000000000000000057469746C65000000090000000000000053E98791E6AF9BE78B97E5AD90E59392E6AF9BE6AF9BE7BABAE68890E7BABFEFBC8CE5868DE7BB87E68890E6898BE5A5975BE4BA8CE593885DE7ACACE4B880E6ACA1E7BABAE6B1AAE6989FE4BABAE79A842E2E2E00000000000773756D6D61727900000009000000000000001CE69DA5E887AA205A6565656565656565656E20E79A84E5BEAEE58D9A00000000000000000000000000000000000006011F870015687474703A2F2F75726C2E636E2F35774E306B4F62000CE696B0E6B5AAE5BEAEE58D9A0015687474703A2F2F75726C2E636E2F3531537076746900036170700000000E636F6D2E73696E612E776569626F001374656E63656E743130303733363930333A2F2F00000000000000000000000000000000000000000000006B68747470733A2F2F6D2E776569626F2E636E2F7374617475732F343232353439323339333337383532343F736F7572636554797065
3D71712666726F6D3D3130383431393530313026776D3D31343031305F303031332666656174757265636F64653D6E65777469746C77956500000000000000000000000000000000FFFFFFFF0000FFFFFFFF0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000

STREAM_MAGIC - 0xac ed
STREAM_VERSION - 0x00 05
Contents
  TC_BLOCKDATALONG - 0x7a
    Length - 1024 - 0x00 00 04 00
    Contents - 0x0000000100000014000000010003776562000000000000006b68747470733a2f2f6d2e776569626f2e636e2f7374617475732f343232353439323339333337383532343f736f75726365547970653d71712666726f6d3d3130383431393530313026776d3d31343031305f303031332666656174757265636f64653d6e65777469746c6501795be58886e4baab5d20e98791e6af9be78b97e5ad90e59392e6af9be6af9be7babae68890e7babfefbc8ce5868de7bb87e68890e6898be5a5975be4ba8ce593885de7acace4b880e6aca1e7babae6b1aae6989fe4babae79a84e6af9be6af9befbc8ce6af94e683b3e8b1a1e4b8ade79a84e8a681e69f94e8bdafe5a5bde7baba5be686a7e686ac5de4b8bbe4babae5af84e69da5e59392e6af9be6af9be69c89e6b585e889b2e5928ce6b7b1e889b2e4b8a4e983a8e58886efbc8ce899bde784b6e6b2a1e585bbe8bf87e78b97e5ad90efbc8ce4b88de6b885e6a59ae4b88de5908ce9a29ce889b2e79a84e6af9be6af9be698afe587bae887aae593aae4b8aae983a8e4bd8defbc8ce4bd86e6b7b1e889b2e983a8e58886e79a84e8a681e6af94e6b585e889b2e79a84e5a5bde7babae5be88e5a49aefbc8ce7bb99e683b3e887aae5b7b1e7babae4b8bbe5ad90e6af9be79a84e993b2e5b18ee5ae98e4bbace58f82e88083e4b880e4b88be3808220e2808be2808be2808b0000000100046974656d0000000200000009000000000000000000000000000000000000000000000000000000030007706963747572650000000a0015687474703a2f2f75726c2e636e2f356242316f3975000000000000000000000000000000000000000000057469746c65000000090000000000000053e98791e6af9be78b97e5ad90e59392e6af9be6af9be7babae68890e7babfefbc8ce5868de7bb87e68890e6898be5a5975be4ba8ce593885de7acace4b880e6aca1e7babae6b1aae6989fe4babae79a842e2e2e00000000000773756d6d61727900000009000000000000001ce69da5e887aa205a6565656565656565656e20e79a84e5beaee58d9a00000000000000000000000000000000000006011f870015687474703a2f2f75726c2e636e2f35774e306b4f62000ce696b0e6b5aae5beaee58d9a0015687474703a2f2f75726c2e636e2f3531537076746900036170700000000e636f6d2e73696e612e776569626f001374656e63656e743130303733363930333a2f2f00000000000000000000000000000000000000000000006b68747470733a2f2f6d2e776569626f2e636e2f7374617475732f343232353439323339333337383532343f736f75726365547970653d71712666726f6d3d31303834313935303130267
76d3d31343031305f303031332666656174757265636f64653d6e65777469746c
  TC_BLOCKDATA - 0x77
    Length - 149 - 0x95
    Contents - 0x6500000000000000000000000000000000ffffffff0000ffffffff0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000
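
The two record types in this dump can be unwrapped without any class definition. Below is a minimal Java sketch, assuming the stream contains nothing but TC_BLOCKDATA (0x77, 1-byte length) and TC_BLOCKDATALONG (0x7a, 4-byte length) records after the header, as in the sample above. `BlockDataFlattener` and `flatten` are illustrative names, not part of SerializationDumper or this repository.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

final class BlockDataFlattener {
    private static final int TC_BLOCKDATA = 0x77;     // followed by a 1-byte unsigned length
    private static final int TC_BLOCKDATALONG = 0x7A; // followed by a 4-byte big-endian length

    /** Concatenates the payloads of all block data records that follow the stream header. */
    static byte[] flatten(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream); // ByteBuffer is big-endian by default
        if (buf.remaining() < 4
                || (buf.getShort() & 0xFFFF) != 0xACED
                || buf.getShort() != 0x0005) {
            throw new IllegalArgumentException("not a Java serialization stream");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (buf.hasRemaining()) {
            int tag = buf.get() & 0xFF;
            int len;
            if (tag == TC_BLOCKDATA) {
                len = buf.get() & 0xFF;
            } else if (tag == TC_BLOCKDATALONG) {
                len = buf.getInt();
            } else {
                // Objects, class descriptors, etc. are out of scope for this sketch.
                throw new IllegalArgumentException("unsupported tag 0x" + Integer.toHexString(tag));
            }
            if (len < 0 || len > buf.remaining()) {
                throw new IllegalArgumentException("bad block length " + len);
            }
            out.write(stream, buf.position(), len); // copy the payload
            buf.position(buf.position() + len);     // skip past it
        }
        return out.toByteArray();
    }
}
```

For the sample above this would yield 1024 + 149 bytes of raw field data. How those bytes split into fields still has to be worked out separately.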
ZhangJun2017 commented 2 years ago

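I have never worked with serialization at all... If you want to decode it with Java code, maybe try this? If it can be decoded, sticker-pack emoticons (msgtype -2007) would be able to show their names correctly.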

lqzhgood commented 2 years ago

Everyone has actually dealt with serialization; it's just that we normally use JSON, whereas this is Java's own serialization format.

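The serialization here goes through the Serializable interface (java.io.Serializable), which is specific to Java. Compared with JSON, it can serialize not only Boolean, Number, String, Object, Array and so on, but also instances of arbitrary classes together with their class descriptions.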

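The serialization process is basically as follows, and it is quite similar to decoding proto: instance -> Class (to obtain the types) -> binary, where the Class plays the role of the .proto file.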

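SerializationDumper can help you infer the structure of the Class, but it cannot give you the instance directly. So the only way is to rewrite the Class from the inferred structure and then use Java code together with that Class to deserialize the instance.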
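
To make the instance -> Class -> binary flow concrete, here is a small Java round trip. `ShareCard` and its fields are hypothetical, chosen only for illustration; they are not the class QQ actually uses, and the real stream may not contain a serialized object of this shape at all.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializableRoundTrip {
    /** Hypothetical class; its field layout acts like the ".proto" for the bytes. */
    static final class ShareCard implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final String url;

        ShareCard(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    /** instance -> binary: the class description is written into the stream too. */
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** binary -> instance: only works if a matching class is on the classpath. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class missing on the reading side", e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new ShareCard("example title", "https://example.com"));
        ShareCard back = (ShareCard) deserialize(bytes);
        System.out.println(back.title + " " + back.url);
    }
}
```

The `ClassNotFoundException` branch is the reason the Class has to be reconstructed first: `readObject` needs a class that matches the description stored in the stream.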

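This is what I have found over the past few days. Since I don't know Java, I can only leave it to you to look into when you have time. I will see whether I can recover the data by working on the binary directly, but a general solution at the binary level would basically amount to reimplementing Java's serialization, which is too hard. I expect I can only derive a special-case solution from my few hundred samples.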

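The sample above was decoded with the SerializationDumper library you mentioned. The structure looks very simple, apparently just one big array (BLOCKDATA) or something similar, so the Class structure should not be very complex. Compared with solving it at the binary level, this should be much easier.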
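
If the stream really holds only block data, as the dump suggests, then java.io.ObjectInputStream can read the primitive values directly, with no custom class involved. A sketch under that assumption: the first bytes of the sample look like three 4-byte integers (1, 20, 1) followed by a length-prefixed string "web", but what these fields mean is unknown. `sampleStream` builds a small synthetic stream with that same leading layout; it is not the real message, and `BlockDataPrimitives` is an illustrative name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class BlockDataPrimitives {
    /** Synthetic stream: primitives written this way end up inside a block data record. */
    static byte[] sampleStream() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeInt(1);
            oos.writeInt(20);
            oos.writeInt(1);
            oos.writeUTF("web"); // 2-byte length prefix, then the string bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Reads the assumed leading fields back in the same order they were written. */
    static String readLeadingFields(byte[] stream) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            int first = ois.readInt();
            int second = ois.readInt();
            int third = ois.readInt();
            String tag = ois.readUTF();
            return first + "," + second + "," + third + "," + tag;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readLeadingFields(sampleStream()));
    }
}
```

The rest of the real payload would need the same treatment field by field, and the order and types of the reads would have to be inferred from the samples.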

lqzhgood commented 2 years ago

I don't know how to get the thing you mentioned, https://github.com/NickstaDB/SerializationDumper/issues/18#issuecomment-1005913508, to run.