hozuki / libcgss

libcgss is a helper library for THE iDOLM@STER Cinderella Girls Starlight Stage (CGSS/DereSute/デレステ). It currently supports HCA audio decoding and ACB exploring. It also applies to other games like THE iDOLM@STER Million Live! Theater Days (MLTD/MiriShita/ミリシタ).

Decrypt errors for latest ADX2 SDK ver (v2.98) #4

Closed FZFalzar closed 6 years ago

FZFalzar commented 6 years ago

Hello again, I've been trying to use your tool to extract and decrypt, but I am encountering decryption errors. The key is correct, because the ACB plays fine when I load it into the CriAtomViewer from the v2.98 CRIWARE SDK. I have uploaded a sample AWB+ACB with the key inside. The audio format is HCA.

If you need a link to the SDK I am contactable via https://steamcommunity.com/id/fzfalzar

-Falz bgm.zip

hozuki commented 6 years ago

Thank you. From the keys you use this should be from Dragalia Lost.

It seems CRI upgraded both ACB and HCA. For ACB they added a mask for byte alignment, which took me some time to figure out. As for the HCA file, it looks pretty normal when comparing the header parts one by one. One noticeable difference from the files I experimented with before is that the header size is 397 (0x18d), much larger than the old ones at 96 (0x60). In any case, all the space other than header data and the checksum is filled with 0.
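The header size mentioned above can be read from the first eight bytes of an HCA file. This is only a minimal sketch, assuming the layout used by open-source HCA decoders (a 4-byte signature followed by a big-endian 16-bit version and a big-endian 16-bit header size). `read_hca_header_size` is a hypothetical helper name, and the sample bytes are the ones visible in the raid_inline.awb dump further down in this thread.

```python
import struct


def read_hca_header_size(data: bytes) -> tuple:
    """Return (version, header_size) from the start of an HCA file.

    Assumed layout: 4-byte signature, u16 version, u16 header size,
    all big-endian. The signature is not validated here.
    """
    _signature, version, header_size = struct.unpack_from(">4sHH", data, 0)
    return version, header_size


# First eight bytes of the HCA embedded in raid_inline.awb (from the dump).
sample = bytes.fromhex("c8 c3 c1 00 02 00 01 8d")
version, header_size = read_hca_header_size(sample)
print(hex(version), header_size)
```

With these bytes the header size field decodes to 0x018d, which agrees with the 397 quoted above.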

Here is my guess. If HCAs created by older SDKs can be played in the latest Atom, there are definitely some "signatures" in the file structure itself, not directly recognizable, that the new player understands, so that it can decode the HCA in the old way. Possible changes I can imagine are: 1) changing the decoding process; 2) adding some conditional branches to the existing decoding process; 3) changing value tables; 4) transforming encryption keys (after input, before initializing the ciph part). Of course, combinations of these are also possible. Method 1 may bring stronger side effects to CRI's product development, so I think it is less likely than the others. Some explanation of method 2: the main internal parameters of the decoding process are in the comp part (i.e. comp1-comp8), whose values' meanings are not easily observable. Maybe, just maybe, the new decoding process uses some values or value pairs that never appeared in older decoders, so new value handlers may have been introduced.

Well, I think I can still get the latest free version of the SDK, but I don't know if it supports decoding encrypted audio. I'll try it later.

By the way I have sent you a friend request on Steam. You can contact me there but I seldom open Steam these days, mostly because I don't have time to play those games. :/

hozuki commented 6 years ago

All right, ADX2LE does not provide encrypting/decrypting functionalities, even in CriAtomViewer. (sigh) I'll start RE instead.

FZFalzar commented 6 years ago

I couldn't find any imports in CriAtomViewer that would suggest decoding functionality, I would think it is baked into the app directly

I tried to sort functions by size, there were a few large ones with seemingly many float datatypes but I have no experience in sampling code :/

hozuki commented 6 years ago

Can the new player play HCAs of previous versions?

hozuki commented 6 years ago

FYI, I'm using the "lite" version of the SDK (ADX2LE), not the full version (ADX2). It can only generate type-0 HCA (in plain text). Its player treats all HCAs as type-0; maybe it can recognize type-1. In short, the lite version does not provide a way to encrypt (in Atom Craft) or decrypt (in Atom Viewer): there is no place to enter encryption keys.

FZFalzar commented 6 years ago

I'm trying to find any leftover ACBs in my PC to try and test on the new version

The CriAtomViewer that I tested with is the full version, not the Lite Edition. There is an option in the player preferences to set a key. [screenshot]

hozuki commented 6 years ago

You can use this file to run a test. The key is 59751358413602 (00003657 f27e3b22).

song_1001_mod.zip
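For anyone checking the two notations of the key above: the decimal value and the hex pair are the same 64-bit number, written once as a whole and once as its high and low 32-bit halves. A small sketch (`split_key` is just an illustrative name):

```python
def split_key(key: int) -> tuple:
    """Split a 64-bit key into its (high, low) 32-bit halves."""
    high = (key >> 32) & 0xFFFFFFFF
    low = key & 0xFFFFFFFF
    return high, low


high, low = split_key(59751358413602)
print(f"{high:08x} {low:08x}")  # prints "00003657 f27e3b22"
```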

FZFalzar commented 6 years ago

It plays fine. [screenshot: hca]

FZFalzar commented 6 years ago

Okay I found an ACB of Theater Days' opening music shipped with the APK, plays properly with the correct key as well

FZFalzar commented 6 years ago

@hozuki I noticed that some fields in the header are XORed with 0x80 as follows: [screenshot: xor]

hozuki commented 6 years ago

> @hozuki I noticed that some fields in the header are XORed with 0x80 as follows: [screenshot: xor]

That's normal. When reading the HCA header, each header signature must be ANDed with 0x7f to "neutralize" the effect of XORing with 0x80. Surprisingly, header signatures in official HCA builds (created by the full SDK, I guess) are always XORed. Signatures in type-0 (or type-1) HCAs (created by ADX2LE) are not XORed.
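A minimal sketch of that masking step, applied byte by byte so that masked and plain signatures compare equal. `normalize_signature` is a hypothetical helper name, not something taken from libcgss. The masked sample bytes are the ones that appear in the raid_inline.awb dump in this thread.

```python
def normalize_signature(raw: bytes) -> bytes:
    """Clear the high bit of every byte of a chunk signature."""
    return bytes(b & 0x7F for b in raw)


# Masked signatures as they appear in an official build.
print(normalize_signature(bytes.fromhex("c8 c3 c1 00")))
print(normalize_signature(bytes.fromhex("e6 ed f4 00")))

# A plain (ADX2LE-style) signature passes through unchanged.
print(normalize_signature(b"HCA\x00"))
```

Because ANDing with 0x7f leaves an already plain byte untouched, a reader can apply it unconditionally and does not need to know which SDK produced the file.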

The file I posted above is a homebrew HCA, encoded using the ADX2LE encoder and transformed by hcacc to apply encryption. Those signatures stay untouched so they are not XORed.

hozuki commented 6 years ago

After looking into libcri_ware_unity.so, comparing the one in Dragalia Lost (2018) with an old version (~2016), criWareUnity_SetDecryptionKey() seems unchanged. Code structure and parameters are the same; the differences are caused by the compiler. Usages of the global variables it affects (e.g. init state for decoders or something like that) also seem unchanged, but I'm not sure. According to @esterTion the key is intercepted directly before calling criWareUnity_SetDecryptionKey(), therefore the key is not tainted before the function call. I am currently stuck here.

FZFalzar commented 6 years ago

I have a feeling there is certain new metadata inserted into the ACB that tampers with the HCA in some way; I can open the AWB wave bank file but playing the individual Wave tracks through the viewer gives noise even with the correct key set, whereas if I loaded the ACB and played it through the cues it then works normally

esterTion commented 6 years ago

Weird, the ACB contains an inline AWB (~11KB), but it should be empty since there's an AWB by its side. BGM_OUT_0001_01_inline.awb.zip

Oops, I used another ACB grabbed from the IPA; the attached one has a ~7KB inline AWB. raid_inline.awb.zip

D:\UserDocument\Downloads\FastHCADecoder-v2.2.1\audio>node a.js raid.awb
AFSArchive {
  r:
   Reader {
     buf: <Buffer 41 46 53 32 02 04 02 00 03 00 00 00 20 00 b2 80 00 00 01 00 02 00 26 00 00 00 e3 4e 16 00 a3 9d 2c 00 63 ec 42 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >,
     pos: 0,
     length: 4385891 },
  length: 4385891,
  header:
   { offsetSize: 4,
     fileCount: 3,
     alignment: 2159149088,
     ids: [ 0, 1, 2 ],
     fileEndPoints: [ 38, 1461987, 2923939, 4385891 ] },
  files: { '0': <Buffer >, '1': <Buffer >, '2': <Buffer > } }

D:\UserDocument\Downloads\FastHCADecoder-v2.2.1\audio>node a.js raid_inline.awb
AFSArchive {
  r:
   Reader {
     buf: <Buffer 41 46 53 32 02 02 02 00 03 00 00 00 20 00 b2 80 00 00 01 00 02 00 1e 00 56 08 96 10 d6 18 00 00 c8 c3 c1 00 02 00 01 8d e6 ed f4 00 02 00 ac 44 00 00 ... >,
     pos: 0,
     length: 6358 },
  length: 6358,
  header:
   { offsetSize: 2,
     fileCount: 3,
     alignment: 2159149088,
     ids: [ 0, 1, 2 ],
     fileEndPoints: [ 30, 2134, 4246, 6358 ] },
  files: { '0': <Buffer >, '1': <Buffer >, '2': <Buffer > } }

FZFalzar commented 6 years ago

I think this inline AWB in some way affects playback of the underlying HCA

If you look at CpkMaker.dll in the latest SDK, the AWB header inside the CUtf class/CBinary initializes the Alignment field to some large, out-of-range number; the older AWB format had it at 32 unless set otherwise.

hozuki commented 6 years ago

> I think this inline AWB in some way affects playback of the underlying HCA

> If you look at CpkMaker.dll in the latest SDK, the AWB header inside the CUtf class/CBinary initializes the Alignment field to some large, out-of-range number; the older AWB format had it at 32 unless set otherwise.

The alignments in new versions should be masked with 0xffff. But I don't know if higher bits matter in the later decoding process.
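As a rough sketch of what that masking looks like in an AFS2 (AWB) header parser: the layout below is inferred from the two dumps posted earlier in this thread (byte 5 is the offset size, byte 6 is assumed to be the ID size, then a little-endian file count and alignment, the ID list, and file count + 1 end points). `parse_afs2_header` and the dictionary keys are hypothetical names, not the libcgss API.

```python
import struct


def parse_afs2_header(buf: bytes) -> dict:
    """Parse the fixed part of an AFS2 header (layout is an assumption)."""
    if buf[:4] != b"AFS2":
        raise ValueError("not an AFS2 archive")
    offset_size = buf[5]
    id_size = buf[6]  # assumption: 2 in both dumps from this thread
    file_count, raw_alignment = struct.unpack_from("<II", buf, 8)

    pos = 16
    ids = [
        int.from_bytes(buf[pos + i * id_size:pos + (i + 1) * id_size], "little")
        for i in range(file_count)
    ]
    pos += file_count * id_size
    end_points = [
        int.from_bytes(buf[pos + i * offset_size:pos + (i + 1) * offset_size], "little")
        for i in range(file_count + 1)
    ]
    return {
        "offset_size": offset_size,
        "file_count": file_count,
        "raw_alignment": raw_alignment,
        # Newer files carry extra data in the upper 16 bits, so mask them off.
        "alignment": raw_alignment & 0xFFFF,
        "ids": ids,
        "file_end_points": end_points,
    }


# Header bytes of raid_inline.awb, copied from the dump above.
inline = bytes.fromhex(
    "41 46 53 32 02 02 02 00 03 00 00 00 20 00 b2 80"
    " 00 00 01 00 02 00 1e 00 56 08 96 10 d6 18"
)
info = parse_afs2_header(inline)
print(info["raw_alignment"], info["alignment"], info["file_end_points"])
```

On the dumped header the raw field is the 2159149088 shown by the Node script, and the masked value is the familiar 32.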

hozuki commented 6 years ago

I updated ACB file parsing and extracting in two commits. I haven't pushed to the upstream yet.

Here are all the files included in the ACB. Some are unnamed (not appearing in the cue table).

raid.acb_files.zip

hozuki commented 6 years ago

I think the higher bits do not affect decryption (not decoding, sorry). If you directly play the extracted HCA (so no interference from ACB stuff) using the key it should still be OK. If so the truth will lie in HCA decoder itself.

FZFalzar commented 6 years ago

I just tried it, still noise, even with correct key

hozuki commented 6 years ago

Well this is interesting... two-phase keys.

FZFalzar commented 6 years ago

I just dropped a link to the new CriAtomViewer in your steam to try

hozuki commented 6 years ago

@FZFalzar Thanks for the tool.

I did some tests, including:

Here are some results:

An example of the internal AWB is OUT_COMMON.acb (in Dragalia Lost's APK assets). raid is an example of the external AWB.

hozuki commented 6 years ago

To be clear, if the AWB is an external one, its header also appears in the StreamAwbHeader table in the ACB. I tested (arbitrary/original higher bits) x (modified header in ACB and/or AWB); all the combinations work without error.

FZFalzar commented 6 years ago

Thanks for your analysis so far, I think it has to do with how all the assets are marked as StreamingAssets using the API

hozuki commented 6 years ago

That still doesn't answer the full question: where is the "switch", and how does it affect later procedures?

Some interesting facts observed:

  1. The original blocks, read directly from the file, passed the data integrity check. (workflow: block data -[compute checksum]-> integrity check -[decrypt]-> plain data -[decode]-> waveform)
  2. No error occurred on all-0 blocks; note that the decryption table always maps 0 -> 0.
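The first two stages of that workflow can be sketched as follows. This is a toy model rather than the Atom Viewer code. The CRC parameters (16-bit, polynomial 0x8005, initial value 0, checksum stored big-endian in the last two bytes so that the CRC of the whole block is 0) are an assumption taken from open-source HCA decoders, and `toy_table` is a made-up substitution table whose only realistic property is that it maps 0 to 0.

```python
def crc16(data: bytes, poly: int = 0x8005) -> int:
    """Bitwise MSB-first CRC-16 with initial value 0 (assumed parameters)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def block_is_intact(block: bytes) -> bool:
    """Integrity check: the CRC over payload plus stored checksum must be 0."""
    return crc16(block) == 0


def decrypt_block(block: bytes, table: bytes) -> bytes:
    """Byte-wise substitution, standing in for the real decryption step."""
    return bytes(table[b] for b in block)


# Hypothetical table: a permutation of 0..255 that keeps 0 fixed.
toy_table = bytes([0] + list(range(255, 0, -1)))

zero_block = bytes(64)
print(block_is_intact(zero_block))                       # integrity check
print(decrypt_block(zero_block, toy_table) == zero_block)  # unchanged by decryption
```

Under these assumptions an all-zero block has a zero checksum and is left untouched by any table that maps 0 to 0, so it passes both stages whatever the key is. That would be consistent with the second observation above.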

I located the decoding functions in Atom Viewer. The code seems... different from the known source code. There are also some new tables. Maybe it is because of many interferences. I'm not able to go through it yet. I don't know what the code was like in previous (full) versions. In the ADX2LE SDK v2.06 (currently the latest) the decoding pattern is almost the same as the known source.

FZFalzar commented 6 years ago

I happen to have a 2017 version of the SDK. What's peculiar is that the CriAtomViewer in this version gives noise when an ACB is loaded with the correct key, unlike the latest one, so something must have changed. I can pass you this one too.

hozuki commented 6 years ago

@FZFalzar Thank you. Some time ago I was going to ask if you could send a copy, but before I replied to this message here (or on Steam) the decryption was done. So it is not needed; but you can still keep it for future research purposes, I guess. Again, thank you for your kindness.

Now good news: I just found out what they do to the decryption. The "garbage" higher bits are used to get the real decryption key. I reversed about 15 functions, read them thoroughly, understood nearly every line, and then this tiny mod was revealed. Well you also need some RE techniques, static and dynamic (mostly static) analyses... And yes I decrypted and decoded the audio file. Here it is, as a proof.

BGM_IN_RAID_0001_01.zip

About the ACB itself: it is composed of 3 tracks (or 2, because the last two seem to be identical), all of which are mixed in the final playback. So the directly extracted file contains only the soundtrack (without the human voice).

Thanks to VGAudio, I made sure that the encoding/decoding skeleton, as well as the tables, are very unlikely to change, so I could focus on tracing the decryption keys.

I'm very tired now, so I'll write an article explaining the "new" encryption soon. Hopefully tomorrow. The formula is actually very simple: key' = key * ((uint64_t)(garbage << 16) | (uint16_t)(~garbage + 2)). (Obviously, when garbage == 0, as in older ACBs, the key is unchanged.) (Sorry, this sentence is wrong. They check whether garbage == 0.) But finding it was quite hard. However, it still cannot explain why decryption and decoding still succeeded even when I changed these bits. With this "new" encryption, it will be unsafe to decode an HCA directly in the future. You have to consider the ACB/AWB file as well. As a side effect, the code in this repo also has to be updated... (sigh)
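A sketch of that formula in Python, under two assumptions that the one-line formula does not spell out: `garbage` is treated as an unsigned 16-bit value (so the shift does not sign-extend), and the multiplication wraps around at 64 bits as it would with uint64_t. `derive_key` is a hypothetical name, and the explicit zero check mirrors the correction above.

```python
MASK64 = (1 << 64) - 1


def derive_key(key: int, garbage: int) -> int:
    """Apply key' = key * ((garbage << 16) | (uint16)(~garbage + 2)) mod 2**64."""
    garbage &= 0xFFFF
    if garbage == 0:
        # Older ACB/AWB files: the key is used as-is.
        return key & MASK64
    low = (~garbage + 2) & 0xFFFF      # (uint16_t)(~garbage + 2)
    multiplier = (garbage << 16) | low
    return (key * multiplier) & MASK64  # assumed uint64_t wrap-around


# The upper alignment bits of raid.awb in this thread are 0x80b2.
print(hex(derive_key(1, 0x80B2)))  # with key == 1 this shows the multiplier itself
```

Calling it with a key of 1 is only a way to inspect the multiplier. A real call would pass the game's base key together with the 16-bit value read from the AWB header.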

blueskythlikesclouds commented 6 years ago

Is this "garbage" unique for each AFS2?

Thealexbarney commented 6 years ago

Not that you're wrong, but if you're posting proof it might be better to post the decrypted HCA =P raid.zip

hozuki commented 6 years ago

@Thealexbarney Fair enough. :D

hca_proof.zip

The original tracks (3 individual, not 2, I didn't decode all at first), and their type-0 equivalents (simply using hcacc and voila). Decoded WAV files are too large so I'll not post them here. The true key after transformation is 0x5580E165A92C2C63 for this ACB/AWB.

hozuki commented 6 years ago

@blueskythlikesclouds

> Is this "garbage" unique for each AFS2?

Yes. CRI stores them in the two unused bytes, right beside the byte alignment. The former layout was 20 00 00 00, where all four bytes are read as one uint32_t (align=32). But in practice the two higher bytes are always unused, so that's a nice place to hide something: 20 00 cd ab, where 0xabcd is the secondary key ("garbage", lol).

hozuki commented 6 years ago

@blueskythlikesclouds An example of a different key: OUT_COMMON.acb (inside Dragalia Lost's APK assets folder) has the secondary key 0x5c83; this dword is 20 00 83 5c at offset 0x382c.

Thealexbarney commented 6 years ago

Or you could read offset 0xE in the awb file.
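A minimal sketch of that read, assuming the AFS2 layout seen in the dumps earlier in this thread, where the 32-bit alignment field starts at offset 0xC and its upper half therefore sits at 0xE. `read_awb_subkey` is a hypothetical helper name.

```python
import struct


def read_awb_subkey(header: bytes) -> int:
    """Return the little-endian u16 stored at offset 0xE of an AFS2 header."""
    if header[:4] != b"AFS2":
        raise ValueError("not an AFS2 archive")
    (subkey,) = struct.unpack_from("<H", header, 0x0E)
    return subkey


# First 16 bytes of raid.awb, copied from the dump in this thread.
raid_header = bytes.fromhex("41 46 53 32 02 04 02 00 03 00 00 00 20 00 b2 80")
print(hex(read_awb_subkey(raid_header)))  # prints "0x80b2"
```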

BTW, don't know if you know already, but VGAudio has an HCA encoder. Much faster than hcaenc.dll too.

hozuki commented 6 years ago

> Or you could read offset 0xE in the awb file.

> BTW, don't know if you know already, but VGAudio has an HCA encoder. Much faster than hcaenc.dll too.

Yes, I saw that. I accidentally found it on Google when I didn't know where to go next. At that time I was beginning to reverse all the decoding code. And it is your work (both the encoder and the decoder, plus the theory unveiled) that told me there is actually a way to decode HCA without using the tables in the official decoder. Thank you. Based on this fact, I was able to make a deduction: since they kept the tables unchanged, and the decoding code unchanged (I could find part of the pattern and had to make a guess), something else must have been changed. Combined with the test results, it is probably related to the ACB/AWB itself, as in the discussion above.

When I opened the GH repo tree and found out it can encode HCA, I was shocked. Then I tried a bit with various files. Just playing. XD I haven't compared the speed; but a managed and open-source implementation is already a great idea. I'm always afraid of signal processing though.

FZFalzar commented 6 years ago

I get the feeling @hozuki

DSP is a monster for the untrained, but really good work to all involved! I do hope this applies to new ACB/AWB formats and not just a one-off event for Dragalia. I doubt this is the case though since this is an implementation only CRIWARE can control

I'm looking forward to reading the write-up, and I'd send beer money if I can :D

hozuki commented 6 years ago

Hi all. The English translation is finally finished. Code for a new tool, acb2wavs, was pushed to DereTore and this repo. So generally everything is set and I think this issue can be closed now. Thanks everyone for participating. :)

ActualMandM commented 3 years ago

Wanted to clarify, since Sonic Colors Ultimate shipped its XML files: it's actually the AWB hash. se_voice_system_E.zip

AwbHash="30898" [screenshot: 2021-09-13_21-50-44_HxD]