tibiacast / tibiarc

GNU Affero General Public License v3.0

Support for TibiCAMs 8.00 and later (aka. version 518) #13

Closed: gurka closed this 2 months ago

gurka commented 2 months ago

Hi,

I added support for TibiCAM file version "518" to my OldSchoolTibia tools today: https://github.com/gurka/OldSchoolTibia/blob/master/tools/libs/recording.py#L71 It has not been extensively tested yet, but it seems to be able to read both 8.00+ files and older files (i.e. the change didn't seem to break anything).

The only two things I really had to change were in what I call "simple_decrypt":

  1. Use divisor=6 when version=518
  2. Rewrite the divisor/modulo part, as there were some problems related to signedness and the modulo. Previously, for version 515 and divisor=5, I seem to have worked around this problem (without even realizing or figuring out the root issue) with this: https://github.com/gurka/OldSchoolTibia/blob/62680d3536f24f696c16bf13e3301d79f84a6667/tools/libs/recording.py#L89 I realized that for divisor=8 it doesn't matter if minus (which is what I call the value in my code) is treated as an unsigned or signed 8-bit value, e.g. for 0xb2, both 178 mod 8 and -78 mod 8 are 2. With divisor=5, the difference is always 1 (178 mod 5 = 3, -78 mod 5 = 2), which I guess explains why my workaround for 515 worked.
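The signedness observation can be checked with a small C sketch. It assumes that "mod" above means floored modulo (as Python's `%` behaves), which C's `%` does not do for negative operands. The helper names `floor_mod`, `as_signed8` and `signedness_gap` are made up for illustration and appear in neither project.

```c
#include <stdint.h>

/* Floored modulo: the result is always in 0..divisor-1 for a positive
 * divisor, matching Python's % operator. */
static int floor_mod(int value, int divisor) {
    int r = value % divisor; /* C truncates toward zero, so r may be negative */
    return r < 0 ? r + divisor : r;
}

/* Interpret a byte as a signed 8-bit value using plain arithmetic, so the
 * result does not depend on how the compiler converts to int8_t. */
static int as_signed8(uint8_t byte) {
    return byte < 128 ? (int)byte : (int)byte - 256;
}

/* How far apart the unsigned and signed interpretations of the same byte
 * are, modulo the divisor. */
static int signedness_gap(uint8_t byte, int divisor) {
    int unsigned_mod = floor_mod((int)byte, divisor);
    int signed_mod = floor_mod(as_signed8(byte), divisor);
    return floor_mod(unsigned_mod - signed_mod, divisor);
}
```

For bytes of 0x80 and above the two interpretations differ by exactly 256, so the gap is 256 mod divisor: 0 for divisor=8, 1 for divisor=5, and 4 for divisor=6. For 0xb2 that gives `signedness_gap(0xb2, 8) == 0`, `signedness_gap(0xb2, 5) == 1` and `signedness_gap(0xb2, 6) == 4`, which suggests why an off-by-one workaround could work for 515 but not carry over to 518.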

Anyway. I decided to look at tibiarc's TibiCAM parser to see if I could update it as well, but it seems that the divisor=6 for version=518 is already implemented. I also tested the "divisor/modulo" part and it seems to calculate the same values as my tool now does. Instead the issue seems to be with the AES decryption...

Note that I didn't change anything related to the AES decryption at all in my tools, so I don't understand what the issue with tibiarc's parser is, but maybe something to look into.

gurka commented 2 months ago

Actually, after doing some more testing I can see that there is a difference in the output after the "divisor/modulo" part when comparing tibiarc and my tools. It often outputs the same values, but not always. Will continue the investigation.

gurka commented 2 months ago

This seems to do the trick:

    if (divisor) {
        uint8_t key = state->Frame.Length + state->Frame.Timestamp + 2;

        for (uint32_t i = 0; i < state->Frame.Length; i++) {
            int8_t beta, alpha;

            alpha = key + i * 33;
            beta = alpha % divisor;
            beta = beta >= 0 ? beta : (beta + divisor);
            if (beta != 0) {
                alpha += divisor - beta;
            }

            state->Frame.CipherData[i] -= alpha;
        }
    }

No idea if the change to uint8_t and int8_t is necessary, or if the names alpha and beta still make sense; I was mostly just doing some trial and error until it worked :D

jhogberg commented 2 months ago

That’s great, thanks!

I suspect that int8_t makes it work through undefined behavior, what with key + i * 33 yielding a negative alpha most of the time. If so, I wouldn’t be surprised if the original code looks much like what you wrote.

Now that we’ve got a solid idea of how it should work, we just need to launder the UB 🙂

gurka commented 2 months ago

Oh yeah, overflowing a signed integer leads to UB. I forgot about that :D

I will take a look at this some time later, when I can focus and try to understand what's really going on instead of just guessing and doing a bit of trial and error

jhogberg commented 2 months ago

Looking at https://github.com/tulio150/tibia-ttm/blob/6f49dc4bacf2828acc49b0d7631f6a5a44af9e57/Tibia%20Time%20Machine/video.cpp#L483, and assuming that CHAR is signed, I'm pretty sure that emulating that undefined wraparound is what we need.

I've merged a UB-free version now that works for all the files I've tested so far between 7.11 and 8.11, thanks again for figuring this out. :)
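For readers following along, here is one way the signed wraparound could be emulated using only well-defined arithmetic. This is a sketch derived from the snippet posted earlier in the thread, not the version merged into tibiarc. The function names `frame_offset` and `decrypt_frame_sketch` are hypothetical, as is the assumption that the frame length and timestamp are 32-bit unsigned values.

```c
#include <stdint.h>

/* Per-byte value to subtract from the cipher data. The 8-bit wraparound is
 * done in unsigned arithmetic, and the signed interpretation is computed
 * explicitly instead of through a narrowing conversion to int8_t. */
static uint8_t frame_offset(uint8_t key, uint32_t index, int divisor) {
    uint8_t raw = (uint8_t)(key + index * 33u); /* wraps modulo 256 */
    int alpha = raw < 128 ? (int)raw : (int)raw - 256;
    int beta = alpha % divisor;                 /* may be negative in C */

    if (beta < 0) {
        beta += divisor;                        /* now in 0..divisor-1 */
    }
    if (beta != 0) {
        alpha += divisor - beta;                /* round up to a multiple */
    }

    /* Conversion to an unsigned type is defined as reduction modulo 256. */
    return (uint8_t)alpha;
}

/* Applies the offsets to a whole frame in place; does nothing when the
 * divisor is not positive. */
static void decrypt_frame_sketch(uint8_t *data, uint32_t length,
                                 uint32_t timestamp, int divisor) {
    if (divisor <= 0) {
        return;
    }

    uint8_t key = (uint8_t)(length + timestamp + 2u);

    for (uint32_t i = 0; i < length; i++) {
        data[i] = (uint8_t)(data[i] - frame_offset(key, i, divisor));
    }
}
```

With key 0xb2 and index 0, the signed value is -78, so this yields an offset of 181 for divisor=5 (rounded up to -75), 184 for divisor=8 (rounded up to -72), and 178 for divisor=6 (-78 is already a multiple of 6).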

jhogberg commented 2 months ago

By the way, I saw your infodump branch, and had a gander at https://github.com/gurka/OldSchoolTibia/issues/1.

I've been thinking of splitting parsing and updating the game state, mainly in an attempt to make it feasible to brute-force message/speak type discovery which is a major pain to do manually, but it should also make general data extraction a whole lot easier. I'm hoping to have the time to do that some time this autumn.

gurka commented 2 months ago

What do you mean by splitting the parsing and updating of game state? Wouldn't updating the game state be necessary anyway? Or well, I guess that depends on what kind of information you want to extract.

I've spent some time working on the different tools and scripts I have in my OldSchoolTibia repository, adding support for a couple of other formats, adding scripts like "sort_recordings.py", "guess_world.py", and so on. Most recently I've been working on a script that uses the tibiarc library to analyze recordings and extract information (not that much progress right now: https://github.com/gurka/OldSchoolTibia/commit/d74f8cc1099ef2f9eaa7479eef100fcc3d1491db), but the plan and goal is to be able to do exactly what that bug ticket mentions.

So in essence, "play" the recording and inspect the game state after each packet has been processed and extract things like player name (which could change, if the user re-logged another character during the recording), level, skills, equipment, creatures seen and other interesting events.

Then you could combine this script with the sort_recordings.py and guess_world.py scripts, and it would automatically scan a given directory (which could contain thousands of recordings); for each recording it would detect format, Tibia version, guess the game world and finally "play" the recording and extract all kinds of information and events, and output it all to json or some other format.

jhogberg commented 1 month ago

The idea is to parse all packets into an abstract format (edit: tagged union basically) that can be easily inspected by library users. The user wouldn’t have to know what to do with them in the general case, as the normal flow would be to pass them directly to the routine that updates the game state, but when interested they’ll be free to inspect what happens. This has a few nice benefits:

  1. The user no longer has to poll for changes. Going with your example of relogging, they can just keep a lookout for world initialization packets instead of having to check the game state after every packet. Detecting stuff like unjustified kills, players dying, skill advances, et cetera, becomes almost trivial under this model.
  2. We can have pseudo-packets for things like “new creature” instead of having it in-place like the Tibia protocol (0x61-tagged in-line objects) to simplify that kind of analysis.
  3. It gives us a way to toss chat channels “over the fence” from the library part to the player part, where it’s a lot easier to deal with.
  4. Last but not least, it lets us break the hard dependency on having Tibia data files to parse recordings, as we could realistically convert recordings to a format that can then be parsed without any metadata (it also simplifies redaction which is extra relevant post-GDPR). It’s hard to get away from needing sprites, but those could realistically be fudged with alternative sprite packs.

    This is important for preservation as I asked CipSoft for permission to rehost their data files verbatim, and was denied with the reason “… Please understand that Tibia is and [sic] never has been Open Source, and I hope you understand that we are not happy to see intellectual property of ours being distributed without our permission.” The irony is palpable given their shameless plagiarism of sprites from Ultima, and how their newest fan site TibiCam.com is literally built around your open-sourced work, but it is what it is. At least the project itself is in the clear.

It might be absolute overkill though, and it’ll most likely take a fair bit of time to finish: most things are easy to deal with, but nested stuff like map updates are tricky to express. I hope to find the time to make an honest stab at it 🙂
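As a rough illustration of the tagged-union idea, a parsed event stream might look something like the sketch below. Every name here (`EventType`, `Event`, `count_logins`, and the individual fields) is hypothetical and chosen only for the example; none of it is taken from tibiarc.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds that a parser could emit. */
enum EventType {
    EVENT_WORLD_INITIALIZED,
    EVENT_CREATURE_SEEN,
    EVENT_SKILL_ADVANCED
};

/* Tagged union: Type says which member of Data is valid. */
struct Event {
    enum EventType Type;
    union {
        struct { uint32_t PlayerId; } WorldInitialized;
        struct { uint32_t CreatureId; } CreatureSeen;
        struct { uint8_t Skill; uint8_t NewLevel; } SkillAdvanced;
    } Data;
};

/* A library user interested in relogs only has to watch for one event kind
 * instead of comparing the game state after every packet. */
static size_t count_logins(const struct Event *events, size_t count) {
    size_t logins = 0;

    for (size_t i = 0; i < count; i++) {
        switch (events[i].Type) {
        case EVENT_WORLD_INITIALIZED:
            logins++;
            break;
        default:
            /* Other events would normally be passed on to the routine
             * that updates the game state. */
            break;
        }
    }

    return logins;
}
```

Under this assumed design, the same array of events could be fed both to an analysis routine like `count_logins` and to the game-state updater, which is what would make the polling step unnecessary.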