robertabcd / lol-ob

a league of legends spectator mode game downloader
MIT License

Question #1

Closed ryancole closed 9 years ago

ryancole commented 11 years ago

Hello,

I'm trying to understand your blowfish decryption of the chunk and keyframe data. I had a few questions. The payload header contains an encryption key, but in decrypt.rb LN31-33 it looks like you're pulling the blowfish decryption key from the JSON metadata portion of the file.

To decrypt the keyframe and chunk data, we do not use the encryption key from the payload header, but instead we use a key derived from the JSON metadata - is this correct?

This is confusing me, because in the JSON metadata that I'm extracting from my own replay files, I do not see the gameKey.gameId, or key keys that you are using to extract the decryption key. I do see a gameId, which I assume is the same as the first, but I do not see a JSON metadata property named key. Have these changed, or am I not understanding your code? I'm not a Ruby programmer and your code is pretty straightforward, but I may be misinterpreting something.

Thanks!

lukegb commented 11 years ago

Here's some code I wrote with robertabcd's help - it's a Python script which should successfully parse and decrypt LoL (official) replays.

To clarify, the JSON file referred to in the decrypt script is a custom one, not the Riot metadata.

https://gist.github.com/d2997a5fc7970ce6e1e1

lukegb commented 11 years ago

I'd love to be updated on anything you're doing - trying to figure out the key frame/chunk format myself ;-)

ryancole commented 11 years ago

@lukegb Ah. Ok, that makes some sense then. I saw a disconnect between the Ruby script that reads the file and the other script that decrypts the chunk data, so I figured it might be something like that. I'll check out your code. I've basically just converted his Ruby code to C#, and took more of a stream-based approach to reading it. My code is here: https://github.com/ryancole/LeagueReplayReader

Also, thanks for your code. I will check it out and see if it clarifies anything!

Edit: Could you possibly give a simple explanation of the answer to my question, also? I'm looking over your code, but it has been a long time since I've used Python. So I just want to be sure I understand it properly as I look over it. What key is used for decrypting the chunk and keyframe data?

robertabcd commented 11 years ago

First to answer your question. When decrypting ROFL files, the gameId and key in the header should be used, and there should be no key in the metadata json.

There is a historical reason for this. decrypt.rb was originally designed to work with the output of download.pl, and it was written before the ROFL format was reverse-engineered.

download.pl produces meta.json by taking the JSON returned from getGameMetaData and adding a user-supplied key attribute. (The key can only be obtained from the retrieveInProgressSpectatorGameInfo RPC call through the PVP.net client, or from the featured games JSON. It is not exposed in the "replay cloud.")
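To make the meta.json step concrete, here is a minimal Python sketch of the merge described above. The property names `gameKey.gameId` and `key` come from this thread; the sample response, the `platformId` value, and the key string are invented placeholders, and the real download.pl may do more than this.

```python
import json

def build_meta(game_meta_json: str, key: str) -> dict:
    """Merge a getGameMetaData response with a user-supplied key."""
    meta = json.loads(game_meta_json)
    meta["key"] = key  # not part of Riot's metadata; added by the user
    return meta

def decryption_inputs(meta: dict) -> tuple:
    """Return the (gameId, key) pair that a decrypt step would read."""
    return str(meta["gameKey"]["gameId"]), meta["key"]

# Hypothetical, trimmed-down response and key, for illustration only.
sample = '{"gameKey": {"gameId": 123456, "platformId": "EXAMPLE1"}}'
meta = build_meta(sample, "bm90LWEtcmVhbC1rZXk=")
print(decryption_inputs(meta))
```

A ROFL file would not go through this step, since its header already carries both values.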

After the ROFL format was reverse-engineered, I found that the keyframes and chunks are identical to those downloaded from the "replay cloud", so I tried to reuse the script. That is why the JSON metadata in ROFL files contains different information from what decrypt.rb expects.

Keeping this open unless clarified somewhere in the source tree.
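For readers who want the decryption flow spelled out, the following Python sketch shows one way it is commonly described. The exact scheme is an assumption here and is not confirmed in this thread: the header's encryption key is treated as base64 text that is Blowfish-ECB decrypted with the game ID (as a decimal string) as the key, and the result is then used to decrypt each chunk or keyframe before gunzipping it. The cipher is passed in as a function so the sketch does not depend on any particular crypto library.

```python
import base64
import gzip
from typing import Callable

# (key, data) -> plaintext. With PyCryptodome this could be, for example:
#   lambda key, data: Blowfish.new(key, Blowfish.MODE_ECB).decrypt(data)
EcbDecrypt = Callable[[bytes, bytes], bytes]

def strip_padding(data: bytes) -> bytes:
    """Drop PKCS#5-style padding: the last byte says how many bytes to cut."""
    pad = data[-1]
    if not 1 <= pad <= 8:  # Blowfish works on 8-byte blocks
        raise ValueError("unexpected padding length %d" % pad)
    return data[:-pad]

def derive_chunk_key(game_id: int, encryption_key_b64: str,
                     ecb_decrypt: EcbDecrypt) -> bytes:
    """Unwrap the header key, using the game ID (as text) as the cipher key."""
    wrapped = base64.b64decode(encryption_key_b64)
    return strip_padding(ecb_decrypt(str(game_id).encode("ascii"), wrapped))

def decode_payload(payload: bytes, chunk_key: bytes,
                   ecb_decrypt: EcbDecrypt) -> bytes:
    """Decrypt one chunk or keyframe, strip padding, then gunzip it."""
    return gzip.decompress(strip_padding(ecb_decrypt(chunk_key, payload)))
```

If the gunzip step fails with a header error, the key derivation is the first thing to re-check.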

lukegb commented 11 years ago

Clarified file format at https://github.com/robertabcd/lol-ob/wiki/ROFL-Container-Notes

Might want to edit that - I just braindumped into it.

themasch commented 10 years ago

Do you know if the same format applies to the chunks coming over the Spectator REST API? I tried to decode such a chunk using the described mechanism but I'm just getting random data. zlib (Gunzip) fails with "incorrect header check". [EDIT] Whoops, nvm, it's working! I just confused "key" with "password" when creating a decipher.

Tastefull commented 10 years ago

Hi there,

I have a little "project" that I would like some help with if possible. I don't mean to hijack the thread, but it's relevant to the chunk decryption you're talking about.

Outline of what I want to do: We are doing a lot of live streaming of our LoL tournaments and league, but we can't stream them all. So I want to make a "LoL text event echo'er". I did this many years ago for Counter-Strike, and it is a simple way for people to follow a match and get the main actions. I would integrate it into an IRC bot and do the "live echo" in a channel there.

The output I'm looking for is something similar to this:

    ** Player 1 (2/1/4) killed Player 2 (1/3/2)
    ** Teamscore: Blue 11 (2 turrets) - 2 Red (2 turrets)
    ** Player 3 killed the dragon
    ** Player 2 destroyed an outer turret

etc.

Is this in any way possible by decrypting the spectator stream?

Kind regards \Thomas Hansen

themasch commented 10 years ago

That's pretty much what my goal for this project was. I'm hoping that we get all this information once we are able to decrypt the stream. I'd really love to get some people together and collect some know-how to get this done; currently I don't see a way to make any progress. Downloading spectator data is possible, and decrypting and decompression seem to work, but I have no clue how to read the data. The encoding is still unclear.

Zero3 commented 10 years ago

@themasch Count me in. I got a couple of ideas too, but a little lack of time at the moment. I have previously spent some time inspecting the data though, and the good news is that it definitely is structured in some way. The bad news is that I don't yet know in which format. If you start something up, I'm interested in hanging around. I can't promise when I will have time to properly dig into this though.

themasch commented 10 years ago

@Zero3 that sounds good. I know these time issues very well, suffering from the same. I might drop some code soon, but I can't promise anything currently. I'll let you know if something becomes available.

Tastefull commented 10 years ago

Can you try to send me some samples of the chunks that are decrypted and decompressed? Then I'll try to take a look at them :)

\T

robertabcd commented 10 years ago

This project may help: https://code.google.com/p/packet-lol/ Last time I tried, it matched some parts of the data in chunk files, but not perfectly.

ryancole commented 10 years ago

You're not going to be able to make sense of the chunks without being able to look at it inside of a debugger. There are some plain text strings throughout it, but the majority is just bytes of data.

jaagupkymmel commented 10 years ago

I would love to help, but I don't have any experience with decrypting packets etc.

themasch commented 10 years ago

@ryancole I'm afraid you are right.

I just collected all the information I could find into one document (in awful English). If someone knows something that's missing, feel free to add it: https://gist.github.com/themasch/8375971

Divi commented 10 years ago

@themasch https://gist.github.com/themasch/8375971#endofgamestatsregion-gameid--amf- endOfGameStats is actually a base64-encoded AMF file. Here is an example after converting the AMF to an array: http://pastebin.com/KB4TUPhs

themasch commented 10 years ago

@Divi wow, nice. Thanks. Will update the document soonish.

robertabcd commented 10 years ago

@themasch Just noticed. Consider changing "region" to "platformId", which conforms to the JSON returned from "../featured"?

Divi commented 10 years ago

@robertabcd In the RTMP API, this value is called "originalPlatformId" when retrieving current game data.

robertabcd commented 10 years ago

@Divi I see... Any ideas on differences between original and not-original? Edit: mentioned wrong person.

themasch commented 10 years ago

I think platformId does the job until we know whether there's an original and a not-original version. I just updated the gist. Should I make a normal repository out of this so we can have issues and commits and stuff?

Divi commented 10 years ago

@themasch I guess a repository is better. @robertabcd The RTMP API has pretty old data/key names, so I guess platformId is correct. Btw, if you want to see the full current game API response: http://pastebin.com/rZx4tjEE

EDIT: About the packet sniffing, maybe we should check some of the hacking/cheat forums on LoL; they are often the first to decode packets.

This one has many tools on the forum: http://botoflegends.com/forum but it seems to be down a lot of the time (DDoS or host issue, I don't know).

Divi commented 10 years ago

@robertabcd @themasch I did a lot of research, and spectator packets are not the same as live game packets. Chunk/keyframe files are clearly a list of packets, but I don't know the separator between each packet. I don't have the skills for decoding packets; maybe @Zero3 can do that with his ideas.

Zero3 commented 10 years ago

@Divi If a spectator "frame" actually contains several "packets", there might not even be a separator (it might be implicit given the context). It really all depends on the encoding used, which we don't know yet. I think decoding the packets will be a fun challenge, and I really want to help out with this once I get some time on my hands (which unfortunately is not in the very near future because of studies). Hint: There are a lot of plaintext strings in the decrypted packets which should make this job significantly easier.

Zero3 commented 10 years ago

By the way: If someone goes ahead with this, they should definitely get in contact with the guy behind http://www.leaguereplays.com/. By the looks of the decompiled .NET code, this guy knows a lot about the internal data structures used by LoL. Chances are that a lot of things are reused in the spectator packet format. There is of course also the possibility of asking Riot about the format. They probably won't make this stuff public (given their efforts with encrypting the packets in the first place), but perhaps they are willing to cooperate under an NDA or some other legal contract about the purpose of using the data. Who knows? :).

Divi commented 10 years ago

@Zero3 I don't think "LoL Replay" knows about the data structure, because it just downloads the chunks and keyframes and puts them in one single file. When a player wants to watch a game, it creates a local REST server that the LoL client can read from. Btw, good luck with your studies :)

themasch commented 10 years ago

@Zero3 not sure how much these guys know. Besides that, I'm no fan of stuff like NDAs at all ;) @Divi @robertabcd @ryancole and all the others: https://github.com/themasch/leaguespec I just created a repository for collecting information. I'd really love to see pull requests and issues pile up ;) I don't feel like giving random ppl commit permissions yet, but I'll give them later to ppl who participate in the project and behave nicely. I hope you understand that; I'm just trying to prevent a mess. Is that fine for everyone?

ryancole commented 10 years ago

As most people know, the chunks and keyframes within a replay file do indeed contain packet data. This has been confirmed to me by two friends of mine who both work at Riot. But even without that, I think it's pretty obvious to most people who have seen the chunks and watched that data flow through League in a debugger.

Anyway, the packets are not delimited by anything, based on my research. Some packets have static sizes, so the size of the packet within the chunk is known / hard-coded. Other packets, with variable-sized payloads, have length identifiers, which tell you how large the packet is going to be on a per-packet basis. This is common among many different game protocols, notably those of Blizzard products.
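The framing described here (no delimiters, some types with a hard-coded size, others with a length field) can be sketched in Python as below. Every concrete detail is hypothetical: the one-byte type ID, the size table, and the 2-byte little-endian length prefix are invented for illustration and do not describe the real LoL packet layout.

```python
import struct

# Hypothetical packet types: the IDs and sizes are made up for this example.
FIXED_SIZES = {0x01: 4, 0x02: 8}   # type ID -> payload size in bytes

def split_packets(buf: bytes):
    """Yield (type_id, payload) pairs from an undelimited packet stream."""
    pos = 0
    while pos < len(buf):
        type_id = buf[pos]
        pos += 1
        if type_id in FIXED_SIZES:
            size = FIXED_SIZES[type_id]                   # size implied by type
        else:
            (size,) = struct.unpack_from("<H", buf, pos)  # explicit length prefix
            pos += 2
        if pos + size > len(buf):
            raise ValueError("truncated packet of type 0x%02x" % type_id)
        yield type_id, buf[pos:pos + size]
        pos += size

stream = bytes([0x01]) + b"abcd" + bytes([0x10]) + struct.pack("<H", 3) + b"xyz"
print(list(split_packets(stream)))
```

The practical consequence is that a single wrong size anywhere desynchronizes everything after it, which is why guessing the format without a debugger is so hard.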

Now, I spent about a month looking at the replay file's chunks going through the League client, in a debugger. The logic is pretty simple and works as you'd expect. The game uses the chunk data to increment the game state in real time, as you'd expect. I never did manage to fully map out any significant packet format. I have an extremely well documented IDA project file, from about 7-8 months ago, in which I have the replay system's main loop fully documented. I have bookmarks set on many of the different packet handler functions.

The reason I stopped working on this, though, is that I encountered some code that completely confused me and I just couldn't wrap my head around it any more. It threw me out of the zone mentally, if you will. Basically, while debugging one of the packet handlers, I watched as the League client was doing its normal parsing of the replay file, grabbing a length, reading in the data, etc., and then all of a sudden the function decided to go read some data from way down at the bottom of the chunk file. This is confusing, because you'd expect it to just read from top to bottom (which it does do for the most part). This led me to believe that the packets in the chunk file are able to reference some sort of shared memory.

Basically, my final concluding thought on the replay files is that a single chunk contains more than just packets. It contains packets as well as some sort of shared state buffer or memory, and packets can reference this memory, it looks like. The packets came first in the chunk file and the shared memory appeared to be at the bottom of the chunk. That's about all I determined before I decided to take a break from it.

ryancole commented 10 years ago

Does anybody in here happen to have a .rofl file that I could test with? My PBE account appears to have been marked inactive.

Divi commented 10 years ago

@themasch thanks for the repo :) @ryancole what kind of software/debugger do you use to inspect the client? I don't have a .rofl file, unfortunately. I'm pretty sure that if you ask for a file on Reddit, someone will send you one by PM.

ryancole commented 10 years ago

@Divi I use a program called IDA Pro. It's not free, but it's real nice. There are some free tools out there, such as WinDbg or OllyDbg, that can at least debug the program.

Divi commented 10 years ago

@ryancole wow, that's a lot of options. I don't know if I'll be able to use it, but thanks, I'll test it :)

tyscorp commented 10 years ago

All of this is very relevant to my interests.

Zero3 commented 10 years ago

@Divi @themasch Thanks :). Regarding LOL Recorder, I think I must disagree. You can check out the PacketLogger.cs assembly of LOL Recorder. Here are some of the interesting lines:

    PacketLogger.procWatcher = new ProcessWatcher("League of Legends", "RiotWindowClass", (string) null, 1000);
    ...
    PacketLogger.procWatcher.ProcessCreated += new ProcessWatcher.ProcessEventHandler(PacketLogger.procWatcher_ProcessCreated);
    ...
    private static void procWatcher_ProcessCreated(Process proc)
    {
        ...
        if (!Inject.InjectDll(proc.Id, Path.Combine(LOLUtils.LOLReplayPath(), "Recorder.dll"), true))
        ...
    }

I don't know what Recorder.dll actually does, but looking at some of the other .NET code seems to indicate that it does indeed record and save some stuff not currently available in plaintext through the spectator API. All I'm saying is that it looks like this guy knows some stuff about the LoL internals that might be useful for this project.

@ryancole This might be an optimization in the serialization format. It is somewhat commonly used for strings (or other payloads) that are reused throughout the packet. Instead of including the string each time it is used, it is referenced by a temporary numeric string ID. Then, at the end of the packet, an ID lookup table with the actual strings is included. When you transfer a gazillion of these packets per day, these small savings end up being quite significant.

It could of course also be other things, but that is the first thing that came to my mind when I read your comment. I understand your frustration though! :)
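The string-table idea suggested in this comment can be illustrated with a toy Python format. The layout is purely hypothetical (a 4-byte table offset, a body of 2-byte string IDs, then a table of length-prefixed ASCII strings) and is not claimed to match the real chunk format; it only shows why a parser would jump to the end of the buffer before reading the body.

```python
import struct

def decode_with_string_table(buf: bytes):
    """Resolve string IDs in the body against a table stored at the end."""
    (table_off,) = struct.unpack_from("<I", buf, 0)
    # Read the lookup table first, even though it sits at the end.
    (count,) = struct.unpack_from("<H", buf, table_off)
    pos = table_off + 2
    table = []
    for _ in range(count):
        length = buf[pos]
        table.append(buf[pos + 1:pos + 1 + length].decode("ascii"))
        pos += 1 + length
    # The body is a run of 2-byte string IDs between header and table.
    ids = struct.unpack_from("<%dH" % ((table_off - 4) // 2), buf, 4)
    return [table[i] for i in ids]

def encode_example(strings, ids):
    """Build a buffer in the toy layout, for demonstration only."""
    body = struct.pack("<%dH" % len(ids), *ids)
    table = struct.pack("<H", len(strings))
    for s in strings:
        raw = s.encode("ascii")
        table += bytes([len(raw)]) + raw
    return struct.pack("<I", 4 + len(body)) + body + table

packet = encode_example(["Spell2", "ShacoBox"], [0, 1, 0, 0])
print(decode_with_string_table(packet))
```

In this toy example "Spell2" is stored once but referenced three times, which is the saving the comment describes.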

@tyscorp Welcome to the discussion :).

Divi commented 10 years ago

@Zero3 the process watcher launches the REST download/API when League of Legends.exe starts. But you might be right about the injector, and there are two different POV options (from spectator or player).

EDIT: @ryancole @themasch @robertabcd Based on your suggestion, @Zero3, I decompiled the LOLUtils.dll in LOL Recorder. You can find the decompiled project here: http://bit.ly/1mBsVpe and take a look at the very interesting file Stats\Decoding\PacketDecoder.cs. I think this file is used to decode packets FROM the game (and spectator packets are not the same), but I'm not good with C#, so I may be wrong.

themasch commented 10 years ago

@Divi that really looks like it decodes game traffic. When my downloader finally works with a nice API, I might try using some of this code to gather some data.

themasch commented 10 years ago

Okay, I can definitely confirm some ASCII strings, which basically means that decryption and unzipping seem to work ;) I found strings like these: LucianQDamage, Taunt_Thresh, leblancslidereturn, Spell2, ATTACK_PASSIVE, Jack In The Box, ShacoBox, CrestOfTheAncientGolemLines, TestCubeRender (dafuq?), GreatWraith13.1.1, GreatWraith14.1.1 and many many more. That was just a very fast and stupid "find patterns containing only printable ASCII with at least 5 bytes" test. I'm currently hacking on some more analysis stuff. I failed to find summoner IDs, though, neither in big endian nor in little endian. Looking for champion IDs doesn't make much sense, since these are so small that false positives would make up >95% of the hits.
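The "printable ASCII with at least 5 bytes" test is easy to reproduce. Here is a minimal Python sketch of that kind of scan; the sample bytes are invented, with two of the strings from this comment embedded in them, and the original tool (written in node.js) may work differently.

```python
import re

# Runs of 5 or more bytes in the printable ASCII range 0x20..0x7e.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{5,}")

def ascii_runs(data: bytes):
    """Return (offset, text) for every printable ASCII run of >= 5 bytes."""
    return [(m.start(), m.group().decode("ascii"))
            for m in PRINTABLE_RUN.finditer(data)]

sample = b"\x00\x01LucianQDamage\x00\xff\x10abc\x00Taunt_Thresh\x02"
print(ascii_runs(sample))
```

Short runs such as "abc" are dropped on purpose, since random binary data produces many 3- and 4-byte printable sequences by chance.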

Additionally, I couldn't find anything that looks like chat. Maybe I just checked the wrong chunks (my software currently only supports one file at a time; that will change), but it might also be because the encoding used in chat is incompatible with ASCII. So it might be UTF-16 instead of UTF-8. I will add a check for that later.
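A UTF-16 check of the kind mentioned here could look like the Python sketch below. It assumes little-endian UTF-16 and only looks for characters in the ASCII range, so each character appears as a printable byte followed by a zero byte; both assumptions are mine, and the sample data is invented.

```python
import re

# Printable ASCII characters stored as UTF-16LE: each byte is followed by 0x00.
UTF16LE_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){5,}")

def utf16le_runs(data: bytes):
    """Return (offset, text) for runs of >= 5 ASCII-range UTF-16LE characters."""
    return [(m.start(), m.group().decode("utf-16-le"))
            for m in UTF16LE_RUN.finditer(data)]

sample = b"\x07\x07" + "hello all".encode("utf-16-le") + b"\xff\xff"
print(utf16le_runs(sample))
```

A plain ASCII scan would miss this text entirely, because the interleaved zero bytes break every run after one character.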

Maybe I'll build some useful UI tomorrow so I can see the results of my algorithms in the context of the whole document. If someone is interested, I might provide some code for the downloader (node.js), too.

Divi commented 10 years ago

Thanks @themasch, that's great :) Chat is not recorded in the chunks; it is only broadcast/received by the client/server. LOLReplay uses a packet decoder to capture chat messages and sends them to the viewer when he watches the game.

Have you decrypted any useful information, like a destroyed turret or a kill?

jaagupkymmel commented 10 years ago

You could try searching for gold. You know how much gold people have at the start of the match, and if you open up the replay, you can just look at the amount of gold someone has at any point in time and search for that value. Gold amounts should take at least 2 bytes. You could also try searching for the total gold a team has, since this wouldn't always fit inside a 16-bit unsigned integer, but I'm not sure if this is actually sent, because of how easy it is to calculate client-side.
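A known-value search like this can be sketched in Python as follows. The set of encodings tried (16-bit and 32-bit unsigned integers in both byte orders, plus 32-bit floats as an extra guess) is an assumption, as is the sample data; nothing here says how gold is actually stored.

```python
import struct

def find_value(data: bytes, value: int):
    """Return {encoding name: [offsets]} for every encoding with a hit."""
    candidates = {
        "u16 little-endian": "<H", "u16 big-endian": ">H",
        "u32 little-endian": "<I", "u32 big-endian": ">I",
        "f32 little-endian": "<f", "f32 big-endian": ">f",
    }
    hits = {}
    for name, fmt in candidates.items():
        try:
            needle = struct.pack(fmt, value)
        except (struct.error, OverflowError):
            continue  # value does not fit this width
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits

# Invented sample: 475 stored once as a float and once as a 16-bit integer.
sample = b"\xaa" + struct.pack("<f", 475.0) + b"\xbb" + struct.pack("<H", 475)
print(find_value(sample, 475))
```

Searching for the same value at several points in time and intersecting the candidate offsets would help rule out coincidental matches.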

If you could upload your code and replay file, that would be nice!

themasch commented 10 years ago

I'm currently taking a "blind" approach because I don't have a LoL client on my laptop. I will be back on my machine on Sunday and will start viewing & downloading games so I can compare the data I recorded with events I've seen in game.

Current state: I'm close to being able to detect jungle paths from replay data ;)

ryancole commented 10 years ago

Wasn't the replay feature disabled on the PBE ?

Ryan


themasch commented 10 years ago

Don't know, never tested it. Using featured games spectator data currently.

tyscorp commented 10 years ago

All chat is definitely stored in the chunks in ASCII. (just checked)

Edit: I meant actual "All" chat. Team chat is not stored.

themasch commented 10 years ago

@tyscorp great, thanks. So I was just unlucky with picking the chunks.

tyscorp commented 10 years ago

If anyone needs some test data I have ~84k replays.

http://www.proreplays.net/

ghost commented 10 years ago

Hello guys

I've been following the discussion for a while, and I guess I could help a little bit :)

Last August, I tried to understand the structure of this data. I didn't have a lot of success, but here is what I found, and what I'm 100% sure about.

First, the difference between the keyframes and chunks files is the following :

Moreover, for those who are wondering how the data is encoded: values are encoded as floats (4 bytes). In keyframes (I focused on keyframe files), the stats of each hero are grouped together, but with some padding in between (I wasn't able to determine what causes the padding to sometimes change).

Here is the order I found:

    public float Health;
    public float Mana;
    public float MaximumHealth;
    public float MaximumMana;
    public float MoveSpeed;

    public float MagicResistance;
    public float Armor;
    public float RegenHealth;
    public float RegenMana;
    public float AttackRange;

    public float BaseAttackDamage;
    public float BonusAttackDamage;
    public float CritChance;

    public float LifeSteal;
    public float VampiricSpell;
    public float Tenacity;

    public float AttackSpeed;
    public float BonusAbilityPower;

(when there is a blank line, it means that there seem to be some padding which can change)
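As a reading aid, here is a small Python sketch of how one of these groups could be decoded once its offset is known. The field order is the one listed above; the little-endian byte order, the idea of reading each group from its own offset (because the padding between groups varies), and the sample values are all assumptions.

```python
import struct

# Field groups in the order listed above, split where padding was observed.
GROUPS = [
    ("Health", "Mana", "MaximumHealth", "MaximumMana", "MoveSpeed"),
    ("MagicResistance", "Armor", "RegenHealth", "RegenMana", "AttackRange"),
    ("BaseAttackDamage", "BonusAttackDamage", "CritChance"),
    ("LifeSteal", "VampiricSpell", "Tenacity"),
    ("AttackSpeed", "BonusAbilityPower"),
]

def read_group(data: bytes, offset: int, names):
    """Read len(names) consecutive little-endian 4-byte floats at offset."""
    values = struct.unpack_from("<%df" % len(names), data, offset)
    return dict(zip(names, values))

# Invented bytes: two junk bytes followed by the first group of five floats.
blob = b"\x00\x00" + struct.pack("<5f", 520.0, 300.0, 580.0, 300.0, 345.0)
print(read_group(blob, 2, GROUPS[0]))
```

Plausible value ranges (health in the hundreds, move speed around 300 to 400) are a useful sanity check when hunting for the right offset.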

I hope this helps

Have a nice day


Divi commented 10 years ago

@tyscorp interesting, thanks :) @themasch can you provide your code ?

lukegb commented 10 years ago

Ooh, this is starting to get interesting :D

themasch commented 10 years ago

@Divi I will provide the code. I could provide it now, but I feel like no one would be able to really use it because it's just an ugly pile of code that I hacked together fast. I'd really like to put some effort into it and make it kind of usable before I release it. Currently, it's nothing more than a simple node.js tool that grabs a number of files and runs a small "looks like ASCII" detection on them. It's also able to print out the files, encoded in hex, with the detected ASCII parts replaced so they are readable. I plan to add more detections and functions so that, with more and more progress, more and more of the document becomes readable. Maybe (quite likely) we'll see that this approach doesn't make sense at all for deciphering an unknown protocol. But maybe we'll get some information out of it. Who knows..
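The hex output with readable ASCII parts could be approximated by the Python sketch below. It is not the node.js tool described in this comment; the output format (hex runs and quoted strings separated by spaces) and the 5-byte threshold are my own choices for illustration.

```python
import re

# Runs of 5 or more bytes in the printable ASCII range 0x20..0x7e.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{5,}")

def annotated_hex(data: bytes) -> str:
    """Hex-encode data, but show printable runs of >= 5 bytes as quoted text."""
    parts = []
    pos = 0
    for m in PRINTABLE_RUN.finditer(data):
        if m.start() > pos:
            parts.append(data[pos:m.start()].hex())  # bytes before the run
        parts.append('"%s"' % m.group().decode("ascii"))
        pos = m.end()
    if pos < len(data):
        parts.append(data[pos:].hex())               # trailing bytes
    return " ".join(parts)

print(annotated_hex(b"\x01\x02ShacoBox\x00\x10\xffSpell2"))
```

Further detectors (floats, length prefixes) could be layered on in the same way, each replacing the hex for the bytes it recognizes.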

TL;DR: Will release, but don't want to release this mess..

Oh, and @trebonius2: wow. Great work!

themasch commented 10 years ago

Additionally: I just created an IRC channel on irc.freenode.com. It's ##loldev (yep, 2 #, freenode guidelines). Feel free to join and discuss!