TrinityCore / WowPacketParser

World of Warcraft Packet Parser
GNU General Public License v3.0
430 stars 356 forks source link

[Suggestion] sanitize personal info from packet parses #171

Open DDuarte opened 9 years ago

DDuarte commented 9 years ago

By @lfrost on http://www.trinitycore.org/f/topic/11075-suggestion-sanitize-personal-info-from-packet-parses/

Honestly, I don't even like the idea of submitting them on a write only forum. I don't like the idea of putting that information out there beyond my control and into the hands of people I don't know at all. I get that the forum is write only in order to prevent unsavory characters from having access to it. But what guarantee is there that the people who DO have access to it are not going to misuse it? What guarantee is there that some talented programmer from Blizzard hasn't wormed his way into the developing process of this open-source project?

My suggestion is that the code for the PacketParser be altered in such a way that it verifies that the data is genuine and then replaces personal (and completely irrelevant) information with garbled text, x's, or something else that cannot be traced back to a source. There is no reason that anyone developing this project could possibly have to need to know my server name, account name, or the names and levels of all my toons.

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/8803576-suggestion-sanitize-personal-info-from-packet-parses?utm_campaign=plugin&utm_content=tracker%2F457228&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F457228&utm_medium=issues&utm_source=github).
Shauren commented 9 years ago

This is completely pointless, that data is still in .pkt files. All WSTC does is write the data as a single binary blob (it will never attempt to parse packets to mess with data - it is a dumper after all)

danlapps commented 9 years ago

Aren't most of the submitted data straight from the sniffer ? Not many person submit already parsed packets. So although I like the idea I think it should be from the sniffer point of view if you want a personal security.

DDuarte commented 9 years ago

A possible way to do this is create a new DumpFormatType that takes the original .pkt and outputs a modified .pkt without sensible information. You would then be able to submit this modified .pkt to us.

'Sensible information' is slightly subjective so I need your help to classify and tag it:

Another issue is that it's hard to find the info to remove when we do not parse the packets. It's not a problem in 6.0.3 since we have nearly 100% of successful parses but it might be a problem in older or newer versions.

danlapps commented 9 years ago

Well older version requires less sniffing since there are no "retail" version for verified sniffs anyways. I would focus this dev on 6x and eventually port to older later if possible.

DDuarte commented 9 years ago

Before anyone tries to implement this do explain your idea first (I don't want anyone to do a 10k lines of code pull request for it to be denied.)

danlapps commented 9 years ago

I will write a more complete recap of what I think tonight when I get on either laptop or computer. Right now I'm handling kids and friends so not ideal for a clear explanation !

danlapps commented 9 years ago

So as I was saying, I think this is a great idea. However as you said, the submitted file would need to be another .pkt so you guys can parse it as you need. I think Nayd filtered much of the "personnal or sensible data", maybe some Guild info, like name and roster if they are in any packets, shooting in the dark here, not sure what is parsed. Those could be replaced with generic names "playername" or "homeip" or whatever.

As for the version, I dont really know how people could sniff data for either 3.3.5 or 4.3.4 from any "retail" source, unless some coutry are locked on those that im not aware. So working on a 6X version of that filterer would seem the most logical path for now, and as I said, eventually try and port it down to other version if needed.

Now im not sure if best way is to set it in the parser, or if it would be faster to build another tool, a filterer, that would just clean those data. Since im no way close to understanding most of that programmmation stuff, you would know what's easier for you.

IzFrost commented 9 years ago

I realized after my initial comment on the TC forums that the problem of sensitive data would need to be addressed in the sniffer and not the WPP. But DDuarte's idea of a .pkt filter to run sniffs through before submitting is not a half bad idea. That way users can feel a bit more secure about giving you usable information without unneeded personal info tacked on as well. I myself considered pre-parsing, then using filters in Notepad++ to manually sanitize, zipping and then submitting, but this seemed needlessly time-consuming, and I also did not want to run the risk of accidentally deleting or modifying information you might need.

DDuarte commented 9 years ago

Making the sniffer do the filtering or allowing submissions of modified .txts is not an option.

danlapps commented 9 years ago

Ok so that means either build a stand alone filterer that would clean the specified data, without needing any config, or add a dump option in the parser, not sure which is A) easier to code and B) easier to support

DDuarte commented 9 years ago

It's the same thing, I'm planning to split WPP in multiple smaller parts. Anyway, the filtering code would be done in this repository.

Filtering might not be the right word, I'm thinking about modifying.

Instead of removing the data, we rewrite it to something fictional. If the player name is "John" we write "Qwer" instead. Something random but it must not collide with anything else: if we rename "John" to "Qwer", "Qwer" can only appear where "John" was. Perhaps in strings we should keep the length of the new names the same because of all the bit fields with string length (although those could be changed as well).

I'm not sure yet how to mark/tag places where a packet field must be modified. If only there was Attributes for functions calls, i.e

packet.ReadInt32("SomethingMeaningless1");
[Privacy] packet.ReadString("PlayerName");
packet.ReadByte("SomethingMeaningless2");

Maybe the only way is to do more read overloads... any suggestion?

funjoker commented 4 years ago

Instead of removing it from parses, I add the possibility to sanitize the pkt file and dump it. The pkt won’t contain any personal data then (see sanitizing branch currently)

funjoker commented 4 years ago

if someone is willing:

sanitizing branch

check it out and test