Open DDuarte opened 9 years ago
This is completely pointless, that data is still in .pkt files. All WSTC does is write the data as a single binary blob (it will never attempt to parse packets to mess with data - it is a dumper after all)
Aren't most of the submitted data straight from the sniffer ? Not many person submit already parsed packets. So although I like the idea I think it should be from the sniffer point of view if you want a personal security.
A possible way to do this is create a new DumpFormatType that takes the original .pkt and outputs a modified .pkt without sensible information. You would then be able to submit this modified .pkt to us.
'Sensible information' is slightly subjective so I need your help to classify and tag it:
Another issue is that it's hard to find the info to remove when we do not parse the packets. It's not a problem in 6.0.3 since we have nearly 100% of successful parses but it might be a problem in older or newer versions.
Well older version requires less sniffing since there are no "retail" version for verified sniffs anyways. I would focus this dev on 6x and eventually port to older later if possible.
Before anyone tries to implement this do explain your idea first (I don't want anyone to do a 10k lines of code pull request for it to be denied.)
I will write a more complete recap of what I think tonight when I get on either laptop or computer. Right now I'm handling kids and friends so not ideal for a clear explanation !
So as I was saying, I think this is a great idea. However as you said, the submitted file would need to be another .pkt so you guys can parse it as you need. I think Nayd filtered much of the "personnal or sensible data", maybe some Guild info, like name and roster if they are in any packets, shooting in the dark here, not sure what is parsed. Those could be replaced with generic names "playername" or "homeip" or whatever.
As for the version, I dont really know how people could sniff data for either 3.3.5 or 4.3.4 from any "retail" source, unless some coutry are locked on those that im not aware. So working on a 6X version of that filterer would seem the most logical path for now, and as I said, eventually try and port it down to other version if needed.
Now im not sure if best way is to set it in the parser, or if it would be faster to build another tool, a filterer, that would just clean those data. Since im no way close to understanding most of that programmmation stuff, you would know what's easier for you.
I realized after my initial comment on the TC forums that the problem of sensitive data would need to be addressed in the sniffer and not the WPP. But DDuarte's idea of a .pkt filter to run sniffs through before submitting is not a half bad idea. That way users can feel a bit more secure about giving you usable information without unneeded personal info tacked on as well. I myself considered pre-parsing, then using filters in Notepad++ to manually sanitize, zipping and then submitting, but this seemed needlessly time-consuming, and I also did not want to run the risk of accidentally deleting or modifying information you might need.
Making the sniffer do the filtering or allowing submissions of modified .txts is not an option.
Ok so that means either build a stand alone filterer that would clean the specified data, without needing any config, or add a dump option in the parser, not sure which is A) easier to code and B) easier to support
It's the same thing, I'm planning to split WPP in multiple smaller parts. Anyway, the filtering code would be done in this repository.
Filtering might not be the right word, I'm thinking about modifying.
Instead of removing the data, we rewrite it to something fictional. If the player name is "John" we write "Qwer" instead. Something random but it must not collide with anything else: if we rename "John" to "Qwer", "Qwer" can only appear where "John" was. Perhaps in strings we should keep the length of the new names the same because of all the bit fields with string length (although those could be changed as well).
I'm not sure yet how to mark/tag places where a packet field must be modified. If only there was Attributes for functions calls, i.e
packet.ReadInt32("SomethingMeaningless1");
[Privacy] packet.ReadString("PlayerName");
packet.ReadByte("SomethingMeaningless2");
Maybe the only way is to do more read overloads... any suggestion?
Instead of removing it from parses, I add the possibility to sanitize the pkt file and dump it. The pkt won’t contain any personal data then (see sanitizing branch currently)
if someone is willing:
sanitizing branch
check it out and test
By @lfrost on http://www.trinitycore.org/f/topic/11075-suggestion-sanitize-personal-info-from-packet-parses/