vmonaco / kloak

Keystroke-level online anonymization kernel: obfuscates typing behavior at the device level.
BSD 3-Clause "New" or "Revised" License

kloak status #10

Open HulaHoopWhonix opened 6 years ago

HulaHoopWhonix commented 6 years ago

Hi, just pinging you about the status of this project. An eager user asked us about it :)

vmonaco commented 6 years ago

Glad there's still interest, I will start to make this a priority. I just moved across country and transitioned to a new position, so productivity has been low lately, but that will change soon. I'll try to reproduce #1 next week. Thanks for checking in.

HulaHoopWhonix commented 6 years ago

Awesome and congrats on your new job

adrelanos commented 6 years ago

There certainly is interest! This is one of my most awaited pieces of software. Looking forward to adding this to Whonix as soon as possible. An internet miscommunication, I guess: I refrain from bothering people I think are busy with other tasks, which then got interpreted as non-interest. Learned something. :)

vmonaco commented 6 years ago

It's no bother at all :) Sorry to have put this off for so long. I do have big plans in terms of obfuscating user behavior, not just keystroke biometrics. Also looking at how to obfuscate actions, e.g., temporal keylogging (see my '18 S&P paper). I'll have more bandwidth for this stuff in the new job.

HulaHoopWhonix commented 6 years ago

The most comprehensive paper on the topic I've ever seen. Impressive and made a great read.

PowerPress commented 6 years ago

Really looking forward to all your enhancements!

vmonaco commented 5 years ago

Sorry again for the hiatus. I've been working on this over the past couple weeks and recently made a number of improvements:

Note that these changes still don't exclude the possibility of distinguishing kloak users from non-kloak users. I'm currently working on this. I think a potential solution is to spoof another user or operating system instead of obfuscating. More info to follow.

HulaHoopWhonix commented 5 years ago

Thanks for the update. Ready when you are :-) We are still very eager about Kloak and the need that it fills.

PowerPress commented 5 years ago

Curious how you would spoof an OS? Do you mean the network stack?

vmonaco commented 5 years ago

Should be good to go for testing. I updated the docs and just uploaded a deb to releases.

I tested on the latest Whonix in virtualbox and Ubuntu on bare metal with no issues.

The changelog still needs to be updated (and the package version?). I'm not entirely sure about the format of these files, so I'll refer to @adrelanos.

As for OS spoofing: I have some work in progress that indicates OS family and version can be determined from key event timings. This has to do with the global system clock (if any is used) and the way the scheduler handles interrupts. So, a website could fingerprint your host based on DOM input event timestamps, which defeats other methods of obfuscation, such as spoofing user agent string. I plan to address this issue in a future release of kloak after fleshing out the attack.
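To make the timing concern a little more concrete, here is a minimal sketch of one way clock behavior could show up in event timestamps. It is not the attack from the work in progress, which this thread does not describe. It assumes, purely for illustration, that a platform reports integer timestamps quantized to a fixed tick, and the function name `estimate_tick_us` and the sample timestamps are made up.

```python
from functools import reduce
from math import gcd


def estimate_tick_us(timestamps_us):
    """Largest tick (in microseconds) that divides every gap between timestamps.

    Assumption for illustration only: timestamps are integers and are
    quantized to a fixed clock tick. With few samples the result can be a
    multiple of the true tick.
    """
    gaps = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    return reduce(gcd, gaps, 0)


# Made-up key event timestamps from two hypothetical hosts.
coarse = [0, 8000, 20000, 36000]  # every gap is a multiple of 4000 us
fine = [0, 3000, 10000, 11000]    # every gap is a multiple of 1000 us

print(estimate_tick_us(coarse), estimate_tick_us(fine))  # prints "4000 1000"
```

Two hosts with different ticks would be told apart by this single number, no matter what the user agent string claims.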

adrelanos commented 5 years ago

The changelog still needs to be updated (and the package version?). I'm not entirely sure about the format of these files, so I'll refer to @adrelanos.

make uch (upstream changelog), which basically just does git log > changelog.upstream.

Packages without an upstream changelog cause a lintian --pedantic warning. I just implemented changelog.upstream for perfection's sake, to eliminate this last lintian warning with a reasonable, acceptable implementation. Since Debian packages upstreams, I don't think there is a standard or convention for upstream changelogs. Unless you'd like to provide a hand-typed changelog.upstream (or one from some sort of fancy git log or similar command), I think git log is a good stopgap.

(Btw, later for the next version: make deb-uachl-bumpup-major to bump debian/changelog, which is also just a shortcut to call debchange.)

adrelanos commented 5 years ago

https://twitter.com/Whonix/status/1111053743905226752

HulaHoopWhonix commented 5 years ago

Package installed and service ran without a hitch. Trained Kloak / authenticated Kloak gives accuracy of 42%! :-D

I'll be soon posting results from another team member for the 2 other tests with normal training which I avoided for anonymity reasons.

Thanks for the incredible effort and dedication Vinnie.

vmonaco commented 5 years ago

@adrelanos Thanks, I should have known to check the other genmkfile targets.

@HulaHoopWhonix Glad it works, and sorry I didn't get to this sooner! More updates to come this year as I work on a method to obfuscate mouse behavior.

HulaHoopWhonix commented 5 years ago

@vmonaco Thanks again, can't wait to see the great things in store. :)

OK here's the results. He did three trials for each scenario for confirmation.

train normal / auth normal

trial 1: 94% trial 2: 92% trial 3: 94%

train normal / auth kloak

trial 1: 18% trial 2: 15% trial 3: 19%

train kloak / auth kloak

trial 1: 40% trial 2: 42% trial 3: 36%

vmonaco commented 5 years ago

Nice results.

I suspect that while kloak definitely obfuscates typing behavior, making it difficult to authenticate or identify a particular user, users running kloak may look "similar" to other users running kloak. That is, it might be possible to identify kloak users from non-kloak users. If this is the case, the anonymity set will increase as more users start running kloak.

adrelanos commented 5 years ago

kloak - Keystroke-level online anonymization kernel: obfuscates typing behavior at the device level - Testers Wanted!

https://forums.whonix.org/t/kloak-keystroke-level-online-anonymization-kernel-obfuscates-typing-behavior-at-the-device-level-testers-wanted/7089

https://twitter.com/Whonix/status/1113071411025928192

https://www.facebook.com/Whonix/photos/a.1138314816210772/2618285614880344

HulaHoopWhonix commented 5 years ago

With kloak running concurrently on both the host and VM I get an even better result of <10% accuracy with train kloak / auth kloak (when testing from within VM).

Testing longer paragraphs yields same results.

HulaHoopWhonix commented 5 years ago

keytrac.net recently switched to longer text paragraphs for authentication only. Here are the test results from someone in the Whonix team. @vmonaco what do you make of the accuracy level of "train kloak/test kloak"?

Train normal, test kloak

Test 1: 6% accuracy Test 2: 8% Test 3: 12%

Train kloak, test kloak

Test 1: 75% accuracy Test 2: 70% Test 3: 73%

Train normal, test normal

Test 1: 98% accuracy Test 2: 96% Test 3: 96%
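For easier comparison, the two sets of trials reported in this thread can be averaged per scenario. The snippet below only does that arithmetic on the numbers posted here. The labels `earlier_run` and `later_run` are just names chosen for this sketch, and the two runs came from different testers, so the comparison is indicative at best.

```python
from statistics import mean

# Accuracy percentages as reported in this thread (three trials each).
earlier_run = {
    "train normal / test normal": [94, 92, 94],
    "train normal / test kloak": [18, 15, 19],
    "train kloak / test kloak": [40, 42, 36],
}
later_run = {  # after keytrac switched to longer paragraphs
    "train normal / test normal": [98, 96, 96],
    "train normal / test kloak": [6, 8, 12],
    "train kloak / test kloak": [75, 70, 73],
}


def summarize(results):
    """Return the mean accuracy per scenario, rounded to one decimal."""
    return {name: round(mean(trials), 1) for name, trials in results.items()}


earlier_means = summarize(earlier_run)
later_means = summarize(later_run)

for name in earlier_run:
    print(f"{name}: earlier {earlier_means[name]}%, later {later_means[name]}%")
```

The train kloak / test kloak mean rises from 39.3% to 72.7% between the two runs, while train normal / test kloak drops from 17.3% to 8.7%.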

vmonaco commented 5 years ago

Thanks for letting me know. It looks like keytrac also underwent some rebranding since I last checked.

Re. the result above, the relatively high accuracy in the train kloak/test kloak scenario (compared to others that have tested) highlights one of the current limitations of kloak: it obfuscates your actual typing behavior (achieves low accuracy in the train normal/test kloak scenario), but does not attempt to make two different kloak sessions from the same user look different. Using kloak, typing behavior starts to look more like white noise. But it does this for everyone, so two different kloak users will both start to look like this white noise. That is, as more people use kloak, the size of the anonymity set will grow.

So currently, kloak tries to: obfuscate your own behavior, and make everyone look similar (can't differentiate between kloak users). I have some ideas for other privacy objectives, such as obfuscating your own behavior and making everyone look different. This could be done by spoofing another (made up) identity, which would make it difficult to detect kloak vs non-kloak users. This spoofed identity could change over time, say at each login.

With that said, the default max delay of 100 ms might not be the best option for everyone. This really depends on typing speed - slower typists should use a larger max delay. A to-do item is to dynamically adjust the max delay to typing speed.
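To illustrate what the max delay controls, here is a minimal simulation sketch. It assumes each event gets a uniformly random delay of at most `max_delay_ms` and is never released before the previously released event, so the order of events is kept. It is not kloak's actual C implementation, and the function name `obfuscate` and the sample timings are made up.

```python
import random


def obfuscate(event_times_ms, max_delay_ms=100, rng=None):
    """Delay each event by a random amount without reordering events.

    event_times_ms must be sorted in ascending order.
    """
    rng = rng or random.Random()
    released = []
    prev_release = 0.0
    for t in event_times_ms:
        # Earliest allowed release: not before the event itself and not
        # before the previously released event (keeps the original order).
        lower = max(t, prev_release)
        upper = t + max_delay_ms
        release = rng.uniform(lower, upper)
        released.append(release)
        prev_release = release
    return released


typed = [0, 120, 250, 330, 600]  # made-up key event times in ms
delayed = obfuscate(typed, max_delay_ms=100, rng=random.Random(0))
print([round(d - t, 1) for t, d in zip(typed, delayed)])  # per-event delay in ms
```

With gaps between keys that are much longer than the max delay, the added jitter is small relative to the typist's own rhythm, which is why slower typists would need a larger value.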

Edit: can differentiate -> can't differentiate

adrelanos commented 5 years ago

Tor Project gave up on making users appear different across different sessions. Instead, they attempt to put all Tor Browser users into the same anonymity set. (Or multiple sets according to security slider settings.) Dunno if this would apply here too.

Everyone looking the same, everyone "looking kloak", might be sufficient. Something better than that, e.g. a made up identity, may not be possible or worth it?

With that said, the default max delay of 100 ms might not be the best option for everyone. This really depends on typing speed - slower typists should use a larger max delay. A to-do item is to dynamically adjust the max delay to typing speed.

That sounds great! Perhaps 3-4 (as much as needed) anonymity sets for different speeds of typists?
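A rough sketch of how a few fixed sets could work: measure the typical gap between key presses and map it to one of a small number of max delay values, so that everyone in a tier shares the same setting. Only the 100 ms default comes from this thread. The tier boundaries, the 200 ms and 400 ms values, and the name `pick_max_delay` are made up for illustration.

```python
from statistics import median

# Hypothetical tiers: (upper bound on the median inter-key interval in ms,
# max delay in ms). Only the 100 ms default is taken from this thread.
TIERS = [(150, 100), (300, 200), (float("inf"), 400)]


def pick_max_delay(key_times_ms):
    """Map a typist's median inter-key interval to one of a few fixed max delays."""
    intervals = [b - a for a, b in zip(key_times_ms, key_times_ms[1:])]
    if not intervals:
        return TIERS[0][1]  # not enough data: fall back to the default
    typical = median(intervals)
    for bound, delay in TIERS:
        if typical <= bound:
            return delay


print(pick_max_delay([0, 100, 200, 300]))  # fast typist
print(pick_max_delay([0, 500, 1000]))      # slow typist
```

Keeping the number of tiers small matters here, since every extra tier splits the users into smaller anonymity sets.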

HulaHoopWhonix commented 5 years ago

@adrelanos If everybody looks kloak, but uniquely differ from each other and their style with kloak is the same across all sessions, you would have pseudonymous typing patterns. If a user types with kloak once non-anonymously, an adversary with stored patterns can go back and link all texts typed by the same person.

adrelanos commented 5 years ago

If everybody looks kloak, but uniquely differ from each other and their style with kloak is the same across all sessions, you would have pseudonymous typing patterns.

That would be bad indeed.

That's btw not what I meant, I think. What I meant to say is "If everyone is looking the same, if everyone is looking kloak without a uniquely identifiable pseudonym, then that's not that bad." That's what we are used to with Tor Browser too.

But I indeed overlooked something important here. What if someone uses kloak non-anonymously first and anonymously later (or vice versa). Since the number of kloak users will be initially low, data harvesters could just guess (give a probability) that it's the same person. Could we derive a recommendation "don't ever use kloak non-anonymously, only use kloak anonymously" from that?
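A back-of-the-envelope way to put a number on that guess, under strong assumptions: all kloak users look identical to each other, they are distinguishable from non-kloak users, and the data harvester has no other information. Then the chance that two kloak sessions belong to the same person is simply one over the number of kloak users. The function name `linking_odds` is made up for this sketch.

```python
import math


def linking_odds(kloak_users):
    """Return (chance of a correct uniform guess, anonymity in bits).

    Assumes kloak_users indistinguishable users and no side information.
    """
    if kloak_users < 1:
        raise ValueError("need at least one user")
    return 1 / kloak_users, math.log2(kloak_users)


for n in (8, 1024, 1_000_000):
    chance, bits = linking_odds(n)
    print(f"{n} users: guess is right with probability {chance:.6f} ({bits:.1f} bits)")
```

With only a handful of users the guess is right often enough to be useful to a data harvester, which is the concern while the user base is small.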

In this context...

This could be done by spoofing another (made up) identity, which would make it difficult to detect kloak vs non-kloak users.

In this context I somehow doubt that's possible. Suppose someone is using a standard browser like most people are doing nowadays and is completely tracked by cookies (or similar tracking technology for the sake of argument). If the typing fingerprint changes all the time to another made up identity, then that is quite unlikely from the perspective of the data harvester and the data harvester would more likely conclude "user of kloak".

This spoofed identity could change over time, say at each login.

If going for this: why not change the spoofed identity all the time, why only at some to be specified trigger (such as login)?

vmonaco commented 5 years ago

That sounds great! Perhaps 3-4 (as much as needed) anonymity sets for different speeds of typists?

That's btw not what I meant, I think. What I meant to say is "If everyone is looking the same, if everyone is looking kloak without a uniquely identifiable pseudonym, then that's not that bad." That's what we are used to with Tor Browser too.

Yes, I think that's a good approach. Presumably, most kloak users are Tor users, so having a similar anonymity model makes sense. In this case, the "slider" would be the max delay setting, choosing a higher value to be part of a stronger anonymity set.

But I indeed overlooked something important here. What if someone uses kloak non-anonymously first and anonymously later (or vice versa). Since the number of kloak users will be initially low, data harvesters could just guess (give a probability) that it's the same person. Could we derive a recommendation "don't ever use kloak non-anonymously, only use kloak anonymously" from that?

Since kloak tries to give everyone the "same pseudonym", this is certainly a concern. This is one motivation for making that pseudonym a moving target. With a low number of kloak users, it's also a concern that the small anonymity set enables tracking kloak users, assuming it's easy to identify kloak vs non-kloak users (which I think it is). But from your comments above, placing all users in the same anonymity set and making the recommendation to only use kloak anonymously seems like a good compromise.

In this context I somehow doubt that's possible. Suppose someone is using a standard browser like most people are doing nowadays and is completely tracked by cookies (or similar tracking technology for the sake of argument). If the typing fingerprint changes all the time to another made up identity, then that is quite unlikely from the perspective of the data harvester and the data harvester would more likely conclude "user of kloak".

If going for this: why not change the spoofed identity all the time, why only at some to be specified trigger (such as login)?

Yep, the change could be continuous. But probably with the same caveat you point out above (an indication of someone using the tool).

Thinking generally: ideally, a behavior obfuscation tool like kloak would make it difficult to differentiate between:

  1. My obfuscated self and my true self
  2. My obfuscated self in two different sessions
  3. Two different obfuscated users
  4. Tool users and non-users

The challenge is, some of these objectives are competing (potentially 3 and 4) and make other assumptions (if 4 can't be achieved, 3 assumes all users agree to be in the same anonymity set).

There are some other recent works on obfuscating behavior, like authorship (https://www.usenix.org/conference/usenixsecurity18/presentation/shetty). Compared to anonymizing networks, I don't think we yet have a good framework for reasoning about these methods (I'm working on this!).

vmonaco commented 4 years ago

FYI, here's another use case for kloak: info leaked through network traffic induced by keystrokes.

adrelanos commented 4 years ago

Happy to announce that kloak is installed by default for all users of Non-Qubes-Whonix.

(Qubes-Whonix issue: https://github.com/QubesOS/qubes-issues/issues/2558)

Documented here: https://www.whonix.org/wiki/Keystroke_Deanonymization

PowerPress commented 4 years ago

Is there any chance the Qubes version will get support for this awesome tool? Definitely would be useful for Tails as well.



adrelanos commented 1 year ago

The ticket for Qubes-Whonix is here:

Is Kloak available for Qubes-Whonix?

siliconwaffle commented 5 months ago

Has kloak been abandoned? I noticed @vmonaco has been pretty much completely inactive from Github for ~5 months now, and this repo hasn't been touched for ~6 months. @adrelanos are you or someone else with Whonix still maintaining kloak independent of @vmonaco? I want to submit a kloak package for Fedora, should I just track Whonix's kloak repo instead of this one?

adrelanos commented 5 months ago

@adrelanos are you or someone else with Whonix still maintaining kloak independent of @vmonaco?

No. We need an active upstream to review C(++) code or make a review in Rust or Python or something.

siliconwaffle commented 3 months ago

No. We need an active upstream to review C(++) code or make a review in Rust or Python or something.

As much as I hate to say it, Kloak does appear abandoned. What does this mean for Whonix and Kloak's inclusion in Whonix? Will it be replaced by a fork or a different project, kept in an unmaintained state, or eventually removed entirely?

vmonaco commented 3 months ago

hey folks, sorry for the hiatus, haven't had much bandwidth recently to work on kloak. I'll carve out some cycles within the next week or so to catch up on PRs!

siliconwaffle commented 3 months ago

hey folks, sorry for the hiatus, haven't had much bandwidth recently to work on kloak. I'll carve out some cycles within the next week or so to catch up on PRs!

Glad to have you back. I want to let you know I have packaged Kloak as an rpm and I intend to contribute it to Fedora within the next couple of weeks.