dcposch / scramble

Secure email for everyone
http://dcposch.github.io/scramble/
226 stars 32 forks source link

Use Emscripten to replace OpenPGP.js? #25

Closed dcposch closed 10 years ago

dcposch commented 11 years ago

Decryption with OpenPGP.js is slow. Running single threaded (ie, no Web Workers) on a Core i7, it takes ~60 milliseconds to decrypt a single subject. For a large inbox this leads to unacceptable lag.

OpenPGP.js has other aspects that make me question its quality. For example, attempting decryption with an invalid key results in a cryptic unrelated error message. Few comments. Blocks of commented out code and commented out debugging print calls.

If possible, it would be nice to use a more widely-used OpenPGP implementation such as GPG (libgcrypt) via Emscripten. This may be safer and also faster.

maxtaco commented 11 years ago

Hi, I'm really curious, does this approach seem promising?

dcposch commented 11 years ago

Won't know until we try it!

The main obstacle is that OpenPGP is a command line tool, not as library with a clean interface. There are libraries like libgcrypt that are part of it but those are lower level.

Thoughts?

On Thu, Sep 26, 2013 at 8:28 AM, Maxwell Krohn notifications@github.comwrote:

Hi, I'm really curious, does this approach seem promising?

— Reply to this email directly or view it on GitHubhttps://github.com/dcposch/scramble/issues/25#issuecomment-25177209 .

maxtaco commented 11 years ago

Right, I was thinking the same thing. The annoying thing is that libgcrypt is too low-level to really care about. There are other JS libraries of reasonable quality to provide the same features (hash algos, public key algos, ciphers, etc). The good stuff (parsing the different message formats, etc) seems pretty bound up in the gnupgp client, as you say.

jaekwon commented 11 years ago

It's hard to compile any openpgp implementations in Emscripten because most of them require openssl or whatnot, and that gets pretty complicated (for me at least).

Most of the slowness is in the RSA decryption to get the session key. The AES that follows is really fast, so if we just replace the RSA decryption component of openpgp.js, it'll solve the performance problem. Specifically, the line var xp = m.mod(p).modPow(d.mod(p.subtract(BigInteger.ONE)), p); is slow because modPow() is slow.

I tried compiling http://acme.com/software/bigint/ via Emscripten to test out its mod-power function. Here's how you do it: https://gist.github.com/jaekwon/6881680. Unfortunately, the results aren't pretty. The compiled mod-power function is significantly slower than the one included in openpgp.js. Perhaps I'll get better results if I use other/montexpo.c instead of the default mod-exp function... I may or may not try this myself.

TLDR; I'm not so sure that Emscripten will fix our decryption performance woes. We may want to consider alternatives like NativeClient or reducing the RSA size to 1024 for subject lines.

other links

maxtaco commented 11 years ago

@jaekwon have you tried changing

BigInteger.prototype.am = am3;
dbits = 28;

in jsbn.js? By default, you wind up with the 26-bit representation. On my machine/browser at least the 28-bit representation is faster for modular exponentiations.

jaekwon commented 11 years ago

@maxtaco It already figures out to use am3 for my computer.

jaekwon commented 11 years ago

This is odd.

http://jsperf.com/native-vs-typed-js-array-speed/29

Why is Chrome29 so much slower than Chrome28?

http://jsperf.com/native-vs-typed-js-array-speed/38

Why is Firefox so much faster than Chrome(?!) Does this have anything to do with asm.js? Hmm.

More links:

jaekwon commented 10 years ago

I was wrong to test emscripten without comparing with a native run. Turns out the algorithm I ran was just slow. I'll give another shot with libgcrypt.

https://github.com/mnaamani/crypto-emscripten

jaekwon commented 10 years ago

More learnings:

I think a lot of these super-fast modular-exponentiation libraries are fast because of fine tuned assembly code or maybe even CPU acceleration. (like Intel's IPP).

openssl speed rsa2048 shows that it performs 628.4 signs per second (which is equivalent to our decryption) on my machine, so that's about 1.6 ms as compared to openpgp.js's 60ms. Python's own non-accelerated bigint library is actually comparable to the speed we get from openpgp.js, though if you use gmpy (libgmp) which definitely has assembly code optimizations and probably utilizes accelerated instructions on my iMac intel processor, you get the same 1.6 ms speed.

Python's modular pow() function for RSA2048 uses HAC Algorithm 14.82, which may not be the fastest one, but probably isn't going to improve by an order of magnitude.

TLDR; openpgp.js performance probably can't be improved by switching over to another implementation.

maxtaco commented 10 years ago

One other idea for OpenPGP.js is to use CRT for signatures. It's already being used for decryptions.

jaekwon commented 10 years ago

I don't know much about RSA-CRT. Does OpenPGP.js support it? Where is it being used for decryptions?

Also I should note that the listing the subject lines in Scramble.io doesn't require signature checks, although it does once you click through to see the message body.

maxtaco commented 10 years ago

CRT is short for "Chinese Remainder Theorem." If one knows p and q, one can compute y^d modulo p and modulo q, and then combine the results with CRT. It yields the same answer as the naive approach but is much faster since the sizes of p and q are half that of n. Check out rsa.js in OpenPGP.js. It's already being done for decrypt but not for sign.

hiddentao commented 10 years ago

Here is a full port of the OpenPGP command-line tool using Emscripten - https://github.com/manuels/unix-toolbox.js-gnupg. Total size of the ported, minified JS is >3MB !

dcposch commented 10 years ago

Cool! We'll check it out.

On Sat, Nov 2, 2013 at 6:58 PM, Ramesh Nair notifications@github.comwrote:

Here is a full port of the OpenPGP command-line tool using Emscripten - https://github.com/manuels/unix-toolbox.js-gnupg. Total size of the ported, minified JS is >3MB !

— Reply to this email directly or view it on GitHubhttps://github.com/dcposch/scramble/issues/25#issuecomment-27637116 .

hiddentao commented 10 years ago

@dcposch It loads the entire ported script into a Web worker, which is nice. Also, there is an online demo at: http://manuels.github.io/unix-toolbox.js-gnupg/

nick7ikin commented 10 years ago

Have you tried demo of gnupg from unix-toolbox? Keys generation doesn't work for me:

$ gpg2 --lock-never --batch --gen-key /input.txt
note: random_seed file is empty
Out of entropy!
Fatal: read error on random device: Input/output error

I've tried to recompile that project but also failed, have you managed to ?

hiddentao commented 10 years ago

You need to create a file at /dev/egd-pool inside the virtual filesystem used by the emscripten worker and then put some random data (say 4KB?) into it - that will ensure entropy is always available. I hope to soon put together a stand-alone version of the GPG web worker for general use.

nick7ikin commented 10 years ago

I see, but this file /dev/egd-pool is created automatically in demo.js:

  var rand = new Uint8Array(4096);
  window.crypto.getRandomValues(rand);
  gpg.addData(rand, '/dev/egd-pool');

The issue occurs when reading this file in gpg2-worker.js:

        var rand_contents = FS.root.contents['dev'].contents['egd-pool'].contents;
        var rand = new Uint8Array(rand_contents);

in this code rand_contents - isn't an array - it is an object with [1], [2] members (like arguments variable in function). So new operator return undefined

hiddentao commented 10 years ago

Well I was able to get it working on mine - have you checked the output of the window.crypto.getRandomValues() call?

hiddentao commented 10 years ago

Another thing to note is that each time you make a call to the worker you need to reinitialise the file system. Which means after each call you should save the contents of whatever files you want to save into memory so that you can then re-create the filesystem as it was, ready for your next call. The online demo does this.

nick7ikin commented 10 years ago

Yes, window.crypto.getRandomValues() returns Uint8Array with random values. Actually i fixed this error on my local copy of demo, but got another error :). It seems that demo may work differently in different browsers (i'm using chrome Version 33.0.1750.149).

What do you think about performance of this solution? Is it nice comparing to native library performance? I'm asking cause some guys in this thread were worrying about this approach.

hiddentao commented 10 years ago

Actually I prefer this to OpenPGP.js for now because it's more feature complete - GPG2 is a mature library and tool that has been under development for years. Plus running it in a web worker makes more sense in my opinion as it's just a bunch of maths calculations which don't need access to the network or the DOM.

However, the worker does weigh in at 3MB, which is a big hit on your initial page download time. I haven't tested its speed yet against native GPG2 as I haven't yet reached the encryption part of things. I think it should run faster in Firefox than in Chrome because it uses asm.js.

nick7ikin commented 10 years ago

thank you for the feedback. I also think that emscripten approach is quite promising, in spite of lots of issues I've got trying to recompile GnuPG. Actually I still failed to compile GnuPG to *.js by myself. I think it's because of lack of experience in cross-compilation.

dcposch commented 10 years ago

Yeah, emulating an entire shell with gnupg in it seems super hacky.

My new thought is: start with OpenPGP.js or a cleaner fork of OpenPGPjs like kbpgp. Find only the slow parts. Use asm.js over an single C function (eg for modular exponentiation or something like that) to speed up just that part. Leave everything else (armoring, dearmoring, etc) in JS.

mnaamani commented 9 years ago

I worked on compiling libgcrypt and libotr in emscripten overriding some compiled code with bigint.js to speed things up, maybe someone will find it useful:

https://github.com/mnaamani/crypto-emscripten