laurent22 / joplin

Joplin - the privacy-focused note taking app with sync capabilities for Windows, macOS, Linux, Android and iOS.
https://joplinapp.org

Investigate why decryption is slower #8619

Open laurent22 opened 1 year ago

laurent22 commented 1 year ago

It seems to be slower when decrypting on mobile (tested on Android), maybe due to the recent change to AES-256.

We probably need to test with resources of various sizes on the Android emulator and see how long each takes to decrypt. Changing the chunk size might help:

https://github.com/laurent22/joplin/blob/16d8a78d8a614a07886a99f696a6efaa76435f01/packages/lib/services/e2ee/EncryptionService.ts#L58

Ref: https://discourse.joplinapp.org/t/decryption-is-so-slow-on-android-2-minutes-per-1mb-note-that-it-makes-e2ee-unusable/32072/1
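
To make the chunk size idea concrete, here is a minimal sketch of how a chunk size determines the number of pieces a string is split into. `splitIntoChunks` is a hypothetical helper written for illustration, not Joplin's actual implementation. Whether a fixed per-chunk cost is what dominates the decryption time is an assumption that the tests would need to confirm.

```typescript
// Hypothetical helper (not taken from EncryptionService.ts): split a string
// into consecutive pieces of at most `chunkSize` characters.
const splitIntoChunks = (data: string, chunkSize: number): string[] => {
	if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
		throw new Error('chunkSize must be a positive integer');
	}

	const chunks: string[] = [];
	for (let offset = 0; offset < data.length; offset += chunkSize) {
		chunks.push(data.substring(offset, offset + chunkSize));
	}
	return chunks;
};

// A 1 MiB ASCII string, the same size as the note in the forum report.
const sample = 'a'.repeat(1024 * 1024);

for (const chunkSize of [1000, 5000, 40000]) {
	const chunks = splitIntoChunks(sample, chunkSize);
	console.log(`chunk size ${chunkSize}: ${chunks.length} chunks`);
}
```

If each chunk carries some fixed overhead, fewer and larger chunks mean that overhead is paid less often, at the price of holding more data in memory at once.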

personalizedrefrigerator commented 1 year ago

Here's what I've found so far:

wh201906 commented 1 year ago

> If we want to switch to a native implementation...

How about libsodium?

personalizedrefrigerator commented 1 year ago

> How about libsodium?

I think libsodium would also require a switch to AES GCM: https://github.com/jedisct1/libsodium/issues/739

wh201906 commented 1 year ago

> sjcl.decrypt accepts a JSON string for cyphertext and sjcl may be internally creating an ArrayBuffer.

I skimmed the code and it looks like the files are encoded as base64 string first, then encrypted. I guess the conversion to base64 might slow down the process, and take extra storage.
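
If files are indeed base64-encoded before encryption, as guessed above, the size overhead of that step can be estimated with a short sketch. `base64EncodedLength` is a hypothetical helper, and the calculation only covers standard padded base64; it says nothing about how much time the conversion takes.

```typescript
// Length in characters of the padded base64 encoding of `byteLength` raw bytes.
// Every group of up to 3 input bytes becomes 4 output characters.
const base64EncodedLength = (byteLength: number): number => {
	return Math.ceil(byteLength / 3) * 4;
};

const oneMiB = 1024 * 1024;
const encodedLength = base64EncodedLength(oneMiB);

// Ratio of encoded size to raw size, roughly 4/3.
const overhead = encodedLength / oneMiB;
console.log(`${oneMiB} bytes -> ${encodedLength} base64 characters (x${overhead.toFixed(3)})`);
```

Under this assumption, a 1 MiB resource grows by about a third before it is even encrypted, so the cipher has proportionally more data to process.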

laurent22 commented 1 year ago

Switching to a different lib would be for a future version since it's likely to be a lot of work.

For 2.12 either we find a way to make it fast again using sjcl, or we go back to AES-128. The current implementation may be more secure but unfortunately it's not usable.

Does changing the chunk size help?

personalizedrefrigerator commented 1 year ago

It looks like larger chunk sizes generally result in faster decryption. Tests were run on an Android x86_64 API 33 (Android 14) emulator with 4 GiB of RAM. Encrypted data was 1 MiB and varied between randomly selected ASCII and unicode characters.

Averaging all trials (including the ASCII-only trials), SJCL1a is about 1.2× faster than SJCL1b. Similarly, a chunk size of 40,000 seems to be roughly 1.18× faster than a chunk size of 5,000.

Unicode tests only: [Figure: Effect of chunk size on decryption time (unicode)] (Trial 1 and trial 2 were performed with SJCL1b.)

Unicode and ASCII tests combined: [Figure: Effect of chunk size on decryption time]

Data https://docs.google.com/spreadsheets/d/1L_VNLFptuM8siPyj1QbsJRNb9ycGKliTCvYAvwzsgCw/edit?usp=sharing [Joplin Encryption Performance - Sheet2.csv](https://github.com/laurent22/joplin/files/12304441/Joplin.Encryption.Performance.-.Sheet2.csv)
Patch used to generate the above data

```diff
diff --git a/packages/app-mobile/components/screens/encryption-config.tsx b/packages/app-mobile/components/screens/encryption-config.tsx
index 7c9e4603a..fa886fd9e 100644
--- a/packages/app-mobile/components/screens/encryption-config.tsx
+++ b/packages/app-mobile/components/screens/encryption-config.tsx
@@ -5,15 +5,15 @@ import ScreenHeader from '../ScreenHeader';
 const { themeStyle } = require('../global-style.js');
 const DialogBox = require('react-native-dialogbox').default;
 const { dialogs } = require('../../utils/dialogs.js');
-import EncryptionService from '@joplin/lib/services/e2ee/EncryptionService';
+import EncryptionService, { EncryptionMethod } from '@joplin/lib/services/e2ee/EncryptionService';
 import { _ } from '@joplin/lib/locale';
 import time from '@joplin/lib/time';
 import { decryptedStatText, enableEncryptionConfirmationMessages, onSavePasswordClick, useInputMasterPassword, useInputPasswords, usePasswordChecker, useStats } from '@joplin/lib/components/EncryptionConfigScreen/utils';
 import { MasterKeyEntity } from '@joplin/lib/services/e2ee/types';
 import { State } from '@joplin/lib/reducer';
 import { SyncInfo } from '@joplin/lib/services/synchronizer/syncInfoUtils';
-import { getDefaultMasterKey, setupAndDisableEncryption, toggleAndSetupEncryption } from '@joplin/lib/services/e2ee/utils';
-import { useMemo, useRef, useState } from 'react';
+import { getDefaultMasterKey, loadMasterKeysFromSettings, setupAndDisableEncryption, toggleAndSetupEncryption } from '@joplin/lib/services/e2ee/utils';
+import { useCallback, useMemo, useRef, useState } from 'react';
 
 interface Props {
 	themeId: any;
@@ -282,10 +282,96 @@ const EncryptionConfigScreen = (props: Props) => {
 		) : null;
 
+	const [status, setStatus] = useState('Test');
+	const onTestEncryptionSpeed = useCallback(async () => {
+		const makeSourceData = (size: number, minCharCode: number, maxCharCode: number) => {
+			const result = [];
+
+			for (let i = 0; i < size; i ++) {
+				const charCode = Math.round(Math.random() * (maxCharCode - minCharCode)) + minCharCode;
+				result.push(String.fromCodePoint(charCode));
+			}
+
+			// Return the first size characters -- fromCodePoint can return multiple characters.
+			return result.join('').substring(0, size);
+		};
+		// See https://en.wikipedia.org/wiki/List_of_Unicode_characters and
+		// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/fromCodePoint
+		const asciiSourceData = () => makeSourceData(1024 * 1024, 32, 126);
+		const unicodeSourceData = () => makeSourceData(1024 * 1024, 0, 0x10FFFF);
+
+		const encryptionService = new EncryptionService();
+		await loadMasterKeysFromSettings(encryptionService);
+
+		// maps chunk sizes to trial time deltas.
+		const decryptData: Record<number, number[]> = {};
+		const encryptData: Record<number, number[]> = {};
+
+		const trials = [ 1, 2, 3, 4 ];
+		const chunkSizes = [ 1_000, 2_000, 3_000, 4_000, 5_000, 6_000, 7_000, 8_000, 9_000, 10_000, 20_000, 40_000 ];
+
+		for (const trial of trials) {
+			console.log('trial', trial);
+
+			let sourceData = unicodeSourceData();
+			if (trial > 2) {
+				sourceData = asciiSourceData();
+			}
+
+			for (const chunkSize of chunkSizes) {
+				let timeStart, timeEnd;
+
+				setStatus(`Encrypting with chunk size ${chunkSize} (${sourceData.length / 1024 / 1024} MiB)`);
+				(encryptionService as any).chunkSize_ = chunkSize;
+
+				// Time encryption
+				timeStart = performance.now();
+				const encrypted = await encryptionService.encryptString(sourceData, { encryptionMethod: EncryptionMethod.SJCL1b });
+				timeEnd = performance.now();
+
+				encryptData[chunkSize] ??= [];
+				encryptData[chunkSize].push(timeEnd - timeStart);
+
+				setStatus(`Decrypting with chunk size ${chunkSize} (${sourceData.length / 1024 / 1024} MiB)`);
+
+				// Time decryption
+				timeStart = performance.now();
+				const decrypted = await encryptionService.decryptString(encrypted);
+				timeEnd = performance.now();
+
+				decryptData[chunkSize] ??= [];
+				decryptData[chunkSize].push(timeEnd - timeStart);
+
+				// Verify that decryption was successful
+				if (decrypted !== sourceData) {
+					console.log(decrypted.length, sourceData.length);
+					throw new Error('not one-to-one');
+				}
+			}
+		}
+
+		const printResult = (label: string, data: Record<number, number[]>) => {
+			const result = [
+				label,
+				'Chunk size,' + trials.map(trial => `Trial ${trial} (ms)`).join(',')
+			];
+
+			for (const size of chunkSizes) {
+				result.push([ size, ...data[size] ].join(','));
+			}
+
+			console.log(result.join('\n'));
+		};
+		printResult('Encryption time', encryptData);
+		printResult('Decryption time', decryptData);
+		setStatus('Test');
+	}, []);
+
 	return (
+
```

Interestingly, this conclusion is the opposite of what is mentioned in the code: https://github.com/laurent22/joplin/blob/90d75ce80e490cf5cabb4f3c1b0d9917d7d59872/packages/lib/services/e2ee/EncryptionService.ts#L58-L70

laurent22 commented 1 year ago

Thanks for these detailed tests, but a 1.2× difference is not what some users (including myself) are observing. It's more like 5 or 10 times slower (or more), at least with certain resources. I'll see if I can replicate this slowdown so that it can be compared with the previous decryption method.

personalizedrefrigerator commented 1 year ago

> 5 or 10 times slower (or more) at least with certain resources

In that case, the slowdown might be caused by memory usage (e.g. forcing the device to use swap instead of RAM). When running the tests above, roughly 87% of memory is being used on the emulator:

$ adb shell top -h
Tasks: 333 total,   1 running, 332 sleeping,   0 stopped,   0 zombie
Mem[||||||||||||||||||||||||||||||87.2] Swp[||                             3.4]
400%cpu  88%user   1%nice   4%sys 305%idle   0%iow   1%irq   0%sirq   1%host
  PID USER         %CPU [%MEM]CMDLINE                                           
 3328 u0_a169      90.0  14.3 net.cozic.joplin
  580 system        0.0   6.9 system_server
 1700 u0_a110       0.0   6.4 com.google.android.googlequicksearchbox:search
 1244 u0_a104       0.0   6.0 com.google.android.gms.persistent
 1575 u0_a104       0.0   5.9 com.google.android.gms
  781 u0_a145       0.0   5.9 com.android.systemui

If the issue is low memory, I expect that a smaller chunk size would help, as less data would be loaded into memory at a given time. (E.g. maybe a chunk size of 4000?)

I'm retrying the above tests on an emulator with less RAM.

Edit: The results are still similar. However, as above, it still takes roughly 5-10 seconds to encrypt/decrypt a 1 MiB string (for both SJCL1a and SJCL1b).

SJCL1b Encryption time

| Chunk size | Trial 1 (unicode) (ms) | Trial 2 (unicode) (ms) | Trial 3 (ascii) (ms) | Trial 4 (ascii) (ms) |
| -- | -- | -- | -- | -- |
| 1000 | 18564.8539079999 | 18277.346659 | 17723.900473 | 17801.368393 |
| 2000 | 17894.406786 | 17862.379163 | 8996.02925699996 | 8906.89519000007 |
| 3000 | 18395.441261 | 17987.957858 | 5981.33646499994 | 6083.45909300004 |
| 4000 | 16938.052499 | 16883.6078359999 | 4642.35189599986 | 4644.10459799995 |
| 5000 | 15320.7337999999 | 15620.5936149999 | 6128.20641099988 | 6282.10061399988 |
| 6000 | 15226.9517959999 | 14961.224524 | 5944.38508899999 | 5989.30260300008 |
| 7000 | 15304.954842 | 15529.4928309999 | 5102.9698069999 | 5054.58456799993 |
| 8000 | 16251.063322 | 15742.4350020001 | 4486.7734699999 | 4512.95242500003 |
| 9000 | 15047.109433 | 15460.906458 | 4232.05040300009 | 4450.70749400021 |
| 10000 | 15017.210311 | 14944.033883 | 4658.24309999985 | 4572.05408299994 |
| 20000 | 14511.6565599998 | 14788.1961430002 | 4016.45287300018 | 4028.45889699995 |
| 40000 | 16215.7151810001 | 14563.878361 | 4023.91300599999 | 3982.86041099997 |

SJCL1a Encryption time

| Chunk size | Trial 1 (unicode) (ms) | Trial 2 (unicode) (ms) | Trial 3 (ascii) (ms) | Trial 4 (ascii) (ms) |
| -- | -- | -- | -- | -- |
| 1000 | 18570.483279 | 19101.504558 | 17846.728118 | 17859.6135530002 |
| 2000 | 18211.537148 | 17944.0507400001 | 8974.01303000003 | 9027.32620200003 |
| 3000 | 16059.65492 | 14919.52784 | 6063.27188499994 | 6048.59138800018 |
| 4000 | 13371.898827 | 13404.030942 | 4568.11231899983 | 4522.95895699994 |
| 5000 | 14234.517108 | 14172.0002 | 3792.80285900016 | 3787.38830700004 |
| 6000 | 13394.150049 | 12871.858218 | 5697.72866400005 | 5717.363228 |
| 7000 | 12890.573025 | 13007.2482640001 | 5063.99677799991 | 5055.20537899993 |
| 8000 | 13475.730764 | 13046.687505 | 4454.55521600018 | 4479.26288400008 |
| 9000 | 14288.1567960001 | 12520.292228 | 4062.732969 | 3965.45210899995 |
| 10000 | 12622.00197 | 12634.540056 | 3721.501345 | 3611.42750699981 |
| 20000 | 12988.1900940001 | 12306.763453 | 3654.36606099992 | 3540.369129 |
| 40000 | 12452.484937 | 11837.380479 | 3396.58778299997 | 3263.45747600007 |

SJCL1b Decryption time

| Chunk size | Trial 1 (unicode) (ms) | Trial 2 (unicode) (ms) | Trial 3 (ascii) (ms) | Trial 4 (ascii) (ms) |
|---|---|---|---|---|
| 1000 | 19422.119026 | 19967.741315 | 17793.9276920001 | 17772.8933889999 |
| 2000 | 17857.5445430001 | 18698.324208 | 8931.94554400002 | 8932.69669200014 |
| 3000 | 19910.27487 | 18026.1280340001 | 5947.7098999999 | 5986.9332930001 |
| 4000 | 17768.614898 | 17903.2512300001 | 4697.89728799998 | 4857.57165500009 |
| 5000 | 17893.305652 | 17779.8485079999 | 7061.872126 | 6910.49500700017 |
| 6000 | 17400.629494 | 17613.888735 | 5933.39512300002 | 5960.05426599993 |
| 7000 | 16606.506427 | 16824.770309 | 5091.57823699992 | 5050.55114999996 |
| 8000 | 16257.8279019999 | 16682.0653610001 | 4501.46963599999 | 4495.91692600003 |
| 9000 | 15883.0520859999 | 16113.9290470001 | 5057.38571199984 | 5069.08129200013 |
| 10000 | 15944.0634560001 | 15992.7022250001 | 5274.45656599989 | 5294.60123399994 |
| 20000 | 17689.364181 | 15787.775293 | 4401.776449 | 4437.20373499999 |
| 40000 | 17178.661908 | 15628.391508 | 4156.7509039999 | 4105.16272899997 |

SJCL1a Decryption time

| Chunk size | Trial 1 (unicode) (ms) | Trial 2 (unicode) (ms) | Trial 3 (ascii) (ms) | Trial 4 (ascii) (ms) |
|---|---|---|---|---|
| 1000 | 18688.60682 | 18215.482881 | 18018.359663 | 17927.8026030001 |
| 2000 | 17862.117051 | 17920.126408 | 9056.82677099994 | 9058.30493199988 |
| 3000 | 17948.757764 | 17829.730902 | 6024.36144799995 | 5956.63461699989 |
| 4000 | 15334.452226 | 14063.4689410001 | 4684.79744399991 | 4510.07863000012 |
| 5000 | 14414.203916 | 14203.689259 | 4361.71388599998 | 4674.96558700013 |
| 6000 | 14732.504972 | 14677.579913 | 5915.80788600002 | 5892.83070899989 |
| 7000 | 14664.039532 | 14580.5686239999 | 5056.47859299998 | 5134.79062599991 |
| 8000 | 14379.4360930001 | 13791.994237 | 4520.68849399989 | 4528.43296400015 |
| 9000 | 13857.3772870001 | 13941.1274850001 | 4061.06725299987 | 3985.68532799999 |
| 10000 | 14660.744064 | 13993.52229 | 3852.10415899986 | 3749.65941600013 |
| 20000 | 14698.643549 | 13293.9393989999 | 3675.74784500012 | 3635.22211800003 |
| 40000 | 13933.384888 | 13313.1167670001 | 3587.37463500001 | 3626.15904899989 |

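
A speedup figure such as "1.18× faster" can be derived from timings like these by dividing mean times. The sketch below does this for the SJCL1b decryption rows at chunk sizes 5000 and 40000. The `mean` and `speedup` helpers are hypothetical names introduced for illustration, and this is not necessarily how the averages quoted earlier in the thread were computed.

```typescript
// Arithmetic mean of a list of timings (in milliseconds).
const mean = (values: number[]): number => {
	if (values.length === 0) throw new Error('no values');
	return values.reduce((sum, value) => sum + value, 0) / values.length;
};

// How many times faster `candidate` is than `baseline`, comparing mean times.
const speedup = (baseline: number[], candidate: number[]): number => {
	return mean(baseline) / mean(candidate);
};

// SJCL1b decryption times (ms) for chunk sizes 5000 and 40000,
// copied from the results above (two unicode trials, then two ASCII trials).
const chunk5000 = [17893.305652, 17779.8485079999, 7061.872126, 6910.49500700017];
const chunk40000 = [17178.661908, 15628.391508, 4156.7509039999, 4105.16272899997];

const ratio = speedup(chunk5000, chunk40000);
console.log(`Chunk size 40000 vs 5000: ${ratio.toFixed(2)}x faster`);
```

Because the unicode trials take several times longer than the ASCII ones, they dominate a combined mean; computing the ratio separately for each kind of input may give a clearer picture.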
laurent22 commented 1 year ago

Thanks, so I guess the situation is that it will get faster as we increase the chunk size, until we fill up the RAM, at which point it will start swapping and become extremely slow. And I guess with AES-256 it tends to reach that limit more quickly.

I've just tried it and unfortunately I can't seem to replicate the slowdown I see on my device. I think we're going to go back to AES-128 for 2.12 until we better understand the issue and how to tweak the encryption for performance.

tomasz1986 commented 1 year ago

> until we fill up the RAM, at which point it will start swapping and becomes extremely slow.

Android doesn't use swap (except for something like zram or zswap on some devices), so if the RAM gets filled, the app will likely be just killed by the OS.

wh201906 commented 1 year ago

> When running the tests above, roughly 87% of memory is being used on the emulator

But it's strange that `top` shows the Joplin app only taking 14% of memory.