Closed rockhazard closed 3 years ago
I tried encryption and decryption without any issues. Plus we do not have any msg like "Context was already finalized."
Is there any other database file in the same location?
I tried encryption and decryption without any issues. Plus we do not have any msg like "Context was already finalized."
Is there any other database file in the same location?
I created the db using an import operation When I encrypted it, it left the original plus the encrypted version. When I tried to unlock it, it gave me the second error, so I removed the original unencrypted db. This left me with one db. Then, when I tried again, it let me try a password, but failed to unlock it, showing the third error, even when I tried to lock a fresh copy with a very simple 1 letter password.
I did do a search on the error you mentioned, and I got a hit on a java library issue with the same wording (but without hte leading [ERROR] tag). But that didn't shed any light, as the target directory only has buku-related files in it.
When I encrypted it, it left the original plus the encrypted version.
Can you share the exact commands you used so I can reproduce this? When I tried to reproduce it, after locking I had only the encrypted file.
I created the db using an import operation
Just wondering if you imported using sudo and later tried to encrypt as a normal user. As buku
would attempt to delete the original file it may fail in that case due to lack of permission,
I did not use sudo. However, I'm closing in on the problem. If I createa an empty db, encryption works without errors. As expected, when I add a single record it still works. However, if I import my Firefox json file the encryption stops working. My command sequenc is as follows:
I can then search the non-encrypted imported bookmarks, but I cannot open the enc db.
Thanks! I'll check this out.
Looks like the error was coming from the crypto module; https://stackoverflow.com/questions/59698196/cryptography-exceptions-alreadyfinalized-context-was-already-finalized
How many bookmarks did you have in the json? I had around 60 and i don't see this.
The first problem is we are not removing the partially encoded file when there's a failure. Same thing when we decrypt as well.
Please test commit 5d5b680405dd65548ba403d1baea32f0f5ec8bf6. With this when your encryption fails, only the db file will be left and the incomplete .enc file will be deleted.
We still need to check the problem with encryption.
@zmwangx please share if you have any idea on this. From google it looks like a padding related thing but not exactly sure.
Also adding @gyulaweber for any help with this error.
Looked at the crypto code, found bugs. To save you trouble I'll just open a PR.
To save you trouble I'll just open a PR.
Thank you!
Also have some qualms with your crypto itself (minus the bugs). It's kind of a hand-rolled authenticated (through writing the SHA-256 hash as part of the data) AES-CBC with a hand-rolled padding scheme (pad spaces and remove them through writing the file length as a struct !! as part of the data). Why not just use a mode with authentication built-in? Like AES-GCM, which doesn't even require padding? And for modes that require padding, there are standard padding schemes like PKCS7, which are implemented in cryptography.hazmat.primitives
.
Disclaimer: not a crypto expert, but did learn a few best practices.
Also have some qualms with your crypto itself
Just raise the PR! I feel you'll reduce some lines of code. ;)
Fixed at #481.
Unfortunately it's only gonna be fewer lines of code if you start anew. With legacy encrypted files out there, if you change the crypto (to be simpler and more robust) at the very least you'll have to provide a migration path with the old decryption code, so it wouldn't be simpler at all... Consider the above an unactionable rant.
This example demonstrates the importance the choosing the right application file format the first time round.
I don't think that's a big problem. We use the extension .enc
for the encoded files. So we can use something else for the new encryption algo and decrypt using earlier algo.
For a short stint at it, just add 2 new functions for the encrypt and decrypt logic. i will take care of the extension thing.. probably next weekend.
How many bookmarks did you have in the json? I had around 60 and i don't see this.
Well, buku removed the dupes, but it looks like the original backup has 5,464 urls.
Also, the original db file created after import is 588K, while the corrupted enc db is 513K, if that helps. I don't know if you're using compression on the enc db, but if not, that is a strange size change, no?
For a short stint at it, just add 2 new functions for the encrypt and decrypt logic.
Sure, here's a patch implementing a fresh crypto scheme with PBKDF2 + Fernet. See commit message for detailed explanations.
Patch (also in a gist: https://gist.github.com/7cbd65b2cffac771045cf9999a936383):
From 41987f1697992b48da13bbebc5e0c42036ea5dc5 Mon Sep 17 00:00:00 2001
From: Zhiming Wang <i@zhimingwang.org>
Date: Mon, 23 Nov 2020 01:01:19 +0800
Subject: [PATCH] Improve crypto
The less hand-rolled stuff the better.
- Switch from hand-rolled crappy KDF to PBKDF2;
- Switch from AES-CBC with hand-rolled padding and authentication to
Fernet;
- Future-proofing cipher text format (can use different PBKDF2
iterations, longer salt, or a new cipher altogether).
Rationale:
- PBKDF2 because it's the obvious choice; remove the option of
user-specified iterations because what's the point. 100k is good
enough. Just use a stronger password if they want better security.
- Fernet because it's been oftly recommended by experts, including the
authors of the very cryptography package we're using, and most
importantly, it has the minimum number of moving parts. It takes a
key, and the plaintext, and that's it. All the crypto stuff that can
go wrong (iv, tag, whatever hogwash) is implemented by experts, we
just can't do wrong.
- My output format borrows heavily from bcryptm, using $ as the
delimiter between algorithm version (future-proofing), decryption
parameters (iterations, salt) and cipher text.
Downsides:
- Fernet requires the entire cleartext and ciphertext in memory. I don't
think that's much of a practical concern given the nature of the
application at hand.
---
buku | 196 +++++++++++++++++++++--------------------------------------
1 file changed, 68 insertions(+), 128 deletions(-)
diff --git a/buku b/buku
index 9495cef..a51c6dc 100755
--- a/buku
+++ b/buku
@@ -31,6 +31,7 @@ import re
import shutil
import signal
import sqlite3
+from sqlite3.dbapi2 import DataError
import struct
import subprocess
from subprocess import Popen, PIPE, DEVNULL
@@ -110,38 +111,11 @@ class BukuCrypt:
"""
# Crypto constants
- BLOCKSIZE = 0x10000 # 64 KB blocks
- SALT_SIZE = 0x20
- CHUNKSIZE = 0x80000 # Read/write 512 KB chunks
+ PBKDF2_SALT_SIZE = 16 # 128 bit salt
+ PBKDF2_ITERATIONS = 100000
@staticmethod
- def get_filehash(filepath):
- """Get the SHA256 hash of a file.
-
- Parameters
- ----------
- filepath : str
- Path to the file.
-
- Returns
- -------
- hash : bytes
- Hash digest of file.
- """
-
- from hashlib import sha256
-
- with open(filepath, 'rb') as fp:
- hasher = sha256()
- buf = fp.read(BukuCrypt.BLOCKSIZE)
- while len(buf) > 0:
- hasher.update(buf)
- buf = fp.read(BukuCrypt.BLOCKSIZE)
-
- return hasher.digest()
-
- @staticmethod
- def encrypt_file(iterations, dbfile=None):
+ def encrypt_file(dbfile=None):
"""Encrypt the bookmarks database file.
Parameters
@@ -152,19 +126,16 @@ class BukuCrypt:
Custom database file path (including filename).
"""
+ import base64
+ from getpass import getpass
try:
- from cryptography.hazmat.backends import default_backend
- from cryptography.hazmat.primitives.ciphers import (Cipher, modes, algorithms)
- from getpass import getpass
- from hashlib import sha256
+ from cryptography.fernet import Fernet
+ from cryptography.hazmat.primitives import hashes
+ from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
except ImportError:
LOGERR('cryptography lib(s) missing')
sys.exit(1)
- if iterations < 1:
- LOGERR('Iterations must be >= 1')
- sys.exit(1)
-
if not dbfile:
dbfile = os.path.join(BukuDb.get_default_dbdir(), 'bookmarks.db')
encfile = dbfile + '.enc'
@@ -191,81 +162,56 @@ class BukuCrypt:
LOGERR('Passwords do not match')
sys.exit(1)
- try:
- # Get SHA256 hash of DB file
- dbhash = BukuCrypt.get_filehash(dbfile)
- except Exception as e:
- LOGERR(e)
- sys.exit(1)
-
- # Generate random 256-bit salt and key
- salt = os.urandom(BukuCrypt.SALT_SIZE)
- key = ('%s%s' % (password, salt.decode('utf-8', 'replace'))).encode('utf-8')
- for _ in range(iterations):
- key = sha256(key).digest()
-
- iv = os.urandom(16)
- encryptor = Cipher(
- algorithms.AES(key),
- modes.CBC(iv),
- backend=default_backend()
- ).encryptor()
- filesize = os.path.getsize(dbfile)
+ salt = os.urandom(BukuCrypt.PBKDF2_SALT_SIZE)
+ kdf = PBKDF2HMAC(
+ algorithm=hashes.SHA256(),
+ length=32,
+ salt=salt,
+ iterations=BukuCrypt.PBKDF2_ITERATIONS,
+ )
+ key = base64.urlsafe_b64encode(kdf.derive(password.encode('utf-8')))
+ fernet = Fernet(key)
try:
with open(dbfile, 'rb') as infp, open(encfile, 'wb') as outfp:
- outfp.write(struct.pack('<Q', filesize))
- outfp.write(salt)
- outfp.write(iv)
-
- # Embed DB file hash in encrypted file
- outfp.write(dbhash)
-
- while True:
- chunk = infp.read(BukuCrypt.CHUNKSIZE)
- if len(chunk) == 0:
- break
- if len(chunk) % 16 != 0:
- chunk = b'%b%b' % (chunk, b' ' * (16 - len(chunk) % 16))
-
- outfp.write(encryptor.update(chunk))
-
- outfp.write(encryptor.finalize())
-
+ # Use $ as delimiter (taking a page from bcrypt).
+ outfp.write(b'$2') # algorithm version for future proofing
+ outfp.write(b'$%d' % BukuCrypt.PBKDF2_ITERATIONS)
+ outfp.write(b'$%s$' % base64.urlsafe_b64encode(salt))
+ outfp.write(fernet.encrypt(infp.read()))
os.remove(dbfile)
print('File encrypted')
sys.exit(0)
except Exception as e:
- os.remove(encfile)
+ try:
+ os.remove(encfile)
+ except OSError:
+ pass
LOGERR(e)
sys.exit(1)
@staticmethod
- def decrypt_file(iterations, dbfile=None):
+ def decrypt_file(dbfile=None):
"""Decrypt the bookmarks database file.
Parameters
----------
- iterations : int
- Number of iterations for key generation.
dbfile : str, optional
Custom database file path (including filename).
The '.enc' suffix must be omitted.
"""
+ import base64
+ from getpass import getpass
try:
- from cryptography.hazmat.backends import default_backend
- from cryptography.hazmat.primitives.ciphers import (Cipher, modes, algorithms)
+ from cryptography.fernet import Fernet, InvalidToken
+ from cryptography.hazmat.primitives import hashes
+ from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from getpass import getpass
- from hashlib import sha256
except ImportError:
LOGERR('cryptography lib(s) missing')
sys.exit(1)
- if iterations < 1:
- LOGERR('Decryption failed')
- sys.exit(1)
-
if not dbfile:
dbfile = os.path.join(BukuDb.get_default_dbdir(), 'bookmarks.db')
else:
@@ -294,50 +240,44 @@ class BukuCrypt:
try:
with open(encfile, 'rb') as infp:
- size = struct.unpack('<Q', infp.read(struct.calcsize('Q')))[0]
-
- # Read 256-bit salt and generate key
- salt = infp.read(32)
- key = ('%s%s' % (password, salt.decode('utf-8', 'replace'))).encode('utf-8')
- for _ in range(iterations):
- key = sha256(key).digest()
-
- iv = infp.read(16)
- decryptor = Cipher(
- algorithms.AES(key),
- modes.CBC(iv),
- backend=default_backend(),
- ).decryptor()
-
- # Get original DB file's SHA256 hash from encrypted file
- enchash = infp.read(32)
+ # Read algorithm version, PBKDF2 iterations and salt.
+ # Read until we have encountered four $'s.
+ buf = b''
+ delim_count = 0
+ while delim_count < 4:
+ ch = infp.read(1)
+ if len(ch) == 0:
+ raise DataError('malformed data')
+ buf += ch
+ if ch == b'$':
+ delim_count += 1
+ _, iterations_bytes, urlsafe_b64_salt = buf.strip(b'$').split(b'$')
+ iterations = int(iterations_bytes)
+ salt = base64.urlsafe_b64decode(urlsafe_b64_salt)
+
+ kdf = PBKDF2HMAC(
+ algorithm=hashes.SHA256(),
+ length=32,
+ salt=salt,
+ iterations=iterations,
+ )
+ key = base64.urlsafe_b64encode(kdf.derive(password.encode('utf-8')))
+ fernet = Fernet(key)
+ decrypted = fernet.decrypt(infp.read())
with open(dbfile, 'wb') as outfp:
- while True:
- chunk = infp.read(BukuCrypt.CHUNKSIZE)
- if len(chunk) == 0:
- break
-
- outfp.write(decryptor.update(chunk) + decryptor.finalize())
-
- outfp.truncate(size)
-
- # Match hash of generated file with that of original DB file
- dbhash = BukuCrypt.get_filehash(dbfile)
- if dbhash != enchash:
+ outfp.write(decrypted)
+ os.remove(encfile)
+ print('File decrypted')
+ except Exception as e:
+ try:
os.remove(dbfile)
+ except OSError:
+ pass
+ if isinstance(e, InvalidToken):
LOGERR('Decryption failed')
- sys.exit(1)
else:
- os.remove(encfile)
- print('File decrypted')
- except struct.error:
- os.remove(dbfile)
- LOGERR('Tainted file')
- sys.exit(1)
- except Exception as e:
- os.remove(dbfile)
- LOGERR(e)
+ LOGERR(e)
sys.exit(1)
@@ -5084,9 +5024,9 @@ POSITIONAL ARGUMENTS:
# Handle encrypt/decrypt options at top priority
if args.lock is not None:
- BukuCrypt.encrypt_file(args.lock)
+ BukuCrypt.encrypt_file()
elif args.unlock is not None:
- BukuCrypt.decrypt_file(args.unlock)
+ BukuCrypt.decrypt_file()
# Set up title
if args.title is not None:
--
2.25.1
@rockhazard your issue should be fixed on master. Please test and confirm.
@zmwangx Thank you!
@jarun So, the problem persists with my original bookmark file. However, I deduped the bookmarks and removed a bunch of cruft built up from about two years of using Tab Session Manager, a plugin that I think saves sessions to the bookmarks (a thing I didn't know and which is causing me to re-evaluate my usage of said plugin). After I did that, the file moved from north of 5k urls down to about 1700. Once I did that, it seems that both the pypi version and the latest master work now (don't know about how you deploy, so I don't know if there is a difference). So, I can only guess that Tab Session Manager (or the number of bookmarks) had something to do with it.
So you are saying even with the fix from @zmwangx the issue persists? Please ensure you had the patch when you tested.
The crypto routine doesn’t care about whether the file is a SQLite database, or a video, or anything. You should be able to put any file there and it either works, or not. Content just doesn’t play a role here, only the byte count.
@jarun Btw, may I suggest that you dump a traceback for the last exception in debug mode, like what googler does? Right now you only print the exception message (not even the exception type), and are left guessing where it comes from, which is especially hard when the exception comes from anywhere in your dependency chain, and/or when the exception message isn't clear (sometimes there may not even be a message, like cryptography.fernet.InvalidToken
).
Since you seem to use sys.exit(1)
in quite a few places instead of centralizing the reporting, and LOGERR
may not be exclusively used with exceptions, let alone fatal ones, I think the following trick would probably introduce the minimum amount of change while achieving the goal:
diff --git a/buku b/buku
index 9495cef..baf7181 100755
--- a/buku
+++ b/buku
@@ -5566,4 +5566,14 @@ POSITIONAL ARGUMENTS:
bdb.close_quit(0)
if __name__ == '__main__':
- main()
+ try:
+ main()
+ except SystemExit as e:
+ # If we exited via sys.exit(1), and if we handled an exception before
+ # that, we want to get the traceback of that exception in debug mode,
+ # which would be conveniently stored in the __context__ field of the
+ # current exception.
+ if e.code != 0 and e.__context__ is not None and LOGGER.isEnabledFor(logging.DEBUG):
+ cause = e.__context__
+ import traceback
+ traceback.print_exception(type(cause), cause, cause.__traceback__)
This way, bug reports could look like this:
$ buku -l --debug
[DEBUG] buku v4.4
[DEBUG] Python v3.8.5
Password:
Password:
[ERROR] Context was already finalized.
Traceback (most recent call last):
File "buku", line 231, in encrypt_file
outfp.write(encryptor.update(chunk) + encryptor.finalize())
File "/tmp/buku/venv/lib/python3.8/site-packages/cryptography/hazmat/primitives/ciphers/base.py", line 153, in update
raise AlreadyFinalized("Context was already finalized.")
cryptography.exceptions.AlreadyFinalized: Context was already finalized.
Please raise a PR. I am real short of bandwidth.
@jarun @zmwangx
After the request to make sure I had the latest files, I redownloaded the git master branch into a clean mint 19.3 vm. I installed it using sudo make install
, changed buku to executable, imported the problem file with buku -i bookmarks.html
(a Firefox backup) then ran the following commands:
$ ./buku -l
Password:
Password:
File encrypted
$ ./buku -k
Password:
[ERROR] Context was already finalized.
$ ls ~/.local/share/buku/
bookmarks.db.enc
As you can see, the file still won't decrypt. I just used a one-character password, so mistyping wasn't the problem. Unlike in the previous attempts (before my last reply), the orgiinal bookmarks.db was deleted and only an ecn file remained, rather than leaving the original db in addition to the new (corrupted) enc as it had. I then removed the enc file and imported the cleaned bookmarks file. Then I encrypted and decrypted the cleaned bookmarks file with the same password successfully.
If I missunderstood something, please let me know.
@rockhazard would it be possible to share the db or json file if it doesn't have confidential data? No need to share it here. Probably over mail.
I simply didn't bother to test the decryption, which has the exact same problem. Trivial to fix.
@rockhazard everything should be working for you on master, thanks to @zmwangx!
@jarun @zmwangx Thank you both for your diligence. The latest update worked. Unfortunately, even though I'm curious what in the file may have contributed to the issue, I can't share the original bookmarks file, as it does contain confidential info.
@rockhazard
Thanks for confirming!
Unfortunately, even though I'm curious what in the file may have contributed to the issue,
It's not required. We figured out the issue. Thanks again!
When I try to encrypt my buku DB with
buku -l
I get this error:[ERROR] Context was already finalized.
Later, if I attempt to unlock it withbuku -k
I get this error:[ERROR] Both encrypted and flat DB files exist!
So I remove the non-encrypted DB and repeat the unlock, which gives me this when using the correct password:[ERROR] Decryption failed
The error repeats as above with both default and custom iterations.
System: cpu: Intel 8700 ram: 32GB os: Linux Mint 19.3 shell: zsh 5.4.2 (x86_64-ubuntu-linux-gnu) Python 2.7.17 & Python 3.6.9 buku v4.4