WiseLibs / better-sqlite3

The fastest and simplest library for SQLite3 in Node.js.
MIT License
5.52k stars 394 forks source link

Recommended solutions for replication, synchronization and encryption? #616

Open fabiospampinato opened 3 years ago

fabiospampinato commented 3 years ago

This is probably somewhat off-topic, but I've been using better-sqlite3 recently and my experience with it has just been flawless and I like the exposed APIs a lot, I'm evaluating using this library also for keeping some local state for each user of an app of mine, but eventually I'll want to synchronize that local database with one in the cloud in an end-to-end encrypted fashion and I'd rather not try to handle the details of robust synchronization and encryption myself as that would be super error-prone, I'd like to pick the right tools for the job from the get go rather than eventually migrating to a completely different solution.

So the question is: is there a recommended set of tools that could be used in conjunction with better-sqlite3 to achieve synchronization and end-to-end encryption with a database in the cloud?

I think my use case could be extremely well suited for a database like SQLite3 in general, but I'm having some difficulties finding good tools that would allow me to scale beyond just a bunch of local databases.

JoshuaWise commented 3 years ago

Encryption is easily solved by using better-sqlite3-sqleet, which is a custom build of better-sqlite3 that uses sqleet for encrypting the database.

Replication is a bit trickier to solve...

If your database is small, you could use db.serialize() to send the entire database over the wire periodically.

If your database is larger, you could use db.backup() to create a copy of the database locally and then stream that file to the cloud after each backup.

The problem with both of the above approaches is that they generate entire snapshots of the database, which is inefficient if you just want to synchronize changes. Also, it would be impractical to backup the entire database after every transaction, so if the application crashes, some of the recent local changes might be lost.

The best solution would probably to use the SQLite's Session Extension. But that's not currently supported by better-sqlite3.

fabiospampinato commented 3 years ago

Encryption is easily solved by using better-sqlite3-sqleet, which is a custom build of better-sqlite3 that uses sqleet for encrypting the database.

I saw better-sqlite3-sqleet before, but the problem I have with that is that the underlying library hasn't received any commits in more than a year, what's the probability that it's bug free really? It seems much more plausible to me that just not enough people are using that to discover the bugs, or the maintainer kind of abandoned it. Also better-sqlite3-sqleet itself hasn't been updated in the past 4 months, while here you shipped 3 releases in the past few days, whatever features and fixes you are shipping I don't want to miss them, but neither I would want to maintain that library by myself, ideally.

If your database is larger, you could use db.backup() to create a copy of the database locally and then stream that file to the cloud after each backup.

Unfortunately something like may work ok-ish some of the time but eventually it'll break down, plus it's really super wasteful of resources, when asking the server for changes I can't have it send to me the entire thing just for me to discover that I didn't actually miss out on any changes. Plus you have fundamental issues where multiple clients made some changes at the same time, or they happened to be offline for a while, and there's just no way to merge the two database dumps with just that.

The best solution would probably to use the SQLite's Session Extension. But that's not currently supported by better-sqlite3.

I hadn't seen that before, but IMHO that can't even work well enough, like it's better than nothing I guess, but the documentation page doesn't even mention "timestamp" or "clock" once, without some form of timekeeping it's impossible to know in which order changes should be applied, like if Alice deletes a text file and Bob later on (without knowing yet about what Alice did earlier) replaces the content of that text with something else you can't just have Alice's deletion override Bob's modification, but how do you know what happened last without some form of timekeeping? The timestamp could be conflicting too but that's a much rarer issue I think. Overall I don't think the session extension was written with robust synchronization in mind.


Of the two encryption is probably the easier to solve, I could take a leap of faith with sqleet or even put only encrypted content manually into the database that will be synced, but for the synchronization itself I can't really find any promising solution.

JoshuaWise commented 3 years ago

I hadn't seen that before, but IMHO that can't even work well enough, like it's better than nothing I guess, but the documentation page doesn't even mention "timestamp" or "clock" once, without some form of timekeeping it's impossible to know in which order changes should be applied, like if Alice deletes a text file and Bob later on (without knowing yet about what Alice did earlier) replaces the content of that text with something else you can't just have Alice's deletion override Bob's modification, but how do you know what happened last without some form of timekeeping? The timestamp could be conflicting too but that's a much rarer issue I think. Overall I don't think the session extension was written with robust synchronization in mind.

The Session Extension does have a way of resolving conflicts, but it's all application-defined. You could theoretically ensure that each table has a last_modified timestamp (maybe update it automatically using a trigger), and leverage that timestamp or whatever other mechanism you need to implement the conflict resolution logic. It would be no small feat, for sure, but it does seem possible.

ghost commented 3 years ago

I just wanted to add that for replication the best solution is definitely Litestream. It's an amazing new project and there is a lot more functionality planned for it like clusters and events.

m4heshd commented 3 years ago

Just wanted everyone to know that I'm maintaining a fork of better-sqlite3 named better-sqlite3-multiple-ciphers that supports really stable, up-to-date encryption.

GitMurf commented 1 month ago

Just wondering if there is any new word / update on the question of better SQLite encryption? We are switching from realmdb to SQLite due to mongo deprecating its sync service and divesting Realm back to open source. But we need an encrypted option. Love better SQLite 3!

mceachen commented 1 month ago

@GitMurf see the comment immediately preceding yours.

GitMurf commented 1 month ago

@GitMurf see the comment immediately preceding yours.

@mceachen Thanks. Is the idea that it will stay in-synch with the core better-SQLite? I really value the maintainability, trust and dedication historically of the better SQLite project and want to make sure any forks are aligned similarly :) I'm not expecting you can answer that absolutely but if you know, would love to hear your opinions / thoughts. Thanks!

mceachen commented 1 month ago

Open source projects are always a gamble: you’re relying on the goodwill of some random engineer to stay engaged with their project.

For what it’s worth, @m4heshd is an active contributor to this project, and has kept his fork up to date for a while now.