queicherius opened 2 months ago
- SELECT performance seems good
- INSERT performance seems good
- UPDATE/DELETE performance seems OK, from what I can tell it only touches the data that is necessary even when that data is compressed

- SELECT performance seems good
- INSERT performance seems good
- UPDATE/DELETE performance seems bad since it "[...] forces all data parts containing those rows to be deleted to be re-written, with the target rows excluded when forming the new part" [which for us is ALL parts if someone requests their data to be migrated/deleted]. DELETEs can be done in the background and there seem to be workarounds, but IDK if I can be bothered.

- SELECT performance seems maybe OK? I can't imagine how it's faster than the others, because it reads files via HTTPFS vs the others, which are processes that run on the server that also stores the data.
- INSERT performance seems OK, but same thing as SELECT
- UPDATE/DELETE performance seems really bad since we "need to save the full file back with the updates made" AND via HTTPFS, which is even slower than local files
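To make the "ALL parts" worry above concrete: if parts are formed by time (as column stores typically do) rather than by user, one user's rows end up scattered across nearly every part, so a delete-my-data request forces almost all of them to be rewritten. A minimal sketch with an invented part layout (not modeled on any specific engine):

```python
from collections import defaultdict

# Hypothetical event stream: (user, day) pairs spread evenly over time.
events = [(user, day) for day in range(12) for user in ("alice", "bob", "carol")]

# Parts are cut by time window (one part per 4 days), NOT by user.
parts = defaultdict(list)
for user, day in events:
    parts[day // 4].append((user, day))

# A request to delete one user's data touches every part containing
# their rows -- and each touched part must be rewritten in full.
touched = [p for p, rows in parts.items() if any(u == "alice" for u, _ in rows)]
print(f"parts touched by deleting alice: {len(touched)} of {len(parts)}")
```

With this layout every single part is touched, which is exactly the migrated/deleted-data scenario described above.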
Some of the things we do to get MongoDB to be performant are getting annoying (slow updates, having to run 3 collections instead of 1, etc.), and they are still causing issues (e.g. #1799). It might be wise to invest the time in switching to a better database instead of trying to keep fixing things.
I really want something PostgreSQL-based, because I prefer that at work. Backups would work via pgBackRest (the de facto standard). The current top contender is Timescale; if it worked for Plausible, it should do just fine for us.
The (small) downside would be that we would have to run schema migrations when we add new statistics.
We should be able to get started trying out things by just dumping the entire 3TB MongoDB collection into the table locally.
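For the dump itself, the fiddly part is turning Mongo-style documents into rows that PostgreSQL's `COPY FROM STDIN` will accept. A rough sketch of that conversion step (the field names and column list are invented for illustration; the surrounding pymongo read loop and psycopg `COPY` call are omitted):

```python
import json

def doc_to_copy_row(doc: dict, columns: list) -> str:
    """Flatten one MongoDB-style document into a tab-separated line for
    PostgreSQL COPY. Nested values are serialized as JSON (suitable for a
    jsonb column); missing fields become \\N (SQL NULL)."""
    out = []
    for col in columns:
        value = doc.get(col)
        if value is None:
            out.append(r"\N")  # COPY's NULL marker
        elif isinstance(value, (dict, list)):
            out.append(json.dumps(value))
        else:
            out.append(str(value))
    return "\t".join(out)

# Hypothetical statistics document -- not our actual schema.
doc = {"_id": "abc123", "item_id": 42, "stats": {"views": 10, "sales": 3}}
print(doc_to_copy_row(doc, ["_id", "item_id", "stats", "updated_at"]))
```

Lines like this could then be streamed into the table in batches (e.g. via psycopg's COPY support), which should be much faster than row-by-row INSERTs for 3TB of data.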