Scille / parsec-cloud

Open source Dropbox-like file sharing with full client encryption !
https://parsec.cloud

Increase msgpack MAX_BIN_LEN (PARSEC v2 only) #6473

Open vxgmichel opened 6 months ago

vxgmichel commented 6 months ago

This limit is too low for files above ~6.5 GB, as the manifest would exceed 1MB:

https://github.com/Scille/parsec-cloud/blob/91c86cc49bc3d9ae488f7bac1ae69c6bb847e492/parsec/serde/packing.py#L18

This is only an issue for v2 as rmp_serde doesn't have such limitations.

A possible fix for v2 would simply be to increase it to 8MB, or prevent the creation of large files as they're not well supported at the moment (see #6474).
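To make the limit concrete, here is a minimal sketch of the kind of check a msgpack decoder applies to a `bin` object. It only uses the standard library and parses the bin8/bin16/bin32 headers by hand. The two constants are assumptions taken from the figures in this issue (about 1 MB today, 8 MB proposed), not the exact value in `packing.py`, and `declared_bin_len` and `fits` are hypothetical helper names.

```python
import struct

# Assumed values: the issue mentions a ~1 MB ceiling and proposes 8 MB;
# the exact constant in packing.py is not quoted here.
CURRENT_MAX_BIN_LEN = 1024 * 1024
PROPOSED_MAX_BIN_LEN = 8 * 1024 * 1024

# msgpack "bin" family: marker byte -> struct format of the big-endian length field
_BIN_HEADERS = {0xC4: ">B", 0xC5: ">H", 0xC6: ">I"}


def declared_bin_len(data: bytes) -> int:
    """Return the payload length announced by a bin8/bin16/bin32 header."""
    if not data or data[0] not in _BIN_HEADERS:
        raise ValueError("not a msgpack bin object")
    (length,) = struct.unpack_from(_BIN_HEADERS[data[0]], data, 1)
    return length


def fits(data: bytes, max_bin_len: int) -> bool:
    """Hypothetical helper: would a decoder with this limit accept the object?"""
    return declared_bin_len(data) <= max_bin_len


# Header of a 2 MiB payload; the header alone is enough for the length check
header = struct.pack(">BI", 0xC6, 2 * 1024 * 1024)
print(fits(header, CURRENT_MAX_BIN_LEN))   # False
print(fits(header, PROPOSED_MAX_BIN_LEN))  # True
```

Under these assumptions, any manifest whose serialized form is larger than the configured limit is rejected at decoding time, whatever its content.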

vxgmichel commented 3 months ago

For reference the typical error in this case is:

2024-04-23T07:59:51.082159Z [error    ] Invalid request data according to backend 
cmd=vlob_update 
conn_id=daa0d19b391e4b8fbe44a47249b442fe 
device_id=DeviceID("3330806ff09f47aaa75491d6cf4ba759@b323e771cbf940e4be55e74fcb1ff532") 
rep=UnknownStatus { unknown_status: "invalid_msg_format", reason: Some("Invalid message format") }

vxgmichel commented 3 months ago

:warning: Once this problem occurs, other files won't synchronize properly. Even worse, the problem persists even after removing the large file from the workspace. In order to get back to a valid state, the following script may be used:


from __future__ import annotations
import os
from glob import glob
from sqlite3 import connect
from pathlib import Path
from argparse import ArgumentParser

# This limit has been measured by creating a 6GB file using:
# - truncate -s 6GB large
# Then measuring the size of the corresponding blob in the local database
DEFAULT_SIZE_LIMIT = 3807940

def get_parsec_data_dir() -> Path:
    if os.name == "nt":
        appdata = os.getenv("APPDATA")
        assert appdata is not None
        return Path(appdata) / "parsec" / "data"
    else:
        # XDG_DATA_HOME is often unset; fall back to the XDG default location
        data_home = os.getenv("XDG_DATA_HOME") or Path.home() / ".local" / "share"
        return Path(data_home) / "parsec"

def main(path: Path | None, size: int) -> None:
    base_dir = path or get_parsec_data_dir()
    db_files = glob(f"{base_dir}/**/workspace_data-v1.sqlite", recursive=True)
    for db_file in db_files:
        print(f"Fixing {db_file}...")
        with connect(db_file) as conn:
            conn.execute("DELETE FROM vlobs WHERE length(blob) >= ?", (size,))
            changes, = conn.execute("SELECT changes()").fetchone()
            if not changes:
                print("> No changes were made")
                continue
            print(f"> {changes} vlobs were deleted")

if __name__ == "__main__":
    # The first positional argument of ArgumentParser is `prog`, so pass the text as `description`
    parser = ArgumentParser(description="Fix the local parsec workspace databases by removing vlobs that are too large")
    parser.add_argument("--path", type=Path, help="Path to the directory containing the SQLite files", default=None)
    parser.add_argument("--size", type=int, help="Delete vlobs whose blob is at least this many bytes", default=DEFAULT_SIZE_LIMIT)
    args = parser.parse_args()
    main(args.path, args.size)
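
Since the script above deletes rows, it may be worth checking first how many vlobs would be affected. The following is a minimal dry-run sketch, not part of the original script: `count_oversized_vlobs` is a hypothetical helper, and the in-memory table uses a simplified, assumed schema (only the `vlobs` table name and `blob` column come from the script).

```python
import sqlite3

SIZE_LIMIT = 3807940  # same default as the cleanup script


def count_oversized_vlobs(conn: sqlite3.Connection, size: int) -> int:
    """Count the rows the cleanup script would delete, without modifying anything."""
    (count,) = conn.execute(
        "SELECT count(*) FROM vlobs WHERE length(blob) >= ?", (size,)
    ).fetchone()
    return count


# Demonstration on an in-memory database with a simplified, assumed schema
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vlobs (vlob_id BLOB PRIMARY KEY, blob BLOB NOT NULL)")
conn.executemany(
    "INSERT INTO vlobs VALUES (?, ?)",
    [(b"small", b"x" * 100), (b"large", b"x" * SIZE_LIMIT)],
)
print(count_oversized_vlobs(conn, SIZE_LIMIT))  # 1
```

Against a real database, the same function can be called on `sqlite3.connect(db_file)` for each `workspace_data-v1.sqlite` file before running the deletion.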