awslabs / dynein

DynamoDB CLI written in Rust.
https://github.com/awslabs/dynein
Apache License 2.0

refactor: set bootstrap data locally instead of downloading #153

Closed: wafuwafu13 closed this 11 months ago

wafuwafu13 commented 1 year ago

Issue #, if available:

#149, #143

Description of changes:

https://github.com/awslabs/dynein/issues/149#issue-1848732854

Downloading the JSON files on every run is redundant and may cause errors in the unzip process.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

wafuwafu13 commented 1 year ago

Discussed with @StoneDot:

dy should work as a single binary, so it is better to embed the data with something like include_str! (https://doc.rust-lang.org/std/macro.include_str.html) or include_bytes! (https://doc.rust-lang.org/std/macro.include_bytes.html).
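A minimal sketch of the single-binary idea, not the actual dynein code: data embedded at compile time travels with the executable, while a path read at run time only resolves if the file happens to sit where the binary looks for it. `SAMPLE_JSON`, `load_embedded`, and `load_from_disk` are hypothetical names. In a real crate the constant would typically be produced by `include_str!` or `include_bytes!` pointing at a file in the repository; an inline literal stands in for it here so the sketch compiles without extra files.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Stand-in for `include_str!("...")`: hypothetical sample data baked into the binary.
// Assumption: real code would point include_str!/include_bytes! at a file in the repo.
const SAMPLE_JSON: &str = r#"[{"Id": 1}]"#;

/// Embedded data is always available, regardless of the working directory.
fn load_embedded() -> &'static str {
    SAMPLE_JSON
}

/// Reading at run time fails with `NotFound` when the file is not at the given path.
fn load_from_disk(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

fn main() {
    println!("embedded: {} bytes", load_embedded().len());
    match load_from_disk(Path::new("does-not-exist/sample.json")) {
        Ok(s) => println!("disk: {} bytes", s.len()),
        Err(e) => println!("disk: {:?}", e.kind()),
    }
}
```

The trade-off is that embedded data increases the size of the executable, which is why compressing it before embedding is worth considering.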

cargo install --locked --path . followed by ./target/release/dy bootstrap succeeds, but the following error occurs when the binary is copied to another directory and run from there.

~/desktop/dynein
$ cp ./target/release/dy ~/desktop/

~/desktop
$ ./dy bootstrap                   
Bootstrapping - dynein will creates 4 sample tables defined here:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/AppendixSampleTables.html

'ProductCatalog' - simple primary key table
    Id (N)

'Forum' - simple primary key table
    Name (S)

'Thread' - composite primary key table
    ForumName (S)
    Subject (S)

'Reply' - composite primary key table, with GSI named 'PostedBy-Message-Index'
    Id (S)
    ReplyDateTime (S)

[skip] Table 'ProductCatalog' already exists, skipping to create new one.
[skip] Table 'Forum' already exists, skipping to create new one.
[skip] Table 'Thread' already exists, skipping to create new one.
[skip] Table 'Reply' already exists, skipping to create new one.
Still CREATING following tables: []
All tables are in ACTIVE.
Tables are ready and retrieved sample data locally. Now start writing data into samle tables...
Error: LoadData(Os { code: 2, kind: NotFound, message: "No such file or directory" })

StoneDot commented 1 year ago

It seems that bzip2 achieves the best compression ratio (423k from 3.7M), and brotli (480k) and lzma (486k) come close. Based on the maturity of the library and the compression ratio, I think rust-brotli is the best choice. Please feel free to share your opinion.

❯ lzma -k -e moviedata.json
❯ bzip2 -9 -k moviedata.json
❯ brotli -k moviedata.json
❯ gzip -9 -k moviedata.json
❯ zstd -19 -k moviedata.json
❯ lz4 -9 -k moviedata.json
❯ ls -l | grep moviedata
.rw-r--r-- 3.7M hiroag 25 8 10:43 moviedata.json
.rw-r--r-- 480k hiroag 25 8 10:43 moviedata.json.br
.rw-r--r-- 423k hiroag 25 8 10:43 moviedata.json.bz2
.rw-r--r-- 731k hiroag 25 8 10:43 moviedata.json.gz
.rw-r--r-- 823k hiroag 25 8 10:43 moviedata.json.lz4
.rw-r--r-- 486k hiroag 25 8 10:43 moviedata.json.lzma
.rw-r--r-- 504k hiroag 25 8 10:43 moviedata.json.zst
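To make the comparison easier to read, the sizes in the listing above can be turned into a ranking with each file expressed as a percentage of the original. This is only a rough sketch: it assumes the 3.7M original is about 3700k in the same unit the listing uses for the compressed files, and `ranked` is a hypothetical helper name.

```rust
// Sizes in k as printed by `ls -l` above; 3.7M is taken as roughly 3700k (assumption).
const ORIGINAL_KB: f64 = 3700.0;

/// Returns (name, compressed size in k, percent of original), smallest first.
fn ranked(sizes: &[(&'static str, f64)]) -> Vec<(&'static str, f64, f64)> {
    let mut rows: Vec<_> = sizes
        .iter()
        .map(|&(name, kb)| (name, kb, kb / ORIGINAL_KB * 100.0))
        .collect();
    // Sizes are finite numbers, so partial_cmp never returns None here.
    rows.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    rows
}

fn main() {
    let sizes = [
        ("brotli", 480.0),
        ("bzip2", 423.0),
        ("gzip", 731.0),
        ("lz4", 823.0),
        ("lzma", 486.0),
        ("zstd", 504.0),
    ];
    for (name, kb, pct) in ranked(&sizes) {
        println!("{name:<7} {kb:>5.0}k  {pct:>5.1}% of original");
    }
}
```

Because the listing rounds sizes for display, the percentages are approximate and only meant to show the relative ordering.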
wafuwafu13 commented 1 year ago

$ uname -a
Darwin c889f3a95659 22.6.0 Darwin Kernel Version 22.6.0: Wed Jul  5 22:22:05 PDT 2023; root:xnu-8796.141.3~6/RELEASE_ARM64_T6000 arm64

$ cargo build -j 8 --release

$ ls -lh ./target/release/dy
-rwxr-xr-x  1 herotaka  staff    11M 31 Aug 10:24 ./target/release/dy

wafuwafu13 commented 1 year ago

Pros

Cons

StoneDot commented 1 year ago

I confirmed that the difference in the size of the binary is negligible.

❯ git show --no-patch

commit 1264580374ba26c60d79068eff33569838d9b655 (HEAD -> bootstrap-local)
Author: Hirotaka Tagawa <herotaka@amazon.com>
Date:   Tue Aug 29 22:43:44 2023 +0100

    refactor: decompress by brotli

❯ ls -lh ./target/release/dy
Permissions Size User   Date Modified Name
.rwxr-xr-x   12M hiroag  8 9 13:47    ./target/release/dy