apache / fury

A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
https://fury.apache.org/
Apache License 2.0
3.11k stars, 248 forks

feat(python): Implement collection serialization protocol #1942

Closed: penguin-wwy closed this PR 4 days ago

penguin-wwy commented 1 week ago

What does this PR do?

Implement a new format for collection serialization in pyfury.
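
The description does not spell out the format, so the following is only a rough sketch of the general idea behind header-based collection encodings, not pyfury's actual wire format. It assumes a flag byte written once per collection that records whether nulls are present and whether all elements share a type, so a homogeneous list pays for its type tag once instead of once per element. The flag values, type ids, and function names (`HAS_NULL`, `NOT_SAME_TYPE`, `TYPE_IDS`, `encode_collection`, `decode_collection`) are all hypothetical and chosen for illustration.

```python
import struct

# Hypothetical flag bits for this sketch (not pyfury's real values).
HAS_NULL = 0b01
NOT_SAME_TYPE = 0b10

# Hypothetical type ids; 0 means "no element type" (empty or all-None list).
TYPE_IDS = {int: 1, str: 2}
ID_TO_TYPE = {v: k for k, v in TYPE_IDS.items()}


def _write_value(buf, value):
    """Append an int as 8 little-endian bytes, or a str as length + UTF-8."""
    if isinstance(value, int):
        buf += struct.pack("<q", value)
    else:
        data = value.encode("utf-8")
        buf += struct.pack("<I", len(data)) + data


def encode_collection(items):
    """Encode a list of int/str/None with a single per-collection header."""
    types = {type(x) for x in items if x is not None}
    flags = 0
    if any(x is None for x in items):
        flags |= HAS_NULL
    if len(types) > 1:
        flags |= NOT_SAME_TYPE
    same_type = not (flags & NOT_SAME_TYPE)

    # Header: element count (4 bytes) + flag byte.
    buf = bytearray(struct.pack("<IB", len(items), flags))
    if same_type:
        # The shared type id is written once for the whole collection.
        buf.append(TYPE_IDS[next(iter(types))] if types else 0)

    for x in items:
        if flags & HAS_NULL:
            # A presence byte is only needed when the header says nulls exist.
            buf.append(0 if x is None else 1)
            if x is None:
                continue
        if not same_type:
            buf.append(TYPE_IDS[type(x)])  # per-element tag for mixed lists
        _write_value(buf, x)
    return bytes(buf)


def decode_collection(data):
    """Inverse of encode_collection."""
    length, flags = struct.unpack_from("<IB", data, 0)
    pos = 5
    same_type = not (flags & NOT_SAME_TYPE)
    shared = None
    if same_type:
        shared = data[pos]
        pos += 1

    out = []
    for _ in range(length):
        if flags & HAS_NULL:
            present = data[pos]
            pos += 1
            if not present:
                out.append(None)
                continue
        if same_type:
            type_id = shared
        else:
            type_id = data[pos]
            pos += 1
        if ID_TO_TYPE[type_id] is int:
            (value,) = struct.unpack_from("<q", data, pos)
            pos += 8
        else:
            (n,) = struct.unpack_from("<I", data, pos)
            pos += 4
            value = data[pos:pos + n].decode("utf-8")
            pos += n
        out.append(value)
    return out
```

In this sketch a list of three ints costs 5 header bytes, 1 shared type byte, and 24 payload bytes, and the decoder skips per-element type dispatch entirely. That is the kind of saving that would grow with collection size, though whether it explains the large-collection numbers in this PR is not stated in the thread.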

Related issues

Does this PR introduce any user-facing change?

Benchmark

fury_tuple: Mean +- std dev: [base] 259 us +- 6 us -> [collection] 256 us +- 5 us: 1.01x faster
fury_large_tuple: Mean +- std dev: [base] 92.7 ms +- 5.5 ms -> [collection] 63.7 ms +- 4.8 ms: 1.46x faster
fury_list: Mean +- std dev: [base] 277 us +- 6 us -> [collection] 267 us +- 3 us: 1.04x faster
fury_large_list: Mean +- std dev: [base] 92.8 ms +- 5.3 ms -> [collection] 66.5 ms +- 3.0 ms: 1.40x faster

Geometric mean: 1.21x faster
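
As a sanity check on the summary line, the geometric mean can be recomputed from the four mean timings listed above. This is a small sketch using only the standard library. It assumes the per-benchmark speedup is the ratio of the `[base]` mean to the `[collection]` mean, and the dictionary name `results` is made up for illustration.

```python
from statistics import geometric_mean

# (base mean, collection mean) copied from the benchmark output above.
# Units cancel in the ratio, so us and ms entries can be mixed.
results = {
    "fury_tuple": (259.0, 256.0),
    "fury_large_tuple": (92.7, 63.7),
    "fury_list": (277.0, 267.0),
    "fury_large_list": (92.8, 66.5),
}

# Speedup > 1 means the collection branch is faster than base.
speedups = {name: base / new for name, (base, new) in results.items()}
overall = geometric_mean(speedups.values())

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")
print(f"Geometric mean: {overall:.2f}x faster")
```

The geometric mean is the usual choice for averaging ratios because it treats a 2x speedup and a 2x slowdown as cancelling out, which an arithmetic mean does not.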

chaokunyang commented 1 week ago

Hi @penguin-wwy, thanks for your hard work. This new protocol should be faster, but you didn't implement it fully. I left some comments; could you take a look?

penguin-wwy commented 1 week ago

Benchmark data has been updated.