basho / riak_dt

Convergent replicated datatypes in Erlang
Apache License 2.0
353 stars 70 forks source link

To / from binary for map and set #62

Closed russelldb closed 11 years ago

russelldb commented 11 years ago

The eqc tests collect output of the format

{{size range}, {percent saving rang}}

Where size is {from, to} eg from set it is the cardinality, so {10, 20} means a set with 10-20 elements. For map size is the term_to_binary size div 10.

Percent saving of {20, 30} means that the to_binary/1 output is 20-30% smaller than t2b for the same data.

Only when the data type is empty is t2b better (as far as I can tell.)

seancribbs commented 11 years ago

Review TODOs:

seancribbs commented 11 years ago

Some dialyzer warnings:

riak_dt_map.erl:322: The call riak_dt_vclock:to_binary(Clock::riak_dt_vclock:vclock()) contains an opaque term as 1st argument when a structured term of type [any()] is expected
riak_dt_orswot.erl:264: Invalid type specification for function riak_dt_orswot:to_binary/1. The success typing is ({[any()],[{_,riak_dt_vclock:vclock()}]}) -> <<_:8,_:_*1>>
riak_dt_orswot.erl:285: The call riak_dt_vclock:to_binary(Clock::riak_dt_vclock:vclock()) contains an opaque term as 1st argument when a structured term of type [any()] is expected
riak_dt_orswot.erl:295: Invalid type specification for function riak_dt_orswot:from_binary/1. The success typing is (<<_:8,_:_*1>>) -> {[any()],[{_,_}]}

I think all of these can be resolved by adding specs for riak_dt_vclock:to_binary/1 and riak_dt_vclock:from_binary/1. The main issue is that because vclock() is an opaque type, dialyzer can't determine that [any()] (the inferred type) actually matches it.

seancribbs commented 11 years ago

In general this is great, nice space savings for larger data structures.

I think we can do better on documentation, but maybe that deserves an issue to itself? Also it would be nice to see that roundtrip applied to all types.

Ping again when you have those dialyzer warnings squelched and I'll give it another pass and the thumbs-up.

russelldb commented 11 years ago

Thanks @seancribbs. I'll get it cleaned up UK morning.

lenary commented 11 years ago

@seancribbs I want to get my test back in the running at some stage, which also does a full roundtrip. Let's just see how we go.

seancribbs commented 11 years ago

Russell's prop does round trips and measures the space savings. We just need to implement the callbacks in each module. pre4 stuff IMHO.

Sean Cribbs

On Oct 1, 2013, at 4:38 PM, Sam Elliott notifications@github.com wrote:

@seancribbs I want to get my test back in the running at some stage, which also does a full roundtrip. Let's just see how we go.

— Reply to this email directly or view it on GitHub.

lenary commented 11 years ago

@seancribbs yeah, that comment wasn't "block this PR until i get mine back in the running", it was "i should do this sometime, in a separate PR"

russelldb commented 11 years ago

ping @seancribbs addressed comments and made dialyzer happy. Also noticed that the actors/1 fun in riak_dt_vclcok was duplicate of the already present all_nodes/1 fun, so removed it.

seancribbs commented 11 years ago

Still one lingering dialyzer warning:

Unknown types:
  riak_dt_vclock:actor/0

Here's where that occurs:

src/riak_dt_map.erl
97:-type actor_map() :: [{riak_dt_vclock:actor(), pos_integer()}
98:                      | {pos_integer(), riak_dt_vclock:actor()}].
seancribbs commented 11 years ago

Once that type warning is fixed, :+1: to merge.

russelldb commented 11 years ago

Thanks, I made issue #63 to track adding the roundtrip test to other types.