basho / riak_dt

Convergent replicated datatypes in Erlang
Apache License 2.0
353 stars 70 forks source link

Make map/set interoperable with previous versions #111

Closed russelldb closed 9 years ago

russelldb commented 9 years ago

Addresses in part basho/riak#667.

The work from basho/riak_dt#110 was rushed and broken: all orddicts got changed to dicts: even the ones enternal to the map field structure. The to/from_binary calls did not touch or change them, so reading a previous set/map binary into a structure and operating on it was impossible.

Further more, nested sets and maps in maps were not converted by to/from_binary, so again, operating on nested data types was impossible.

I went with the ugly and direct method for adding a function head at all exposed API calls to catch the old orddict structure and transform it to the new dict.

This PR will let you read/work with old sets/maps, it does not address the issue of downgrading from dict style maps/sets to orddict ones.

andrewjstone commented 9 years ago

I think we should probably add a ?V2_VERS and make the supporting changes to to_binary/2 and from_binary/2. We would then have to make sure that to_binary/1 always uses the latest version, when calling to_binary/2.

andrewjstone commented 9 years ago

Other than my suggestions above this all looks good. Nice job @russelldb.

russelldb commented 9 years ago

I think the to/from_binary stuff if more complex and broken, and we need to bikeshed it. Really to write a v1 binary, you must recurse the entire structure. And to turn a v1->v2 requires the same. Maybe we drop that all together?

russelldb commented 9 years ago

@seancribbs is the happy cat like a +1? For @borshop ?

jonmeredith commented 9 years ago

@seancribbs are you good to plus one a sha for bors to do it's thang?

gordonguthrie commented 9 years ago

Sorry for being so prolix, but bear with.

Comments on riak_dt_map.erl

Struggling to review the test cases because I don't speak Quick Check (yet).

It appears that the test case:

...and does this an unspecified number of times with a distribution of sizes.

There doesn't appear to be a test case for the borked state - where a user has done the upgrade to 2.0.4 and has partially written data on disk. Will this release work on that?.

I don't even know how this clause compiles - I am thinking parse-transforms FTW: https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L1021

CRDT's are pure functions so all the upgrades should be done on the exported functions only. There are three private functions where the upgrades are done: https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L279 https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L481 https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L559

And a redundant helper function: https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L545

A -compile(export_all). attribute has been added - which could be removed to style: https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L175

Some stylistics comments/stuff unrelating to this change that I noticed:

Comments on riak_dt_orswot.erl

The function to_v2/1 will be called twice because the call in https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_orswot.erl#L414 is redundant

Same weird placing of -include(...). and -define(...). half way down a module https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_orswot.erl#L432

russelldb commented 9 years ago

There doesn't appear to be a test case for the borked state - where a user has done the upgrade to 2.0.4 and has partially written data on disk. Will this release work on that?.

The basho/riak_test#728 covers that. You're welcome to write a quickcheck case for it.

russelldb commented 9 years ago

Same weird placing of -include(...). and -define(...). half way down a module https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_orswot.erl#L432

But not part of this PR, right?? You just don't like it, or there is a reason for not liking it?

russelldb commented 9 years ago

@gordonguthrie Please only review the changes. Use the diff.

this https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_map.erl#L1021 works fine because it is erlang. Not a parse transform. And it is not part of the diff.

seancribbs commented 9 years ago

As Russell said, please comment on the change/diff only, but I will address some of your comments below.

There doesn't appear to be a test case for the borked state - where a user has done the upgrade to 2.0.4 and has partially written data on disk. Will this release work on that?.

If you perform this same test case with the 2.0.4 map, it should fail. It is a regression test.

I don't even know how this clause compiles

You don't need a list to do a comprehension, this is a common thing in our EQCs. If any clause in the RHS of the LC fails, the LHS does not generate.

CRDT's are pure functions so all the upgrades should be done on the exported functions only.

Not sure what you mean, but if you are trying to merge a V1 map with a V2 map, either the caller can ensure both are the correct version, or the implementation module can ensure that all incoming formats are upgraded to the version it prefers. We chose both for safety.

A -compile(export_all). attribute has been added - which could be removed to style:

This prevents us from needing to export things explicitly for eunit. It is unobtrusive.

russelldb commented 9 years ago

The function to_v2/1 will be called twice because the call in https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_orswot.erl#L414 is redundant

As discussed, this is kind of an optimisation. Both stats/1 and stat/2 are public, so need to accept v1 or v2 sets. Though stats/1 calls stat/2, if we turn the input to a v2 before the list comprehension we only do it once, rather than for each stat.

russelldb commented 9 years ago

Same weird placing of -include(...). and -define(...). half way down a module https://github.com/basho/riak_dt/blob/bug/rdb/gh-riak-667/src/riak_dt_orswot.erl#L432

Going to leave this as: stylistic (you like you way, I like mine), has precedent in OTP source, and outside the purview of this review. ok?

gordonguthrie commented 9 years ago

Re commenting on the diffs only, fair play...

Stylistics re include/define, again fair play there, not sure of house style

List comprehensions without lists, such a thing I never did see, but to my surprise they work in the shell.

stats/1 and stat/2 in ordsets - I was wrong, sub-optimal

russelldb commented 9 years ago

@gordonguthrie @seancribbs pushed 3f2b3ed to address review.

russelldb commented 9 years ago

@borshop merge