basho / riak-php-client

PHP clients for Riak
Apache License 2.0
160 stars 69 forks source link

2.0.x Kv Object instantiation [JIRA: CLIENTS-285] #87

Closed christophermancini closed 9 years ago

christophermancini commented 9 years ago

Hey folks, I am trying to decide which is more ideal when creating a new object. Currently the way I have the 2.0.x code, here is how you create a new object:

$object = new Basho\Riak\Object($key, $data, $headers);

https://github.com/basho/riak-php-client/blob/2.0.x/src/Riak/Object.php#L46

There is a setter for the Bucket object, but typically you won't need the full Bucket object unless you are working with the bucket properties.

Which seems better to you:

  1. The $key can contain the full human readable location for the key in a pattern like {$type}\{$bucket}\{$key}, e.g. default\users\3215617. Then the constructor can automatically parse the location for the bucket type and name.
  2. I add a 4th parameter that accepts the Bucket object or a human readable namespace for the bucket.
lukebakken commented 9 years ago

For the C# client, we intend to move away from using strings to represent an object's identifier since they aren't any better than void * and are subject to typos that are difficult to debug. Each part of an object's identifier (bucket, key, bucket type (optional)) will be an immutable class that becomes part of the object's ID. The java client is doing something similar I believe. My $0.02.

In your example, the use of \ to represent parts of an object's identifier should be forward-slashes to match what is used via the HTTP interface, unless it's a well-known PHP standard to use backslashes in this situation. I see that backslashes are used to namespace modules.

christophermancini commented 9 years ago

Yes, backslashes are used for PHP namespaces, so I think initially it made sense but I think its probably better if I change it to match the Riak location string. If I went the string pattern route as an interface option, then I would parse the locator string into objects in the backend.

I completely understand your position as string locator's can be error prone. App dev's could extend the Bucket object and abstract away the creation / fetching of a new object by adding that functionality within their Bucket object to mitigate the chance for errors. They could have a Bucket object for each bucket that they have in Riak. Which would allow them to wrap the Command to create the object as well. Like:

// Custom bucket class
$user = Users::CreateUser();

I just keep coming back to the notion that I don't really need a full Bucket object to be instantiated to make a request for a Kv Object or CRDT, I just need the bucket name and optionally the type. So, it just seems like extra work to make the dev instantiate a Bucket object and pass it in on construct or call a setBucket post construct when you are doing a quick fetch / store.

Here is maybe a more visual representation of what I am discussing.

// current: use setter
$bucket = new Bucket('users', 'default');
$object = new Object('3215617', $data);
$object->setBucket($bucket);

// option 1: support full object locator string and parse it into the Bucket object on construct
$object = new Object('/default/users/3215617');

// option 2: add bucket object to object constructor
$bucket = new Bucket('users', 'default');
$object = new Object('3215617', $bucket, $data);

// option 3: add path string to object constructor
$object = new Object('3215617', '/default/users', $data);
debo commented 9 years ago

Is there any particular reason why the object should be aware of the bucket it belongs to? In my head it would be more logical to treat the bucket as a collection of object so having something like:

$usersBucket = new Bucket('users', 'default');
$userObject = new Object('3215617', $data);
$usersBucket->addObject($userObject);

For sure you don't want to use setters but use DI as much as you can. Also, I noticed that you have the tendency to use setters also internally in your class, in these cases it's better to directly access the property rather than pass by the mutator unless you have specific logic to apply to the value you are passing. Still in that case there are better pattern to apply to such use case.

christophermancini commented 9 years ago

So, I figured that the Object should be aware of where it came from so that if it comes into another section of code, it can be saved back to the same location. However, I chatted with my team lead earlier and we discussed this in depth. Essentially, he felt that since PHP is single threaded, there in theory shouldn't be an important reason for the object to know where it came from on Riak.

Also, I am going to re-work the bucket class to be a namespace class instead and you pass the Object and the Namespace to the Command builder when preparing a Riak operation, which to be honest was the approach my boss and I had discussed back in October but at some point veered away from it without realizing it.

We don't want to fall into the same mess that was the legacy client by having a super bucket object that controls functionality. The bucket in reality is a namespace that has operational rules that can vary from its neighbors, not a data structure or a quarterback for operational tasks.

Thanks for challenging my code!!! You are right, I went getter / setter happy with the IDE. This client is the first project I am working on from scratch since switching over to PHPStorm and I totally didn't think about the fact that I was destroying encapsulation.

Let me get these changes done & committed and then I will post a new example of the interface.

debo commented 9 years ago

Setters can have their benefit, I'm not completely against them if there is reason behind them. What I didn't like it's the fact that you used them internally. Also be careful with traits, they seem nice and powerful but they can easily become the new "abstract" and increase the rigidity of your code.

As per the buckets I wasn't suggesting to have a massive object that handles all the operations rather a collection because that's what a bucket in the end of the days is right?

About the fact PHP is single threaded it's correct however you will still have racing conditions and parallel access to the same resources so the object might want to know the bucket it belongs to. You could achieve that following my example with an observer pattern though avoiding to have the bucket as object dependency.

christophermancini commented 9 years ago

W.r.t. encapsulation (getters / setters), I think that I need to reduce my use of setters in situations where it makes sense, to prevent bad code from mutilating the object and saving it back. This article shows a good example of this, http://cspray.github.io/2012/05/13/stop-calling-them-getters-setters.html.

Though, I am not sure I see the downsides of using them internally to the class as if we decided to add some type of trigger or validation on set, we could easily do this without having to make that business rule in several places. E.g. if we wanted to verify that the headers being set on an object are valid headers, like Content-type.

W.r.t traits, I wasn't trying to go crazy using them, but I couldn't escape the notion that Objects and DataTypes share several common characteristics and needs. Hence why I tried to improve the code reuse a bit without having DataTypes extend the Object class.

I think awareness of its stored location on Riak can be achieved via the meta information that comes back in the response headers from Riak and travels with the object. I can make a getter for Namespace or location that retrieves it out of the stored headers. I think this will help resolve my concerns of having to spin up a Bucket object even though I just need its Namespace. Though, I may end up borrowing the Location & Namespace objects of the Java class.

I will look into the Observer pattern, not familiar with it.

debo commented 9 years ago

w.r.t. accessors/mutators the downside it's more related to two factors, the minor one, performance, although negligible setters are slower than direct access, the major one, although easier, validation should be performed using validator, filters and type hinting when needed.

w.r.t. traits oh yes I agree, traits are definitely better than inheritance, it was just a memo because I've been there and done that, I basically replaced abstracts with traits and I still regret it :)

w.r.t. bucket, I still think we need a sort of container, if anything, just to have a logic place where to "store" our object, maybe we could use the SplObjectStorage; I find the doctrine collection neater but that will require adding a dependency and I think that we might want to avoid that for the time being.

w.r.t. patterns you can find the major one implemented in PHP here surely you will find some useful ones. Read the book from the gang of four if you never did, it's an eye opener.

christophermancini commented 9 years ago

Great resource, thanks for sharing! I will dig in there for sure.

Good call regarding validation, I forgot I validation baked into the Command object which should be sufficient in making sure that the data is in proper order. Plus, each command should be aware of what it needs in order to execute successfully.

So, your mention of the observer pattern, was your thoughts that during run time, the Bucket object can maintain knowledge of the Kv objects it contains?

With dependencies, I want to do my best to keep them as minimal as possible but make use of them where it makes sense, e.g. Protocol Buffers. I am modeling the API layer so that devs could bring their own classes for communicating with Riak which is why I just kept the HTTP API class that will ship with the lib simple and just uses CURL versus using a lib like Guzzle. If someone wants to write their own Guzzle Adapter, they are free to do so without hacking the lib to use it.

W.r.t. could I not at a later time write an ODM for Riak like the Mongo ODM? This way, this lib is a dependency of that versus other way around. I intend to write service providers for many popular frameworks / projects to increase Riak's visibility and accessibility after this rewrite.

debo commented 9 years ago

Regarding the observer more like, if needed, that the bucket can notify the object of the relationship, as in, when you add an object to the bucket the bucket tells to the object: "Hey, I'm your bucket". It might not be needed though and the bucket will end up just being an object storage.

For the dependencies the best approach is that they are all going to be injected so a dependency injection pattern would be recommended. Likely, I would recommend to use the Symfony one because it's quite wide spread.

w.r.t. I wasn't referring to write a doctrine odm rather suggesting to use and extend the doctrine ArrayCollection component in the bucket. But as I said better use the native SplObjectStorage to don't have external dependencies.

christophermancini commented 9 years ago

@MarcoDeBortoli, I tweaked a ton and I feel pretty good about my changes. Here is an example of using the lib.

use Basho\Riak;
use Basho\Riak\Node;
use Basho\Riak\Command;

$nodes = (new Node\Builder)
    ->withPort(10018)
    ->buildCluster(['riak1.company.com', 'riak2.company.com', 'riak3.company.com',]);

$riak = new Riak($nodes);

$command = (new Command\Builder(Command::STORE_OBJECT))
    ->addObject('test_data')
    ->withLocation('/default/users/some_key')
    ->build();

$result = $riak->execute($command);

Let me know your thoughts.

debo commented 9 years ago

@christophermancini sorry for the late reply, I had a busy week and little time to get back to you earlier. I stil have to check the code but it looks quite need from the interface point of view. I would like it even more readable maybe, something like:

$command = (new Command\Builder(Command::STORE_OBJECT))
    ->buildObject('test_data')
    ->withLocation('/default/users/some_key')

Does the node builder actually build something or is it just configuration? If the latter than maybe using a builder is not the best choice.

I'll have a look at the code during the week and maybe push some PR

christophermancini commented 9 years ago

I am certainly open to improving the readability of the Command builder. I want it to be very simple / readable.

Yes, Node\Builder does build Node objects to be used by the Riak client that make the cluster. The Riak client has a built in random node selector which selects a node on construct and the ability to pick a new node upon connection failure.

I am open to alternatives and looking forward to a PR. We are traveling to Seattle for an all hands meeting this week and I will have some time to review my code with the team. Will update you with whatever design feedback I get.

christophermancini commented 9 years ago

Since it has been some time, I have an update on the path I took for the developer API for the lib. I am preparing to make an initial alpha release of the 2.0 client which will support KV objects and CRDTs.

Here is an example of storing an object without a key in the new lib:

use Basho\Riak;
use Basho\Riak\Node;
use Basho\Riak\Command;

$nodes = (new Node\Builder)
    ->withPort(10018)
    ->buildCluster(['riak1.company.com', 'riak2.company.com', 'riak3.company.com',]);

$riak = new Riak($nodes);

$command = (new Command\Builder\StoreObject($riak))
    ->addObject('some_data')
    ->addBucket('users')
    ->build();

$response = $command->execute($command);
debo commented 9 years ago

@christophermancini what's the relationship between the bucket and the object in the example you posted?

christophermancini commented 9 years ago

Basically there isn't really a relationship, is added to the builder to be used by the command so that Riak stores the object within that Bucket. If you were to store an existing object, you would add an already instantiated object like this:

$command = (new Command\Builder\StoreObject($riak))
    ->withObject($user)
    ->addLocation($user_id, 'users')
    ->build();

$response = $command->execute($command);

The way the code is written, you could create your own classes extending the Command\Builder classes, e.g. StoreUser extends StoreObject, so that all that is needed to be passed is the $user object and your class can fill in the proper Bucket container it is supposed to be stored in.

debo commented 9 years ago

No, extend is bad :)

Back the initial example I'm going to be a bit of a devil's advocate here and I want to say it's not really clear that there is a dependency between the object and that bucket. I would expect more some like:

$command = (new Command\Builder\StoreObject($riak))
    ->buildObject('object')
    ->inBucket('bucket');

$command->execute();

or

$command = (new Command\Builder\StoreObject($riak))
    ->buildBucket('bucket')
    ->withObject('object');

$command->execute();

I think it's clearer that way

christophermancini commented 9 years ago

Ha, well, I made it that way in the event people chose to.

The method names for the builders are absolutely up for debate, I want the builders to be very simple to understand. I actually like your suggestions, what about the following methods.

for passing through the full objects for use by the command:

for passing through string values and having the builder build the objects for use by the command:

You could use either set of methods to build your command.

debo commented 9 years ago

:+1:

I like it this more, it's flexible, it clearly declare what are you asking for, hence, it's readable.

christophermancini commented 9 years ago

Great, I will update the builders. KV objects and CRDTs are currently working in the 2.x branch. A team member here at Basho is going to be helping me wrap up the rewrite, so we should have 2i and search done within the next couple of weeks.

debo commented 9 years ago

Great, I will then scrutinise it ;)

christophermancini commented 9 years ago

It has been committed. I look forward to your scrutiny. :)

Basho-JIRA commented 9 years ago

Closing as the related GitHub issue was Closed.

_[posted via JIRA by Derek Somogyi]_