tarantool / tarantool-php

PECL PHP driver for Tarantool
http://tarantool.org/
Other
86 stars 24 forks source link

Straightforward distinguishing between arrays and maps #89

Open zaglex opened 8 years ago

zaglex commented 8 years ago

Conversion from php-array to lua-table may be little bit tricky.

Consider the case:

$data[0] = 'a';
$data[1] = 'b';
$data[2] = 'c';
$tarantool->call('myfunc', [$data]);

On lua side you will se such a table:

data[0] == nil
data[1] == 'a'
data[2] == 'b'
data[3] == 'c'

But then you do this on php-side:

unset($data[1]);
$tarantool->call('myfunc', [$data]);

And on lua side you will se this:

data[0] == 'a'
data[1] == 'nil'
data[2] == 'c'

I understand why it happens, but anyway this may be confusing and may cause unexpected errors.

Is there any way to force tarantool client to trait array either as "real array" or as "map"? Would be great to have special classes like MessagePackArray and MessagePackMap, or even LuaArray and LuaTable for this purpose.

rybakit commented 8 years ago

From what i see, [0 => 'a', 1 => 'b', 2 => 'c'] is packed to MP array and [0 => 'a', 2 => 'c'] to MP map and on Lua side these structures are converted to Lua array and Lua table accordingly. Why it's confusing and what result you expect to see? Could you please elaborate on this a bit further?

zaglex commented 8 years ago

Imagine space where tuples consist of 2 fields: NUM primary key and TABLE "payload". Payload is a variable-length map: { fieldId => fieldValue}, where fieldId is integer >= 0. When you pass from php to lua something like this:

$fields = [];
$fields[0] = 'field0Val';
$fields[5] = 'field5Val';
$field[7] = 'field7Val';
$fields = [];
$fields[0] = 'field0Val';
$fields[1] = 'field1Val';
$field[2] = 'field2Val';

Of course you can disallow to use 0 as field identifier on the app level, so it doesn't look like a critical issue, or may be it is not an issue at all - just bad choice of field identifier in someone's app)

rybakit commented 8 years ago

Ah, I see. I have the similar issue with the msgpack php extension ;) In my own msgpack implementation I use MP ext type to pack arbitrary types (see https://github.com/rybakit/msgpack.php#custom-types) but, afaik, Tarantool's msgpack doesn't support it yet.

akalend commented 8 years ago

I make the issue https://github.com/rtsisyk/msgpuck/issues/10 at 3 month ago. The part code of the implenentation MP ext type in the hhvm_msgpack: https://github.com/akalend/hhvm-msgpack/blob/hhvm-v-3.12/msgpuck.h

bigbes commented 8 years ago

@rybakit As I get it - there's no connection between MP_EXT and this issue. @zaglex want a way to enforce something to be Map or Array. That's why some problems with indexing are happened.

bigbes commented 8 years ago

@zaglex what'll happened if you'll do something like this?

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);
zaglex commented 8 years ago

Yes, @bigbes, you are right, I'm speaking about enforcing Map or Array. As for your example, when I do this:

$data[1] = NULL;
$tarantool->call('myfunc', [$data]);

I get this:

data[0]  - nil
data[1]  - cdata<void *>: NULL
zaglex commented 8 years ago

Ah, you mean that this solves problem with unset from my first example. Yes, it looks like a workaround.

rybakit commented 8 years ago

@bigbes I don't see any other sane way to force php array to be packed as MP map. And with ext it's quite easy to implement. It's already possible with pure client, so I can do something like:

// register your own map transformer 
$coll = new Collection([new MapTransformer(5)]);
$packer->setTransformers($coll);

...

// send a map object directly, MapTransformer will take care about the rest
$client->call('myfunc', [new Map(['a', 'b', 'c'])]); // <- will be packed as MP map

The only problem atm is that on Lua side you will not be able to unpack this extension.

bigbes commented 8 years ago

@rybakit it was said - you're creating MessagePackMap/MessagePackArray instance (that inherits ArrayObject) and use it like array. Maybe you can create object, that will store Array. Every other thing will happen in connector (type checking and applying right encoding procedure)

rybakit commented 8 years ago

@bigbes Of course, you can create the classes and do some checks it in the connector, but:

1) The connector should know about those classes to apply different packing/unpacking rules based on the class name (or other marker, e.g. interface).

2) Note, that the problem is a bit wider than only map/array distinguishing. The same issue is with MP str/bin format family. Sometimes you need to pack/unpack php string as MP binary and vise versa.

I'm thinking out loud now, but you may consider to add the Tarantool\Ext class:

$tnt->call('myfunct', new Ext(1, ['a', 'b', 'c'])); // we know that 1 means "array"
$tnt->call('myfunct', new Ext(2, ['a', 'b', 'c'])); // and 2 means "map" in our app

And Tarantool can even have predefined ext types for common cases:

Tarantool::EXT_ARRAY = 1
Tarantool::EXT_MAP = 2
Tarantool::EXT_UTF8 = 3
Tarantool::EXT_BIN = 4
...
(we can reserve 0-16 values for future types)

So instead of creating custom MessagePackMap / MessagePackArray / MessagePackUtf8 / MessagePackBin classes you will have one generic Ext class which will do the same. And you can decide if you want to pack/unpack those structures directly in the connector or forward them to Tarantool as MP ext and let him do packing/unpacking (but then, Tarantool should support MP ext out of the box).

bigbes commented 8 years ago

1) It's what we discuss here - we need those classes or not 2) Yes we can, but that needs changing of server - and that's more fundamental task, than just rewrite part of connector. It's overingeneering.

rybakit commented 8 years ago

There is a related issue regarding bin/utf8 distinguishing: https://github.com/msgpack/msgpack-php/issues/89.

rybakit commented 6 years ago

For the record, here is an example of what I was talking above regarding the custom Map objects: https://github.com/rybakit/msgpack.php/blob/64adceae711932b7c55f658884c110f5c2e18e50/examples/map.php#L13-L14. The only difference is that the new implementation doesn't use the MessagePack extension, but directly packs the wrapped array into MP map. Though I'm not sure how this functionality can be ported to the pecl connector.