thobbs / phpcassa

PHP client library for Apache Cassandra
thobbs.github.com/phpcassa
MIT License
248 stars 78 forks

UUID serialize #93

Closed lekamm closed 11 years ago

lekamm commented 12 years ago

Calling get_super_column on a column family (CF) like this one:

create column family IdTimelines with column_type=Super and key_validation_class=UTF8Type and comparator=UTF8Type and subcomparator=TimeUUIDType and default_validation_class=UTF8Type;

Returns a serialized UUID object: 'O:13:"phpcassa\UUID":8:{s:8:"' . "\0" . '' . "\0" . 'bytes";s:16:"????????{~[??";s:6:"' . "\0" . '' . "\0" . 'hex";N;s:9:"' . "\0" . '' . "\0" . 'string";s:36:"e1e90690-c6e9-11e1-b9f6-b37b7e5bd1e1";s:6:"' . "\0" . '' . "\0" . 'urn";N;s:10:"' . "\0" . '' . "\0" . 'version";N;s:10:"' . "\0" . '' . "\0" . 'variant";N;s:7:"' . "\0" . '' . "\0" . 'node";N;s:7:"' . "\0" . '' . "\0" . 'time";N;}'

Should it return an object, or just the raw bytes?

UUIDType.php

public function unpack($data, $handle_serialize=true) {
    $value = UUID::import($data);
    if ($handle_serialize) {
        return serialize($value);
    } else {
        return $value;
    }
}

Furthermore, with TimeUUIDType I found that the required type depends on the operation you perform: a UUID object for inserts, but raw bytes for column slice ranges, for super columns, and so on.

And at the end, I found a case that cannot work:

        $idTimelinesCF->insert($id, array($timeline => array($recordId => $now)));

$recordId has to be a UUID, but a UUID is an object and cannot be used as an array offset.
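The limitation can be reproduced without Cassandra at all. The sketch below is only an illustration: DemoUuid and can_be_array_key are hypothetical names standing in for phpcassa\UUID and for the insert call, and it assumes PHP 7 or later (for Throwable).

```php
<?php
// Hypothetical stand-in for phpcassa\UUID; only the string form matters here.
final class DemoUuid
{
    public $string;

    public function __construct($string)
    {
        $this->string = $string;
    }
}

// Hypothetical helper: try to use $key as an array key and report
// whether PHP accepted it.
function can_be_array_key($key)
{
    // PHP 7 only raises a warning for an illegal offset, PHP 8 throws a
    // TypeError. Converting warnings to exceptions gives one code path.
    set_error_handler(function ($severity, $message) {
        throw new ErrorException($message, 0, $severity);
    });
    try {
        $row = array();
        $row[$key] = 'value';
        return count($row) === 1;
    } catch (Throwable $e) {
        return false;
    } finally {
        restore_error_handler();
    }
}

$recordId = new DemoUuid('e1e90690-c6e9-11e1-b9f6-b37b7e5bd1e1');

var_dump(can_be_array_key($recordId));            // object used as key
var_dump(can_be_array_key(serialize($recordId))); // serialized string used as key
```

The first call reports false and the second true, which is why a serialized string can stand in for the object when a key is required.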

So what about treating all TimeUUIDs as bytes and forgetting about the UUID class?

My quick fix:

class UUIDType extends CassandraType implements Serialized
{
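    public function pack($value, $is_name=true, $slice_end=null, $is_data=false) {
        // If a name was passed in serialized form, restore the UUID object first.
        if ($is_name && $is_data && !is_object($value))
            $value = unserialize($value);

        // A UUID object is sent as its raw bytes.
        if (is_object($value))
            return $value->bytes;

        // Anything else is assumed to be raw bytes already.
        return $value;
    }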

    public function unpack($data, $handle_serialize=true) {
        $value = UUID::import($data);
        if ($handle_serialize) {
            return serialize($value);
        } else {
            return $value;
        }
    }
}

thobbs commented 11 years ago

When you're working with UUID column names (or subcolumn names), you really want to use one of the alternate data formats. The root of the problem is that PHP cannot use objects as array keys, so phpcassa serializes objects like UUIDs when necessary if you're using ColumnFamily::DICTIONARY_FORMAT.
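As a rough illustration of that serialization step, here is a minimal sketch in plain PHP. DemoUuid is a hypothetical stand-in for phpcassa\UUID and the column value is made up; nothing here calls phpcassa itself.

```php
<?php
// Hypothetical stand-in for phpcassa\UUID.
final class DemoUuid
{
    public $string;

    public function __construct($string)
    {
        $this->string = $string;
    }
}

$recordId = new DemoUuid('e1e90690-c6e9-11e1-b9f6-b37b7e5bd1e1');

// Writing side: the serialized string is a legal array key, the object is not.
$columns = array(serialize($recordId) => 'some value');

// Reading side: every key is a serialized string, so unserialize it
// to get the UUID object back.
$decoded = array();
foreach ($columns as $name => $value) {
    $uuid = unserialize($name);
    $decoded[] = array($uuid, $value);
    echo $uuid->string, ' => ', $value, PHP_EOL;
}
```

This is why a dictionary-style result shows strings starting with O:13:"phpcassa\UUID" as keys: they are meant to be passed through unserialize before use.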

I suggest you take a look at the alternate data formats example.
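The idea behind a non-dictionary format can be sketched in plain PHP: each column becomes a (name, value) pair inside a list, so the name never has to act as an array key and can stay an object. DemoUuid and the second UUID below are hypothetical, made-up illustrations. In phpcassa itself this is, as far as I recall, selected by setting insert_format and return_format on the column family to ColumnFamily::ARRAY_FORMAT; treat those names as an assumption and check them against the example.

```php
<?php
// Hypothetical stand-in for phpcassa\UUID.
final class DemoUuid
{
    public $string;

    public function __construct($string)
    {
        $this->string = $string;
    }
}

// Each column is a (name, value) pair, so the name can remain an object.
// The second UUID is a made-up illustrative value.
$columns = array(
    array(new DemoUuid('e1e90690-c6e9-11e1-b9f6-b37b7e5bd1e1'), 'first'),
    array(new DemoUuid('e1e90691-c6e9-11e1-b9f6-b37b7e5bd1e1'), 'second'),
);

$names = array();
foreach ($columns as $column) {
    list($name, $value) = $column;
    $names[] = $name->string;
    echo get_class($name), ' ', $name->string, ' => ', $value, PHP_EOL;
}
```

No serialize or unserialize call is needed with this layout, at the cost of losing direct lookup by column name.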