mjpearson / Pandra

Cassandra abstraction layer and keyspace scaffolder for PHP developers --- ABANDONED.
GNU Lesser General Public License v3.0

TYPE_LONG: (Possible) Fix for Packing to 8bytes long (big endian) binary #48

Open loretoparisi opened 13 years ago

loretoparisi commented 13 years ago

It seems that using a column family (CF) with type long does not work at all.

I defined this CF:

class RollupCheckpoint extends PandraColumnFamily {

    // keyspace in storage.conf
    var $keySpace = 'CheckPoints';

    // column family name
    var $columnFamilyName = 'RollupCheckpoint';

    public function init() {
        $this->setKeySpace($this->keySpace); // keyspace
        $this->setName($this->columnFamilyName); // name
        $this->setType(PandraColumnFamily::TYPE_LONG);
    }

}

Then I insert new columns this way:

    $rollupObj = new RollupCheckpoint();
    $rollupObj->setKeyID( self::pack_longtype($rollupTS) );
    $info = array(
        't'               => $rollupTS,
        'checkpoint'      => $checkpointTS,
        'last-checkpoint' => $lastCheckpointTS,
    );

    foreach($info as $name => $value) {
        $rollupObj->addColumn($name)->setValue($value); // add column to CF
    }

    $rollupObj->save();

The two functions pack_longtype and unpack_longtype are taken from the Cassandra FAQ:

http://wiki.apache.org/cassandra/FAQ#a_long_is_exactly_8_bytes
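For readers without access to the FAQ entry, here is a minimal sketch of what such a pair of helpers could look like. It is not the FAQ's or Pandra's exact code: the function bodies are an assumption, and the sketch assumes a 64-bit PHP build (PHP_INT_SIZE === 8).

```php
<?php
// Hypothetical helpers, written for illustration only.
// Assumes a 64-bit PHP build, where int is a signed 64-bit value.

function pack_longtype(int $value): string
{
    // Split the value into high and low 32-bit halves and write each
    // half as an unsigned 32-bit big-endian integer ('N').
    $hi = ($value >> 32) & 0xFFFFFFFF;
    $lo = $value & 0xFFFFFFFF;
    return pack('NN', $hi, $lo);
}

function unpack_longtype(string $bin): int
{
    if (strlen($bin) !== 8) {
        throw new InvalidArgumentException('Expected exactly 8 bytes');
    }
    $parts = unpack('Nhi/Nlo', $bin);
    // Shifting the high half back into place restores the sign bit
    // for negative values.
    return ($parts['hi'] << 32) | $parts['lo'];
}

$packed = pack_longtype(1291078926);
echo strlen($packed), "\n";           // length of the packed value in bytes
echo unpack_longtype($packed), "\n";  // round-trips to the original integer
```

The key point is that each 'N' in the format string consumes exactly one argument, so an 8-byte long needs two of them.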

Pandra is responding:

 Warning: pack(): Type N: too few arguments in /Library/WebServer/Documents/logger /phplib/standalone/Logger/lib/pandra/lib/ColumnContainer.class.php on line 485

So, I modified the function this way:

   protected function typeConvert($columnName, $toFmt) {
       (...)
       } else if ($this->_containerType == self::TYPE_LONG) {
           $columnName = UUID::isBinary($columnName) ?
                           /*unpack('NN', $columnName) :
                           pack('NN', $columnName);*/
                           self::unpack_longtype($columnName) :
                           self::pack_longtype($columnName);
       }

Before that fix (the original code is in the multiline comment), no inserts were made into the CF.
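The "too few arguments" warning most likely comes from the format string itself: 'NN' declares two 32-bit fields, but the original call passed a single argument. The sketch below reproduces that in isolation. The helper name pack_succeeds is hypothetical, and the version notes in the comments reflect my understanding of PHP's behaviour (a warning and a false return in PHP 7 and earlier, a ValueError in PHP 8).

```php
<?php
// Hypothetical helper: reports whether pack() accepts the given
// format/argument combination.
function pack_succeeds(string $format, int ...$args): bool
{
    try {
        // PHP 7 and earlier emit a warning and return false;
        // PHP 8 throws a ValueError instead.
        $result = @pack($format, ...$args);
    } catch (\ValueError $e) {
        return false;
    }
    return $result !== false;
}

var_dump(pack_succeeds('NN', 1291078926));     // one argument for two fields
var_dump(pack_succeeds('NN', 0, 1291078926));  // one argument per field
```

The first call fails and the second succeeds, which is why splitting the long into two 32-bit halves before packing avoids the warning.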

After the fix, the CF stats were:

Column Family: RollupCheckpoint
SSTable count: 1
Space used (live): 381
Space used (total): 381
Memtable Columns Count: 3
Memtable Data Size: 99
Memtable Switch Count: 1
Read Count: 5
Read Latency: 0,059 ms.
Write Count: 6
Write Latency: 0,013 ms.
Pending Tasks: 0
Key cache capacity: 128
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0

So some inserts were made!

The row converted to JSON was:

{ row : {"t":1291078926,"checkpoint":1290606987,"last-checkpoint":1279022588}}

But when trying to read it:

   $rollupObj = new MXMRollupCheckpoint();
   $rollupObj->setKeyID( self::pack_longtype($rollupTS) );
   $rollupObj->load();

   Logger::getInstance()->debug( '{ row:'.$rollupObj->toJSON(True).'}' );

I got:

  { row : ["1279022588"]}

As you can see, it lacks the NS and CF names that should be included by

   $rollupObj->toJSON(True)

but there's something inside of it.

So, what's happening with TYPE_LONG?

mjpearson commented 13 years ago

Thanks, this should be fixed for you in the latest commit: https://github.com/mjpearson/Pandra/commit/55eed06ab249a381c20035e7f3d4542fda2913ae. The toJSON/toArray method was flagging the keyspace wrapper in the wrong place; this looks like a bug that has been there for a while. Thanks for the pick up!

-michael

loretoparisi commented 13 years ago

I applied the fix by merging the diff. Saving the CF now works, but dumping to JSON does not convert the binary column name, so:

    $rollupObj = new RollupCheckpoint();
    $rollupObj->setKeyID( 'rollup-checkpoint' );
    $rollupObj->addColumn( self::pack_longtype($rollupTS) )->setValue($checkpointTS);
    $rollupObj->save();
    Logger::getInstance()->debug( '{ r:'.$rollupObj->toJSON(True).'}' );

prints:

    { row:{"CheckPoints":{"RollupCheckpoint":{"rollup-checkpoint":{null:1290606987}}}}}

But when loading, we do get the key:

    $rollupObj = new MXMRollupCheckpoint();
    $rollupObj->setKeyID( 'rollup-checkpoint' );
    $rollupObj->load();
    Logger::getInstance()->debug( '{ r:'.$rollupObj->toJSON(True).'}' );

prints out:

    { row:{"CheckPoints":{"RollupCheckpoint":{"rollup-checkpoint":{"1291139029":"1290606987"}}}}}

Maybe something else is wrong in the toJSON method?
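One possible explanation for the null key, not confirmed anywhere in this thread, is that the packed 8-byte column name is not valid UTF-8, so PHP's json_encode() cannot emit it as a string. The sketch below shows that effect outside of Pandra. It assumes a 64-bit PHP build of version 7.1 or later, and it uses pack('J') (unsigned 64-bit, big endian) rather than Pandra's own helpers.

```php
<?php
// Illustration only: this is plain PHP, not Pandra's toJSON() code.
$binaryName = pack('J', 1291139029);  // 8-byte big-endian column name

// The raw bytes are not valid UTF-8, so encoding them as a key fails.
$raw = json_encode([$binaryName => 1290606987]);
var_dump($raw);

// Converting the name back to an integer first gives a readable key.
$readableName = unpack('J', $binaryName)[1];
echo json_encode([$readableName => 1290606987]), "\n";
```

If that is what happens here, toJSON would need to unpack long column names before encoding them, the same way the loaded object already shows them.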

Thanks, LP

loretoparisi commented 13 years ago

Finally, for anyone using a previous release of Pandra: to fix TYPE_LONG, change the function PandraColumnContainer#typeConvert to the following:

protected function typeConvert($columnName, $toFmt) {
    if (($this->_containerType == self::TYPE_UUID)  ) {

        $bin = UUID::isBinary($columnName);

        // Save accidental double-conversions on binaries
        if (($bin && $toFmt == UUID::UUID_BIN) ||
                (!$bin && $toFmt == UUID::UUID_STR)) {
            return $columnName;
        } elseif (!$bin && !UUID::validUUID($columnName)) {
            throw new RuntimeException('Column Name ('.$columnName.') cannot be converted');
        }

        if ($toFmt == UUID::UUID_BIN) {
            return UUID::toBin($columnName);
        } elseif ($toFmt == UUID::UUID_STR) {
            return UUID::toStr($columnName);
        }
    } else if ($this->_containerType == self::TYPE_LONG) {
        // Binary names are unpacked to integers; anything else is packed to 8 bytes
        $columnName = UUID::isBinary($columnName) ?
                        self::unpack_longtype($columnName) :
                        self::pack_longtype($columnName);
    }

    return $columnName;
}

The two methods for packing and unpacking binaries are defined in the current Pandra trunk commit.