mkjellman / perlcassa

a Perl client for Apache Cassandra
Apache License 2.0
20 stars 12 forks source link

CQL3 exec(): parameter hash doesn't work for int, uuid, timeuuid type columns #28

Closed MartinHerren closed 10 years ago

MartinHerren commented 11 years ago

CQL3 exec(): parameter hash doesn't work for int, uuid and timeuuid type columns as these types don't accept to be quoted. Possibly affects other types as well. Currenty these types must be inlined into the query.

Example of code showing the issue:

#!/usr/bin/perl

use perlcassa;
use UUID::Tiny;
use Try::Tiny;

my $cassy = new perlcassa(
    'keyspace' => "testspace",
    'seed_nodes' => ['localhost'],
    'write_consistency_level' => Cassandra::ConsistencyLevel::QUORUM,
    'read_consistency_level' => Cassandra::ConsistencyLevel::QUORUM,
    'port' => '9160'
);

try {
    $cassy->exec(
        "CREATE TABLE testtable (key uuid," .
        " ivalue int," .
        " tvalue text," .
        " PRIMARY KEY(key))"
    );
} catch {
    print "Probably table existed before\n";
};

my $key = create_UUID_as_string();
try {
    $cassy->exec(
        "INSERT INTO testtable (key, ivalue, tvalue)" .
        " VALUES ($key, 123, 'teststring')"
    );
    print "Test 1 succeeded\n";
} catch {
    print "Test 1 failed: $_\n";
};

$key = create_UUID_as_string();
try {
    $cassy->exec(
        "INSERT INTO testtable (key, ivalue, tvalue)" .
        " VALUES (:keyvalue, 234, 'teststring')",
        {
            keyvalue => $key
        }
    );
    print "Test 2 succeeded\n";
} catch {
    print "Test 2 failed: $_\n";
};

$key = create_UUID_as_string();
try {
    $cassy->exec(
        "INSERT INTO testtable (key, ivalue, tvalue)" .
        " VALUES ($key, :ivalue, 'teststring')",
        {
            ivalue => 345,
        }
    );
    print "Test 3 succeeded\n";
} catch {
    print "Test 3 failed: $_\n";
};

$key = create_UUID_as_string();
try {
    $cassy->exec(
        "INSERT INTO testtable (key, ivalue, tvalue)" .
        " VALUES ($key, 456, :tvalue)",
        {
            tvalue => 'teststring2'
        }
    );
    print "Test 4 succeeded\n";
} catch {
    print "Test 4 failed: $_\n";
};

EDIT: for perl highlighting

mkjellman commented 11 years ago

thanks, we'll work on addressing this.

Lanzaa commented 11 years ago

This is actually difficult for the UUID case. I see two choices.

  1. request the column types from cassandra before preparing an exec statement
  2. a way to distinguish between strings and UUIDs and numbers.
MartinHerren commented 11 years ago

UUIDs strings have a pretty 'simple' structure: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx where each x is a hex nibble. So regexping them would be easy.

But the current query preparation code must be changed to work in several steps:

I guess that would have a big performance impact.

There would also be the question what happens when you want to put a number into a string?

Lanzaa commented 11 years ago

I'm not sure what you mean by part 3, check the type by the format (int, uuid, string).

Lanzaa commented 11 years ago

I think the real solution is to redo the cql exec function to send a prepare_cql3_query instead of just trying to quote things ourselves.

MartinHerren commented 11 years ago

@Lanzaa by part 3 i mean a naive approach of putting a regexp on the value to determine whether the field is an int, uuid or a text. This would need additional workarounds to make it possible to put a pure numerical value into a text column. Or even enable to put a uuid formated string into a text column.

I agree that separate prepare/exec is probably the best way to go, but i guess that won't magically solve all issues by itself ? Or does Thrive already propose a solution ?

I have no problem accepting (at least for some time) having to inline uuids as at least for me they are either generated on-the-fly by the code using them, or retrieved by a previous query. But probably other people will retrieve them through UIs or APIs making them a target for CQL injections.

But as a first targed it would be good to fix 'int' fields to be able to use them properly through param (or through prepare/exec) while still making them safe from injections.

Other types must still be evaluated.

Lanzaa commented 11 years ago

I will try to have something working tomorrow.

@MartinHerren By the way. Thank you very much for bringing up this issue. I had not noticed the new CQL 3.0.2 changes yet.

Lanzaa commented 11 years ago

If you would like to take a look at the code I have so far: https://github.com/Lanzaa/perlcassa/tree/cql-prepared-statements-02

I just need to add more test cases and fix a bug with large numbers in varint columns.

MartinHerren commented 11 years ago

Thanks, i'll try to take a look/test tonight (european time) or tomorrow morning. Won't have much time before Monday.

MartinHerren commented 11 years ago

@Lanzaa : Checked out your branch. Didn't take a close look on the code yet (thanks to fix the uuid unpack and adding test scenarios btw), but some quick testing shows it works in the scenarios i reported.

Thanks !

Lanzaa commented 10 years ago

I think this issue should have been closed a long time ago.