thobbs / phpcassa

PHP client library for Apache Cassandra
thobbs.github.com/phpcassa
MIT License

ColumnFamily->get_count() always returns 100 when count > 100 #87

Closed lisconub closed 12 years ago

lisconub commented 12 years ago

I used 0.8.a.2 for several months and it worked fine. I am trying to upgrade to 1.0.a.3 today and found that something is wrong.

Code:

    $server = array("127.0.0.1:9160");
    $conn = new ConnectionPool("LaunchReport");
    $storeline_cf = new ColumnFamily($conn, 'AllLaunchReport');
    echo "count:".$storeline_cf->get_count("0");

In 1.0.a.3, it always shows "count:100".

But in 0.8.a.2, it shows the correct count (count: 219).

Same data, same configuration, with Cassandra 1.1.1.

Does anyone know what's going wrong?

thobbs commented 12 years ago

get_count() actually takes a ColumnSlice as its second argument, after the row key. If you don't pass one, the default ColumnSlice limits the count to 100 columns.

The default behavior for get_count() should be to set a higher limit, and I will fix that in this ticket. As a workaround, you can create a ColumnSlice with a higher limit and pass it in as the second argument.

lisconub commented 12 years ago

This workaround works. Thanks.

Code:

    $slice = new ColumnSlice('','',10000);
    echo "count:".$allline_cf->get_count('0',$slice);

But I found another problem.

Code:

    $slice = new ColumnSlice('','',10000);
    echo "count:".$allline_cf->get_count("426-2012|6|18",$slice);

It shows the correct count: 276.

cassandra-cli output:

    [default@LaunchReport] count StoreLaunchReport[ascii('426-2012|6|18')];
    276 columns

But when I try to count a larger row:

Code:

    $slice = new ColumnSlice('','',10000);
    echo "count:".$allline_cf->get_count("0-2012|6|18",$slice);

It shows this error message:

    Fatal error: Uncaught exception 'phpcassa\Connection\MaxRetriesException' with message 'An attempt to execute get_count failed 6 times. The last error was TTransportException:TSocket: timed out reading 4 bytes from localhost:9160' in /data/www/includes/phpcassa10a31/lib/phpcassa/Connection/ConnectionPool.php:281
    Stack trace:
    #0 /data/www/includes/phpcassa10a31/lib/phpcassa/ColumnFamily.php(440): phpcassa\Connection\ConnectionPool->call('get_count', '0-2012|6|18', Object(cassandra\ColumnParent), Object(cassandra\SlicePredicate), 1)
    #1 /data/www/includes/phpcassa10a31/lib/phpcassa/ColumnFamily.php(435): phpcassa\ColumnFamily->_get_count('0-2012|6|18', Object(cassandra\ColumnParent), Object(cassandra\SlicePredicate), NULL)
    #2 /data/www/basic.php(33): phpcassa\ColumnFamily->get_count('0-2012|6|18', Object(phpcassa\ColumnSlice))
    #3 {main}
      thrown in /data/www/includes/phpcassa10a31/lib/phpcassa/Connection/ConnectionPool.php on line 281

cassandra-cli output:

    [default@LaunchReport] count AllLaunchReport[ascii('0-2012|6|18')];
    10719 columns

nmmmnu commented 12 years ago

I had the same issue, but I passed a ColumnSlice with a very high limit and it worked.

thobbs commented 12 years ago

@lisconub it's just timing out because it takes a while to count 10719 columns. Counting columns is not constant time. The cassandra-cli likely has a higher timeout set.
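
If the count is timing out on the client side, one thing to try is a longer receive timeout on the connection pool. The following is only a sketch: it assumes the 1.0-series ConnectionPool constructor takes its arguments in the order keyspace, servers, pool size, max retries, send timeout, receive timeout (timeouts in milliseconds), so check that against your copy of ConnectionPool.php. The autoload path, the 30000 ms value, and the 20000 column limit are illustrative choices, not recommendations.

```php
<?php
// Assumed install location; adjust to wherever phpcassa lives on your system.
require_once(__DIR__ . '/phpcassa/lib/autoload.php');

use phpcassa\Connection\ConnectionPool;
use phpcassa\ColumnFamily;
use phpcassa\ColumnSlice;

$servers = array("127.0.0.1:9160");

// Assumed argument order (verify against your phpcassa version):
// keyspace, servers, pool_size, max_retries, send_timeout, recv_timeout
$pool = new ConnectionPool(
    "LaunchReport",
    $servers,
    NULL,    // pool_size: keep the library default
    5,       // max_retries
    5000,    // send_timeout in ms
    30000    // recv_timeout in ms, raised so a slow count can finish
);

$cf = new ColumnFamily($pool, 'AllLaunchReport');

// The slice limit must exceed the number of columns in the row,
// otherwise the count is capped at the limit.
$slice = new ColumnSlice('', '', 20000);
echo "count:" . $cf->get_count('0-2012|6|18', $slice);
```

A longer client timeout only helps if the server itself answers in time; Cassandra has its own RPC timeout, and a row this wide may still be slow to count.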