jbellis / YCSB

Yahoo! Cloud Serving Benchmark
http://research.yahoo.com/Web_Information_Management/YCSB
Other
20 stars 20 forks source link

Add CQL prepared statements #2

Closed jbellis closed 10 years ago

lyubent commented 10 years ago

Post #3 The change for read and scan from Set to a single String was simple enough, but a HashMap of fields and their respective values is used for Insert and Update. This can be changed to a list of values as one param and a string for the fields where again a null is used for all fields or a single field is supplied. The issue that can be caused here is that values are paired to a field, if the pairing between field and value doesn't matter (which as far as i can see it does not) this approach is fine, however if fields need to be matched to a value generated in the core workload it would be better to check the size of the hashmap in the {{CasssandraCQLClient}}.

jbellis commented 10 years ago

Why do we need to change Insert and Update? On Jul 8, 2014 8:45 PM, "Lyuben Todorov" notifications@github.com wrote:

Post #3 https://github.com/jbellis/YCSB/pull/3 The change for read and scan from Set to a single String was simple enough, but a HashMap of fields and their respective values is used for Insert and Update. This can be changed to a list of values as one param and a string for the fields where again a null is used for all fields or a single field is supplied. The issue that can be caused here is that values are paired to a field, if the pairing between field and value doesn't matter (which as far as i can see it does not) this approach is fine, however if fields need to be matched to a value generated in the core workload it would be better to check the size of the hashmap in the {{CasssandraCQLClient}}.

— Reply to this email directly or view it on GitHub https://github.com/jbellis/YCSB/issues/2#issuecomment-48420818.

lyubent commented 10 years ago

and similarly for writeallfields. Then read/scan/insert/update can take a single String, with the same semantics that null = all fields instead.

Maybe I misinterpreted something here, but I though that there was an implication that insert and update should be updated to use a single string for the fields that need to be updated.

lyubent commented 10 years ago

Also on a sidenote, we can prepare all the statements that we'll require by gathering the table, fieldprefix and fieldcount properties by for example using getProperties().getProperty(CoreWorkload.TABLENAME_PROPERTY, CoreWorkload.TABLENAME_PROPERTY_DEFAULT) to get the table, then looping through each field and preparing the statement:

    // insert statement example:
for (int i = 0; i < fieldCount; i++)
{
    Insert is = QueryBuilder.insertInto(table);
    is.value(YCSB_KEY, QueryBuilder.bindMarker());
    is.value(fieldPrefix + i, QueryBuilder.bindMarker());
}
session.prepare(is); // then we can store and retrieve this ps when required.
jbellis commented 10 years ago

On the update side I'm starting to think we need to split it into two methods: updateOne and updateAll, with the former taking key/column/value and the latter taking key/Map.

insert doesn't need to change since it always takes all values.

We should probably refactor reads to match (readOne / readAll) rather than special casing column==null.

lyubent commented 10 years ago

I see how breaking up the read into readOne and readAll makes sense, but our updates are just inserts, so if it makes sense to change the update, then insert should surely follow, otherwise we're back to creating and passing a hashmap with a single <K,V> element to the insert.

jbellis commented 10 years ago

Insert always includes all columns, it doesn't consult writeallfields (which makes sense).

lyubent commented 10 years ago

I updated the driver version to 2.0.3 but had to increment the default connections to at-least 2 otherwise this error appears:

com.yahoo.ycsb.DBException: java.lang.IllegalArgumentException: Core connections for LOCAL hosts must be less than max (2 > 1)
    at com.yahoo.ycsb.db.CassandraCQLClient.init(CassandraCQLClient.java:148)
    at com.yahoo.ycsb.ClientThread.run(Client.java:319)
Caused by: java.lang.IllegalArgumentException: Core connections for LOCAL hosts must be less than max (2 > 1)
    at com.datastax.driver.core.PoolingOptions.checkConnectionsPerHostOrder(PoolingOptions.java:244)
    at com.datastax.driver.core.PoolingOptions.setMaxConnectionsPerHost(PoolingOptions.java:215)
    at com.yahoo.ycsb.db.CassandraCQLClient.init(CassandraCQLClient.java:128)
    ... 1 more

Commit 16e7e8b47567902ca7b46d9c1744ea660ac221cf modifies signatures of:

jbellis commented 10 years ago

LGTM.

lyubent commented 10 years ago

Added prepared statemets for CRUD and Scan, the only downside is the debug info isn't useful:

INSERT INTO usertable(y_id,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9) VALUES (?,?,?,?,?,?,?,?,?,?,?);

A solution is to use String.replace(...) and place the values, and since its in debug mode, it might be a worthy tradeoff.

jbellis commented 10 years ago

I'd just log the Map of values with the insert statement. No sense getting fancier for something that doesn't get used often.

lyubent commented 10 years ago

I'm having trouble with aerospike. I think we got the client from here but maven fails to find the dependency. There is an open issue to update to aerospike 3, I suggest we merge the diff into our own branch.

jbellis commented 10 years ago

WFM.

On Mon, Jul 14, 2014 at 12:06 PM, Lyuben Todorov notifications@github.com wrote:

I'm having trouble with aerospike. I think we got the client from here https://github.com/thumbtack-technology/ycsb but maven fails to find the dependency. There is an open issue https://github.com/thumbtack-technology/ycsb/pull/4 to update to aerospike 3, I suggest we merge the diff into our own branch.

— Reply to this email directly or view it on GitHub https://github.com/jbellis/YCSB/issues/2#issuecomment-48927317.

Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced