Closed jbellis closed 10 years ago
Why do we need to change Insert and Update? On Jul 8, 2014 8:45 PM, "Lyuben Todorov" notifications@github.com wrote:
Post #3 https://github.com/jbellis/YCSB/pull/3 The change for read and scan from Set to a single String was simple enough, but a HashMap of fields and their respective values is used for Insert and Update. This can be changed to a list of values as one param and a string for the fields where again a null is used for all fields or a single field is supplied. The issue that can be caused here is that values are paired to a field, if the pairing between field and value doesn't matter (which as far as i can see it does not) this approach is fine, however if fields need to be matched to a value generated in the core workload it would be better to check the size of the hashmap in the {{CasssandraCQLClient}}.
— Reply to this email directly or view it on GitHub https://github.com/jbellis/YCSB/issues/2#issuecomment-48420818.
and similarly for writeallfields. Then read/scan/insert/update can take a single String, with the same semantics that null = all fields instead.
Maybe I misinterpreted something here, but I though that there was an implication that insert and update should be updated to use a single string for the fields that need to be updated.
Also on a sidenote, we can prepare all the statements that we'll require by gathering the table, fieldprefix and fieldcount properties by for example using getProperties().getProperty(CoreWorkload.TABLENAME_PROPERTY, CoreWorkload.TABLENAME_PROPERTY_DEFAULT)
to get the table, then looping through each field and preparing the statement:
// insert statement example:
for (int i = 0; i < fieldCount; i++)
{
Insert is = QueryBuilder.insertInto(table);
is.value(YCSB_KEY, QueryBuilder.bindMarker());
is.value(fieldPrefix + i, QueryBuilder.bindMarker());
}
session.prepare(is); // then we can store and retrieve this ps when required.
On the update side I'm starting to think we need to split it into two methods: updateOne and updateAll, with the former taking key/column/value and the latter taking key/Map.
insert doesn't need to change since it always takes all values.
We should probably refactor reads to match (readOne / readAll) rather than special casing column==null.
I see how breaking up the read into readOne and readAll makes sense, but our updates are just inserts, so if it makes sense to change the update, then insert should surely follow, otherwise we're back to creating and passing a hashmap with a single <K,V> element to the insert.
Insert always includes all columns, it doesn't consult writeallfields (which makes sense).
I updated the driver version to 2.0.3 but had to increment the default connections to at-least 2 otherwise this error appears:
com.yahoo.ycsb.DBException: java.lang.IllegalArgumentException: Core connections for LOCAL hosts must be less than max (2 > 1)
at com.yahoo.ycsb.db.CassandraCQLClient.init(CassandraCQLClient.java:148)
at com.yahoo.ycsb.ClientThread.run(Client.java:319)
Caused by: java.lang.IllegalArgumentException: Core connections for LOCAL hosts must be less than max (2 > 1)
at com.datastax.driver.core.PoolingOptions.checkConnectionsPerHostOrder(PoolingOptions.java:244)
at com.datastax.driver.core.PoolingOptions.setMaxConnectionsPerHost(PoolingOptions.java:215)
at com.yahoo.ycsb.db.CassandraCQLClient.init(CassandraCQLClient.java:128)
... 1 more
Commit 16e7e8b47567902ca7b46d9c1744ea660ac221cf modifies signatures of:
scan
to scanOne
which takes a String fora field and scanAll
(no longer requires special casing null) read
to readOne
which takes a String fora field and readAll
(no longer requires special casing null)updat
to updateOne
which replaces the hashmap of values for a String field & ByteIterator value, and updateAll
which uses the hashmap of values.LGTM.
Added prepared statemets for CRUD and Scan, the only downside is the debug info isn't useful:
INSERT INTO usertable(y_id,field0,field1,field2,field3,field4,field5,field6,field7,field8,field9) VALUES (?,?,?,?,?,?,?,?,?,?,?);
A solution is to use String.replace(...) and place the values, and since its in debug mode, it might be a worthy tradeoff.
I'd just log the Map of values with the insert statement. No sense getting fancier for something that doesn't get used often.
I'm having trouble with aerospike. I think we got the client from here but maven fails to find the dependency. There is an open issue to update to aerospike 3, I suggest we merge the diff into our own branch.
WFM.
On Mon, Jul 14, 2014 at 12:06 PM, Lyuben Todorov notifications@github.com wrote:
I'm having trouble with aerospike. I think we got the client from here https://github.com/thumbtack-technology/ycsb but maven fails to find the dependency. There is an open issue https://github.com/thumbtack-technology/ycsb/pull/4 to update to aerospike 3, I suggest we merge the diff into our own branch.
— Reply to this email directly or view it on GitHub https://github.com/jbellis/YCSB/issues/2#issuecomment-48927317.
Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced
Post #3 The change for read and scan from Set to a single String was simple enough, but a HashMap of fields and their respective values is used for Insert and Update. This can be changed to a list of values as one param and a string for the fields where again a null is used for all fields or a single field is supplied. The issue that can be caused here is that values are paired to a field, if the pairing between field and value doesn't matter (which as far as i can see it does not) this approach is fine, however if fields need to be matched to a value generated in the core workload it would be better to check the size of the hashmap in the {{CasssandraCQLClient}}.